Limitations

Before presenting the conclusion, it is important to outline the limitations encountered during the research process. Firstly, the sample of 500 posts was sourced exclusively from Gab and Twitter. Gab, and increasingly Twitter, attract a self-selected community of users who have been banned from or marginalised on mainstream platforms. The findings must therefore be interpreted within the context of these platforms and not generalised to other social media. The same applies to geographical scope, as the posts were in English and referenced primarily US and occasionally British contexts.

Among the 500 posts, 20 were excluded because one or both of our group annotators did not assign a label. It was collectively decided that, where a meaningful classification could not be reached, the post would be excluded rather than assigned an uncertain label. This approach carries an analytical cost: the posts that resisted classification are likely those where the boundaries between categories are most unstable. Notably, the MTurk annotators did not have this option; they were required to classify every post.

A further limitation is that the group held regular online meetings and text conversations about the labelling. Rather than annotating the posts entirely independently, members discussed some general hypothetical cases. These exchanges may have inflated inter-annotator agreement in borderline cases by producing convergence through discussion rather than independent judgment. The internal validity of our findings may also have been affected by the origin of our working criteria: a single group member drafted the basic definition of hate speech that the group subsequently adopted for the coding process. While this ensured a consistent baseline for the human annotations, it also carries the risk of one perspective shaping the qualitative output.

Lastly, and most importantly, the conditions under which the original MTurk annotators worked remain opaque. HateXplain provides no information about the annotators themselves, the exact time frame during which the annotations took place, or whether they received definitions of offensive and hate speech and, if so, what those definitions were. The annotations most likely occurred in 2021, when a heated political climate could have further influenced the MTurk annotators, as the change of administration and the storming of the Capitol were followed by intense political debate both on- and offline. All of these circumstances make it difficult to replicate the original annotation conditions and to isolate the sources of the differences identified between the two groups.