EkoHate: Abusive Language and Hate Speech Detection for Code-switched Political Discussions on Nigerian Twitter
Proceedings of the 8th Workshop on Online Abuse and Harms (WOAH 2024), 2024
Abstract
Nigerians have a notable online presence and actively discuss political and topical matters. This was particularly evident throughout the 2023 general election, where Twitter was used for campaigning, fact-checking and verification, and both positive and negative discourse. However, little or no work has been done on the detection of abusive language and hate speech in Nigeria. In this paper, we curated code-switched Twitter data directed at the three candidates in the governorship election of Lagos state, the most populous and economically vibrant state in Nigeria, with a view to detecting offensive speech in political discussions. We developed EkoHate, an abusive language and hate speech dataset for political discussions between the three candidates and their followers, annotated with both a binary (normal vs offensive) scheme and a fine-grained four-label scheme. We analysed our dataset and provided an empirical evaluation of state-of-the-art methods across both supervised and cross-lingual transfer learning settings. In the supervised setting, our evaluation shows that we can achieve 95.1 and 70.3 F1 points on the binary and four-label annotation schemes, respectively. Furthermore, we show that our dataset transfers well to three publicly available offensive language datasets (OLID, HateUS2020, and FountaHate), generalizing to political discussions in other regions like the US.