The paper, titled ‘Cordyceps@LT-EDI: Patching Language-Specific Homophobia/Transphobia Classifiers with a Multilingual Understanding’, presents an approach to detecting homophobic and transphobic hate speech across several languages. The approach combines multilingual (M-L) and language-specific (L-S) models. The M-L models catch words, phrases, and concepts that are rare in a given language, while the L-S models capture the cultural and linguistic context of each language. The paper reports that this combined approach achieves superior results in three of the five languages tested, with especially strong performance on Malayalam text.
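
One way to picture combining the two kinds of models is a weighted average of their class probabilities. The sketch below is only an illustration under stated assumptions: the summary does not say how the paper merges the M-L and L-S outputs, so the averaging rule, the label set `LABELS`, the function name `combine_predictions`, and the `ls_weight` parameter are all hypothetical and should not be read as the authors' method.

```python
from typing import Dict, Tuple

# Hypothetical label set, chosen for illustration only.
LABELS = ("none", "homophobia", "transphobia")


def combine_predictions(
    ls_probs: Dict[str, float],
    ml_probs: Dict[str, float],
    ls_weight: float = 0.5,
) -> Tuple[str, Dict[str, float]]:
    """Blend language-specific (L-S) and multilingual (M-L) class probabilities.

    Assumption: a simple weighted average stands in for whatever
    combination rule the paper actually uses.
    """
    if not 0.0 <= ls_weight <= 1.0:
        raise ValueError("ls_weight must be between 0 and 1")

    # Weighted average of the two probability distributions, label by label.
    combined = {
        label: ls_weight * ls_probs[label] + (1.0 - ls_weight) * ml_probs[label]
        for label in LABELS
    }
    # The predicted label is the one with the highest blended probability.
    best_label = max(combined, key=combined.get)
    return best_label, combined


# Example: the L-S model leans "none", the M-L model leans "homophobia".
ls_out = {"none": 0.6, "homophobia": 0.3, "transphobia": 0.1}
ml_out = {"none": 0.2, "homophobia": 0.7, "transphobia": 0.1}
label, scores = combine_predictions(ls_out, ml_out, ls_weight=0.5)
print(label, scores)
```

In this toy case the M-L model's stronger signal tips the blended decision, which loosely mirrors the idea of a multilingual model "patching" a language-specific one. In practice the weight would be tuned per language on held-out data.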

Publication date: 24 Sep 2023
Project Page: https://arxiv.org/abs/2309.13561v1
Paper: https://arxiv.org/pdf/2309.13561