December 2, 2023

Mavericks at NADI 2023 Shared Task: Unravelling Regional Nuances through Dialect Identification using Transformer-based Approach

This study presents a methodology for the Nuanced Arabic Dialect Identification (NADI) Shared Task 2023. It focuses on country-level dialect identification, which matters for downstream Natural Language Processing (NLP) tasks such as speech recognition and translation. The authors use the Twitter dataset (TWT-2023), which covers 18 dialects, and treat the task as a multiclass classification problem. They fine-tune several transformer-based models, each pre-trained on Arabic text, on the provided dataset, and then ensemble them to improve system performance. The approach achieved an F1-score of 76.65.
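The summary does not say which ensembling method the authors used. As a rough illustration only, the sketch below assumes soft voting: each fine-tuned model's logits are converted to class probabilities, the probabilities are averaged, and the highest-scoring class wins. The function names, the three-country label set, and the logit values are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(per_model_logits: Sequence[Sequence[float]]) -> int:
    """Average class probabilities across models; return the argmax class index."""
    probs = [softmax(logits) for logits in per_model_logits]
    num_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(num_classes)]
    return max(range(num_classes), key=mean.__getitem__)


# Hypothetical logits from three fine-tuned models for a single tweet,
# over a toy label set of 3 countries (the real task has 18 classes).
LABELS = ["Egypt", "Iraq", "Morocco"]
logits = [
    [2.0, 1.0, 0.1],  # model A leans towards Egypt
    [0.5, 2.5, 0.2],  # model B is confident about Iraq
    [1.8, 1.7, 0.3],  # model C is nearly undecided
]
predicted = LABELS[soft_vote(logits)]
print(predicted)
```

In this toy input two of the three models rank Egypt first, yet the averaged probabilities favour Iraq because one model is far more confident than the others. That is the main difference between soft voting and a simple majority vote; which of the two (or another scheme) the paper uses is not stated here.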

Publication date: 1 Dec 2023
Project Page: unavailable
Paper: https://arxiv.org/pdf/2311.18739

Arabic Dialect Identification, Financial NLP, Multiclass Classification, Transformer Models, Twitter data
