The study, conducted by a team from the Indian Institute of Technology and the National Institute of Technology, used BERT and LightGBM models for binary classification of self-reported COVID-19 and social anxiety disorder diagnoses on social media. The team participated in the SMM4H shared tasks, achieving the highest F1 score of 0.94 in Task 1. The tasks involved distinguishing self-reported diagnoses from non-diagnostic mentions of a condition, which makes large-scale analysis of such posts possible. The paper also discusses the benefits of LightGBM, including its fast training speed, high accuracy, and low memory usage.
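
For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F1 score is computed for a binary task such as "self-reported diagnosis" versus "non-diagnostic mention". The function name, the label encoding (1 for a self-report) and the toy labels are assumptions made for illustration only; they are not taken from the paper or from the SMM4H evaluation scripts.

```python
def f1_score_binary(y_true, y_pred, positive=1):
    """Return the F1 score of the positive class for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No correct positive predictions: precision or recall is zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy labels (1 = self-reported diagnosis, 0 = other mention).
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1, 0, 1]
score = f1_score_binary(y_true, y_pred)
print(f"F1 = {score:.2f}")
```

Because F1 is the harmonic mean of precision and recall, a score such as 0.94 can only be reached when both are high, which matters when self-reports are a minority of the posts that mention a condition.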

Publication date: 4 Jan 2024
Project Page: https://arxiv.org/abs/2401.02158
Paper: https://arxiv.org/pdf/2401.02158