This study investigates bias in large language models (LLMs) on subjective natural language processing (NLP) tasks. Using the POPQUORN dataset, it evaluates LLMs on two subjective tasks, politeness and offensiveness rating, and compares the models' predictions with annotations from different gender and ethnic groups. The study finds that LLM predictions align more closely with the perceptions of White and female annotators. Adding demographic information to the prompts does not close this gap and in fact worsens model performance, suggesting that demographic-infused prompts alone may not mitigate these biases.
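To make the setup concrete, below is a minimal sketch of the kind of baseline versus demographic-infused prompting the study compares. The wording, the 1-5 rating scale, and the demographic phrasing here are illustrative assumptions, not the paper's verbatim templates.

```python
# Sketch of baseline vs. demographic-infused prompts for the offensiveness
# task. Prompt wording, the 1-5 scale, and the identity phrasing are
# assumptions for illustration, not the paper's exact prompts.

def build_prompt(text, demographic=None):
    """Build a rating prompt, optionally conditioned on an annotator identity."""
    identity = f"You are a {demographic} person. " if demographic else ""
    return (
        f"{identity}How offensive is the following comment, "
        f"on a scale from 1 (not offensive at all) to 5 (very offensive)? "
        f"Reply with a single number.\n\nComment: {text}"
    )

comment = "Example comment to be rated."
print(build_prompt(comment))                       # baseline prompt
print(build_prompt(comment, demographic="Black woman"))  # demographic-infused
```

In the study's framing, the model's numeric ratings under each prompting condition are then compared against the ratings given by the corresponding demographic groups in POPQUORN.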
Publication date: 17 Nov 2023
Project Page: https://github.com/Jiaxin-Pei/LLM-Group-Bias
Paper: https://arxiv.org/pdf/2311.09730