preference biases Papers

What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations

root, December 1, 2023

This study investigates whether large language models (LLMs) exhibit sociodemographic biases even when they refuse to respond to sensitive prompts. The researchers explored this by probing contextualized embeddings to see if…
