Bradley Terry probe

What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations

root, December 2, 2023

Even when they decline to respond to controversial prompts, Large Language Models (LLMs) may still exhibit sociodemographic biases in their latent representations. This study proposes a logistic Bradley-Terry probe to detect…

Computation and Language

What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations

root, December 1, 2023

This study investigates whether large language models (LLMs) exhibit sociodemographic biases even when they refuse to respond to sensitive prompts. Researchers explored this by probing contextualized embeddings to see if…
