This study investigates how sentiment is represented in Large Language Models (LLMs). It finds that sentiment is captured by a single linear direction in the models' activations, and that this direction matters across settings ranging from toy tasks to real-world datasets such as the Stanford Sentiment Treebank. The researchers also identify a small subset of attention heads and neurons that play a central role in processing sentiment. Additionally, the study uncovers a phenomenon termed the 'summarization motif', in which sentiment is aggregated at intermediate token positions that carry no inherent sentiment, such as punctuation and names. Finally, the study demonstrates that a significant portion of classification accuracy is lost when the sentiment direction is ablated.
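
As a rough illustration of the kind of intervention described above, the following is a minimal sketch of finding a candidate sentiment direction via a difference of means and then ablating it by projecting it out of the activations. This is not the paper's exact method; the tensor names (`pos_acts`, `neg_acts`, `acts`) and the downstream evaluation step are assumptions for illustration only.

```python
import torch

def find_sentiment_direction(pos_acts: torch.Tensor, neg_acts: torch.Tensor) -> torch.Tensor:
    """Candidate sentiment direction as the difference of mean activations.

    pos_acts, neg_acts: [n_examples, d_model] residual-stream activations
    collected on positive- and negative-sentiment inputs (hypothetical tensors).
    """
    direction = pos_acts.mean(dim=0) - neg_acts.mean(dim=0)
    return direction / direction.norm()  # unit-normalize

def ablate_direction(acts: torch.Tensor, direction: torch.Tensor) -> torch.Tensor:
    """Remove the component of each activation along `direction` (projection ablation)."""
    coeffs = acts @ direction              # [n] projection coefficients
    return acts - coeffs.unsqueeze(-1) * direction

# Hypothetical usage: patch the ablated activations back into the model's
# forward pass and measure the resulting drop in sentiment-classification accuracy.
```

The key design choice here is that ablation removes only the single direction's component, leaving the rest of each activation vector untouched, which is what allows an accuracy drop to be attributed to that direction.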


Publication date: 23 Oct 2023
Project Page: https://arxiv.org/abs/2310.15154v1
Paper: https://arxiv.org/pdf/2310.15154