Verb Conjugation in Transformers Is Determined by Linear Encodings of Subject Number

The research investigates how the deep learning model, BERT, conjugates verbs. It shows that BERT’s ability to conjugate verbs is determined by a linear encoding of subject number. This encoding is found in the subject position at the first layer and the verb position at the last layer. When manipulated, it has predictable effects on conjugation accuracy. The findings contradict the view of deep architectures like Transformers as ‘black boxes’ and demonstrate that some linguistic features are represented in a linear, interpretable format.

Publication date: 24 Oct 2023
Project Page: https://…
Paper: https://arxiv.org/pdf/2310.15151

Post Views: 319

Barabasi-Albert graph, deep architectures, linear encoding, subject number, verb conjugation

Verb Conjugation in Transformers Is Determined by Linear Encodings of Subject Number

root

Leave a Reply Cancel reply

Press ESC to close

Share Article:

root

Linear Representations of Sentiment in Large Language Models

Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning

Leave a Reply Cancel reply

Please allow ads on our site