This work investigates how BERT conjugates verbs, i.e., how it selects verb forms that agree with the subject’s grammatical number. The authors show that BERT’s conjugation behavior is driven by a linear encoding of subject number, detectable at the subject position in the first layer and at the verb position in the last layer. Causally manipulating this encoding changes conjugation accuracy in predictable ways. The findings challenge the view of deep architectures such as Transformers as ‘black boxes’ and demonstrate that some linguistic features are represented in a linear, interpretable format.
Publication date: 24 Oct 2023
Project Page: https://…
Paper: https://arxiv.org/pdf/2310.15151
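To make the core idea concrete, here is a minimal sketch (not the authors’ code) of probing for a linear encoding of subject number in BERT’s hidden states. The model name, toy sentences, layer index, and token positions are illustrative assumptions, not details taken from the paper.

```python
# Sketch: fit a linear probe for subject number on BERT hidden states.
# All specifics (model, sentences, layer, positions) are assumptions
# chosen for illustration, not the paper's actual setup.
import torch
from transformers import BertTokenizer, BertModel
from sklearn.linear_model import LogisticRegression

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

# Toy singular/plural pairs; the subject noun is always the token at
# index 2 ([CLS] the NOUN ...), which keeps indexing trivial here.
singular = ["the dog runs fast.", "the cat sleeps here.", "the boy eats lunch."]
plural   = ["the dogs run fast.", "the cats sleep here.", "the boys eat lunch."]

def subject_state(sentence, layer=1, position=2):
    """Hidden state at the subject-noun position from a chosen layer."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).hidden_states[layer]  # (1, seq_len, 768)
    return hidden[0, position].numpy()

X = [subject_state(s) for s in singular + plural]
y = [0] * len(singular) + [1] * len(plural)  # 0 = singular, 1 = plural

probe = LogisticRegression(max_iter=1000).fit(X, y)
number_direction = probe.coef_[0]  # a candidate linear "subject number" axis
print("probe train accuracy:", probe.score(X, y))
```

On this toy set the probe cleanly separates singular from plural subjects. The paper’s causal claim would correspond to a further step not shown here: shifting a hidden state along `number_direction` at the verb position (e.g., via a forward hook) and checking that the predicted verb form flips accordingly.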