This academic article focuses on Positional Encodings (PEs) in transformer-based language models, particularly BERT-style models. The authors conduct a systematic study of the function of PEs in these models and identify two core properties: Locality and Symmetry. The study shows that these properties correlate with performance on downstream tasks. The authors also introduce two new probing tasks to quantify the weaknesses of current PEs. The findings suggest that models initialized with PEs exhibiting strong locality and symmetry perform better across a variety of tasks. The research provides a basis for developing improved PEs for transformer-based language models.
Publication date: 19 Oct 2023
Project Page: https://github.com/tigerchen52/locality_symmetry
Paper: https://arxiv.org/pdf/2310.12864
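
To make the two properties concrete, below is a minimal sketch of how one might probe locality and symmetry on a BERT-style model's learned absolute position embeddings. The metrics are illustrative proxies assumed here (distance-weighted concentration of a dot-product similarity matrix for locality, and agreement between backward and forward offsets for symmetry), not the paper's exact definitions; the function names and the use of `bert-base-uncased` are hypothetical choices for the example.

```python
import torch


def pe_similarity_matrix(pos_emb: torch.Tensor) -> torch.Tensor:
    """Dot-product similarity between every pair of positions.

    pos_emb: (num_positions, hidden_dim) matrix of positional embeddings.
    """
    return pos_emb @ pos_emb.T


def locality_score(sim: torch.Tensor) -> float:
    """Proxy for locality: how strongly each position's similarity mass
    concentrates on nearby positions (closer to 1.0 = more local)."""
    n = sim.size(0)
    weights = torch.softmax(sim, dim=-1)                # row-normalized similarities
    pos = torch.arange(n)
    dist = (pos[:, None] - pos[None, :]).abs().float()  # |i - j| for every pair
    mean_dist = (weights * dist).sum(dim=-1).mean()     # expected distance per row
    return float(1.0 - mean_dist / (n - 1))


def symmetry_score(sim: torch.Tensor) -> float:
    """Proxy for symmetry: how similar a position's profile is for
    backward vs. forward offsets, i.e. sim(i, i-k) vs. sim(i, i+k)."""
    n = sim.size(0)
    gaps = []
    for i in range(n):
        k_max = min(i, n - 1 - i)
        if k_max == 0:
            continue
        k = torch.arange(1, k_max + 1)
        gaps.append((sim[i, i - k] - sim[i, i + k]).abs().mean())
    asym = torch.stack(gaps).mean()
    return float(1.0 - asym / sim.abs().mean().clamp_min(1e-8))


if __name__ == "__main__":
    # Example: learned absolute position embeddings of a pretrained BERT
    # (requires the `transformers` package; the model choice is illustrative).
    from transformers import BertModel

    model = BertModel.from_pretrained("bert-base-uncased")
    pos_emb = model.embeddings.position_embeddings.weight.detach()
    sim = pe_similarity_matrix(pos_emb)
    print("locality :", locality_score(sim))
    print("symmetry :", symmetry_score(sim))
```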