The article presents GateLoop, a sequence model that generalizes linear recurrent models such as S4, S5, LRU, and RetNet by using data-controlled state transitions. GateLoop outperforms these models on auto-regressive language modeling while offering a low-cost O(l) recurrent mode and an efficient O(l log₂ l) parallel mode. The authors also derive an O(l²) surrogate attention mode with notable implications for Transformers and related architectures: the mechanism supplies data-controlled relative-positional information to attention, suggesting that data-controlled complex cumulative products may be a key ingredient for more powerful sequence models.
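To make the recurrence concrete, below is a minimal JAX sketch of the O(l) recurrent mode: the hidden state is decayed and rotated by a data-controlled complex gate a_t, then updated with the rank-1 outer product k_t v_tᵀ, and read out with the query q_t. The function name `gateloop_recurrent`, the shapes, and the sigmoid-magnitude gate construction are illustrative assumptions, not the paper's reference implementation.

```python
import jax
import jax.numpy as jnp

def gateloop_recurrent(q, k, v, a):
    """O(l) recurrent mode of a GateLoop-style layer (illustrative sketch).

    q, k: (l, d_k) real arrays; v: (l, d_v) real array;
    a: (l, d_k) complex data-controlled gates, assumed |a_t| <= 1.
    Returns outputs y of shape (l, d_v).
    """
    d_k, d_v = k.shape[-1], v.shape[-1]

    def step(S, inputs):
        q_t, k_t, v_t, a_t = inputs
        # Data-controlled transition: decay/rotate the state elementwise,
        # then add the rank-1 update k_t v_t^T.
        S = a_t[:, None] * S + jnp.outer(k_t, v_t)
        y_t = jnp.real(q_t @ S)  # read out the state with the query
        return S, y_t

    S0 = jnp.zeros((d_k, d_v), dtype=a.dtype)
    _, y = jax.lax.scan(step, S0, (q, k, v, a))
    return y

# Usage with random inputs; the gate combines a sigmoid-bounded
# magnitude with a unit-modulus phase (an assumed parameterization).
l, d_k, d_v = 128, 16, 32
kq, kk, kv, km, kp = jax.random.split(jax.random.PRNGKey(0), 5)
q = jax.random.normal(kq, (l, d_k))
k = jax.random.normal(kk, (l, d_k))
v = jax.random.normal(kv, (l, d_v))
mag = jax.nn.sigmoid(jax.random.normal(km, (l, d_k)))
phase = jax.random.normal(kp, (l, d_k))
a = mag * jnp.exp(1j * phase)
y = gateloop_recurrent(q, k, v, a)  # (l, d_v)
```

Because the gated update is associative, the same computation admits the O(l log₂ l) parallel mode mentioned above by replacing the sequential scan with a parallel associative scan (e.g. `jax.lax.associative_scan`) over the (gate, update) pairs.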

Publication date: 3 Nov 2023
arXiv page: https://arxiv.org/abs/2311.01927v1
Paper: https://arxiv.org/pdf/2311.01927