The paper introduces the sparse modern Hopfield model, a sparse counterpart of the modern Hopfield model whose memory-retrieval dynamics correspond to the sparse attention mechanism. The authors derive a closed-form sparse Hopfield energy from the convex conjugate of the sparse entropic regularizer, and from that energy obtain the sparse retrieval dynamics. They give a sparsity-dependent memory-retrieval error bound that is provably tighter than its dense analog, which identifies the conditions under which sparsity is beneficial. They also show that the sparse model retains the theoretical properties of its dense counterpart, including rapid fixed-point convergence and exponential memory capacity. Experiments on synthetic and real-world datasets show the sparse model outperforming the dense one in many settings.
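
The link between retrieval and sparse attention can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' implementation: it assumes the one-step update takes the form x_new = Xi · Sparsemax(beta · Xi^T x), with the stored patterns as the columns of Xi, and compares it with the dense softmax update. The function names (`sparsemax`, `softmax`, `retrieve`), the toy memory matrix, and the value of `beta` are all hypothetical choices made for this example.

```python
import numpy as np


def sparsemax(z):
    """Euclidean projection of z onto the probability simplex (sparsemax)."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum      # true for a prefix of the sorted scores
    k_z = k[support][-1]                     # size of the support
    tau = (cumsum[k_z - 1] - 1) / k_z        # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)


def softmax(z):
    """Numerically stable softmax, used here for the dense update."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - np.max(z))
    return e / e.sum()


def retrieve(memories, query, beta=1.0, sparse=True, steps=1):
    """Apply the assumed retrieval update `steps` times.

    memories: array of shape (d, M), one stored pattern per column.
    query:    array of shape (d,).
    """
    memories = np.asarray(memories, dtype=float)
    x = np.asarray(query, dtype=float)
    for _ in range(steps):
        scores = beta * memories.T @ x       # similarity to each stored pattern
        weights = sparsemax(scores) if sparse else softmax(scores)
        x = memories @ weights               # convex combination of stored patterns
    return x


if __name__ == "__main__":
    memories = np.eye(3)                     # three toy patterns (illustrative only)
    query = np.array([0.9, 0.1, 0.0])        # noisy version of the first pattern
    print("sparse:", retrieve(memories, query, beta=4.0, sparse=True))
    print("dense: ", retrieve(memories, query, beta=4.0, sparse=False))
```

In this toy setting the sparsemax weights put zero mass on the non-matching patterns, so the sparse update returns the first stored pattern exactly, while the softmax weights stay strictly positive and the dense update returns a blend. This is only meant to illustrate the intuition behind the sparsity-dependent error bound, not to reproduce any result from the paper.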

Publication date: 25 Sep 2023
Project Page: https://arxiv.org/abs/2309.12673
Paper: https://arxiv.org/pdf/2309.12673