The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks

The paper “The Clock and the Pizza: Two Stories in Mechanistic Explanation of Neural Networks” asks whether neural networks trained on well-understood algorithmic tasks reliably rediscover known algorithms for solving them. Using modular addition as a representative problem, the authors show that algorithm discovery in neural networks can be more intricate than anticipated: minor changes to model hyperparameters and initializations can lead networks trained on a fixed training set to learn qualitatively different algorithms. The paper introduces two such algorithms, the Clock algorithm and the Pizza algorithm, and discusses what these findings imply for understanding the behavior of neural networks.
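To make the modular addition task concrete, here is a minimal sketch of a Clock-style computation, written by hand rather than learned by a network. It assumes the usual reading of the Clock idea: each input is placed at an angle on a circle, the two angles are combined with the trigonometric angle-addition identities, and the output is the residue whose angle matches best. The function name `clock_mod_add`, the frequency parameter `k`, and the modulus values used below are illustrative assumptions, not details taken from the summary above.

```python
import numpy as np

def clock_mod_add(a: int, b: int, p: int, k: int = 1) -> int:
    """Compute (a + b) mod p by rotating around a circle (hypothetical helper).

    k is an assumed frequency; k = 1 gives a unique answer for any p.
    """
    w = 2 * np.pi * k / p

    # Step 1: place each input on the unit circle.
    cos_a, sin_a = np.cos(w * a), np.sin(w * a)
    cos_b, sin_b = np.cos(w * b), np.sin(w * b)

    # Step 2: add the angles using the angle-addition identities.
    cos_sum = cos_a * cos_b - sin_a * sin_b  # cos(w * (a + b))
    sin_sum = sin_a * cos_b + cos_a * sin_b  # sin(w * (a + b))

    # Step 3: score every candidate c by cos(w * (a + b - c)),
    # which is largest when c is congruent to a + b modulo p.
    c = np.arange(p)
    logits = cos_sum * np.cos(w * c) + sin_sum * np.sin(w * c)
    return int(np.argmax(logits))

if __name__ == "__main__":
    p = 59  # illustrative prime modulus (an assumption)
    print(clock_mod_add(40, 30, p), (40 + 30) % p)
```

A trained network would not contain these steps explicitly; the sketch only shows the kind of computation that a mechanistic explanation would try to identify in the learned weights.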

Publication date: June 30, 2023
Project Page: N/A
Paper: https://arxiv.org/pdf/2306.17844.pdf