The study explores the role of bifurcations in the training of recurrent neural networks (RNNs). Bifurcations are qualitative changes in a system's dynamical behavior as its parameters are varied. The authors mathematically prove that certain bifurcations in ReLU-based RNNs are associated with loss gradients tending towards infinity or zero. They also introduce a heuristic algorithm for detecting all fixed points and cycles in ReLU-based RNNs; it returns exact results and locates fixed points and cycles up to high orders with good scaling behavior. Finally, the study shows that generalized teacher forcing avoids certain types of bifurcations during training.
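The detection algorithm rests on the fact that a ReLU network is piecewise linear: once a ReLU on/off pattern is fixed, the map is linear within that region, so candidate fixed points can be computed in closed form and then checked for consistency with the assumed pattern. The sketch below is a minimal brute-force illustration of that idea, assuming a piecewise-linear RNN of the form z_{t+1} = A z_t + W relu(z_t) + h; the model form, the function name `fixed_points_relu_rnn`, and the exhaustive enumeration over activation patterns are illustrative assumptions here, not the paper's scalable heuristic.

```python
import itertools
import numpy as np

def fixed_points_relu_rnn(A, W, h):
    """Enumerate fixed points of an assumed piecewise-linear RNN
        z_{t+1} = A @ z_t + W @ relu(z_t) + h
    by iterating over every ReLU activation pattern. Within one pattern the
    map is linear, so candidates are solved for exactly and kept only if
    their sign pattern matches the assumed one. Brute-force illustration;
    the paper's heuristic explores activation patterns far more efficiently.
    """
    n = len(h)
    I = np.eye(n)
    fixed_points = []
    for pattern in itertools.product([0.0, 1.0], repeat=n):
        D = np.diag(pattern)              # assumed ReLU on/off pattern
        M = I - A - W @ D                 # z* = (A + W D) z* + h  =>  M z* = h
        try:
            z = np.linalg.solve(M, h)
        except np.linalg.LinAlgError:
            continue                      # singular region: no isolated fixed point
        # keep z only if its sign pattern is consistent with the assumed one
        if np.all((z > 0) == (np.asarray(pattern) > 0)):
            fixed_points.append(z)
    return fixed_points

# Toy 2-unit example with hypothetical parameters
rng = np.random.default_rng(0)
A = 0.5 * np.eye(2)
W = 0.3 * rng.standard_normal((2, 2))
h = rng.standard_normal(2)
for z in fixed_points_relu_rnn(A, W, h):
    print(z)
```

The same region-wise linear-solve idea extends to cycles of order k by composing k region-specific linear maps before solving; the paper's contribution is a heuristic that searches these patterns without exhaustive enumeration.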
Publication date: 26 Oct 2023
Project Page: https://arxiv.org/abs/2310.17561v1
Paper: https://arxiv.org/pdf/2310.17561