The study explores the role of bifurcations in the training of recurrent neural networks (RNNs). Bifurcations are qualitative changes in a system's dynamical behavior, such as the appearance or disappearance of fixed points or cycles, as its parameters are varied. The authors prove mathematically that certain bifurcations in ReLU-based RNNs are associated with loss gradients tending towards infinity or zero. They also introduce a novel heuristic algorithm for detecting all fixed points and cycles in ReLU-based RNNs; it returns exact results and finds fixed points and cycles up to high orders with good scaling behavior. Finally, the study shows that generalized teacher forcing avoids certain types of bifurcations during training.
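
To make the fixed-point search concrete, here is a minimal sketch of the underlying idea for a ReLU-based (piecewise-linear) RNN of the assumed form z_{t+1} = A z_t + W ReLU(z_t) + h: within each ReLU activation region the map is affine, so a fixed point can be found exactly by solving a linear system for a given activation pattern and keeping only solutions whose signs are consistent with that pattern. The brute-force enumeration below illustrates the principle; it is not the authors' heuristic algorithm (which is designed to scale without enumerating all 2^M patterns), and the model form, function names, and parameters are assumptions made for illustration.

```python
import itertools
import numpy as np

def relu_rnn_fixed_points(A, W, h, tol=1e-9):
    """Enumerate fixed points of z_{t+1} = A z_t + W relu(z_t) + h.

    Within a fixed ReLU activation pattern d (d_i = 1 iff z_i > 0) the map
    is affine, so any fixed point in that region solves
        (I - A - W @ diag(d)) z = h.
    A candidate is kept only if its signs reproduce the assumed pattern.
    Brute force over all 2^M patterns -- illustrative, not scalable.
    """
    M = A.shape[0]
    I = np.eye(M)
    fixed_points = []
    for pattern in itertools.product([0.0, 1.0], repeat=M):
        D = np.diag(pattern)
        K = I - A - W @ D
        if abs(np.linalg.det(K)) < tol:
            continue  # affine map is degenerate in this region; skip
        z = np.linalg.solve(K, h)
        # Consistency check: the solution must lie in the region it assumed.
        if np.all((z > 0) == (np.array(pattern) > 0)):
            fixed_points.append(z)
    return fixed_points

# Tiny usage example with random parameters (assumed, for illustration).
rng = np.random.default_rng(0)
M = 3
A = np.diag(rng.uniform(0.2, 0.9, size=M))   # diagonal linear part
W = 0.5 * rng.standard_normal((M, M))        # ReLU coupling weights
h = rng.standard_normal(M)                   # bias
for z_star in relu_rnn_fixed_points(A, W, h):
    print("fixed point:", np.round(z_star, 4))
```

The same region-wise reasoning extends to cycles of order k by composing k affine maps, but the number of activation-pattern combinations grows exponentially, which is why a heuristic search such as the one proposed in the paper is needed in practice.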

Publication date: 26 Oct 2023
Project Page: https://arxiv.org/abs/2310.17561v1
Paper: https://arxiv.org/pdf/2310.17561