This academic article introduces a new algorithm for efficiently solving linear systems involving the damped Fisher matrix in large-scale settings where the number of parameters far exceeds the number of available samples. This problem is central to natural gradient descent and stochastic reconfiguration. The proposed algorithm, based on Cholesky decomposition, is reported to be faster than existing methods. It is designed for GPU implementation and is easily parallelized, which could substantially improve the scalability and performance of natural gradient descent and stochastic reconfiguration.
Publication date: 27 Oct 2023
Project Page: https://arxiv.org/abs/2310.17556
Paper: https://arxiv.org/pdf/2310.17556
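
To make the idea in the summary concrete, here is a minimal sketch of one standard way a Cholesky factorization can exploit having far fewer samples than parameters. It assumes the Fisher matrix can be written as S^T S for an n x m matrix S (n samples, m parameters), which is an assumption of this example, and it uses the Woodbury-style identity (S^T S + λI)^{-1} v = (v - S^T (S S^T + λI)^{-1} S v) / λ, so that only an n x n matrix is factorized. The function names (`cholesky`, `cho_solve`, `damped_fisher_solve`) are hypothetical, plain Python lists are used for self-containment, and this is not necessarily the exact procedure or the GPU implementation from the paper.

```python
import math


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a must be symmetric positive definite)."""
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def cho_solve(L, b):
    """Solve (L L^T) y = b by forward substitution followed by back substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    y = [0.0] * n
    for i in reversed(range(n)):
        y[i] = (z[i] - sum(L[k][i] * y[k] for k in range(i + 1, n))) / L[i][i]
    return y


def damped_fisher_solve(S, v, damping):
    """Solve (S^T S + damping * I) x = v for an n x m matrix S with n << m.

    Assumption of this sketch: the Fisher matrix is S^T S. Only the small
    n x n matrix W = S S^T + damping * I is ever factorized.
    """
    n, m = len(S), len(S[0])
    # Small n x n Gram matrix with the damping added on the diagonal.
    W = [[sum(S[i][k] * S[j][k] for k in range(m)) + (damping if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(W)
    # Project the right-hand side into sample space and solve there.
    Sv = [sum(S[i][k] * v[k] for k in range(m)) for i in range(n)]
    y = cho_solve(L, Sv)
    # Map back to parameter space: x = (v - S^T y) / damping.
    return [(v[k] - sum(S[i][k] * y[i] for i in range(n))) / damping
            for k in range(m)]


if __name__ == "__main__":
    S = [[1.0, 0.0, 2.0, 0.0],
         [0.0, 1.0, 0.0, 1.0]]   # n = 2 samples, m = 4 parameters
    v = [1.0, 2.0, 3.0, 4.0]
    print(damped_fisher_solve(S, v, damping=0.5))
```

In this formulation the factorization costs on the order of n^3 operations and forming the Gram matrix costs on the order of n^2 m, instead of the m^3 needed to factorize the full m x m damped matrix directly. In practice the loops would be replaced by batched matrix products in a GPU array library.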