Efficient Numerical Algorithm for Large-Scale Damped Natural Gradient Descent

This article introduces an algorithm for efficiently solving linear systems involving the damped Fisher matrix in large-scale settings, where the number of parameters far exceeds the number of available samples. This solve is the central computational step in natural gradient descent and stochastic reconfiguration. The proposed algorithm is based on Cholesky decomposition and is reported to be faster than existing methods. It is designed for GPU implementation and parallelizes readily, which could improve the scalability and performance of natural gradient descent and stochastic reconfiguration.
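To make the setting concrete, here is a minimal NumPy sketch of why having far fewer samples than parameters helps. It is not taken from the paper: it uses the standard Woodbury (push-through) identity, and it assumes the Fisher matrix is formed as F = SᵀS from a score matrix S with n rows (samples) and m columns (parameters), n much smaller than m. The names `damped_fisher_solve`, `S`, `v`, and `damping` are illustrative. The point is that the Cholesky factorization is applied to an n-by-n matrix, never to the m-by-m Fisher matrix.

```python
import numpy as np


def damped_fisher_solve(S, v, damping):
    """Solve (S^T S + damping * I) x = v without forming the m x m matrix.

    Assumptions (not from the paper): S has shape (n, m) with n << m,
    v has shape (m,), and damping > 0.

    Uses the identity
        (S^T S + d I)^{-1} v = (v - S^T (S S^T + d I)^{-1} S v) / d
    so only an n x n symmetric positive definite system is factorized.
    """
    n, _ = S.shape

    # Small n x n matrix; positive definite because damping > 0.
    W = S @ S.T + damping * np.eye(n)

    # Cholesky factorization W = L L^T with L lower triangular.
    L = np.linalg.cholesky(W)

    # Solve W z = S v through the two triangular factors.
    # np.linalg.solve is a general solver; a dedicated triangular solve
    # (for example scipy.linalg.solve_triangular) would exploit the structure.
    y = np.linalg.solve(L, S @ v)
    z = np.linalg.solve(L.T, y)

    # Map back to parameter space.
    return (v - S.T @ z) / damping
```

For n samples and m parameters, the cost is dominated by forming S Sᵀ, roughly n²m operations, plus an n³ factorization, compared with m³ for a direct factorization of the full damped Fisher matrix.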

Publication date: 27 Oct 2023
Project Page: https://arxiv.org/abs/2310.17556
Paper: https://arxiv.org/pdf/2310.17556
