The paper investigates the joint evolution of training dynamics under stochastic gradient descent (SGD) and the spectra of the empirical Hessian and gradient matrices. The authors prove that in two canonical classification tasks, multi-class high-dimensional Gaussian mixtures classified by either one- or two-layer neural networks, the SGD trajectory rapidly aligns with emerging low-rank outlier eigenspaces of the Hessian and gradient matrices. In multi-layer settings this alignment occurs per layer, with the final layer's outlier eigenspace evolving over the course of training. The study contributes to understanding the behavior of the spectra of Hessian and information matrices over the course of training in overparametrized networks.
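While the paper's results are proved in specific mixture models, the basic phenomenon is easy to probe numerically. Below is a minimal sketch, not the authors' code: online SGD for logistic regression on a synthetic two-class Gaussian mixture, periodically measuring the overlap between the SGD iterate, the top eigenvector of the empirical Hessian, and the planted mean direction. The dimension, signal strength, step size, and checkpoint schedule are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
d, steps, lr = 200, 3000, 0.2                # illustrative dimension, SGD steps, step size
mu = rng.standard_normal(d)
mu /= np.linalg.norm(mu)                     # planted class-mean direction

def sample(m):
    """Two-class Gaussian mixture: x = 2*y*mu + standard Gaussian noise, y in {-1, +1}."""
    y = rng.integers(0, 2, size=m) * 2 - 1
    x = 2.0 * y[:, None] * mu + rng.standard_normal((m, d))
    return x, y

def hessian(w, x):
    """Empirical Hessian of the logistic loss at w: (1/n) sum_i s_i x_i x_i^T."""
    p = 1.0 / (1.0 + np.exp(-np.clip(x @ w, -30, 30)))
    s = p * (1.0 - p)                        # per-sample curvature weight
    return (x * s[:, None]).T @ x / len(x)

def overlap(a, b):
    """Absolute cosine similarity between two vectors."""
    return abs(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

w = rng.standard_normal(d) / np.sqrt(d)      # small random initialization
x_eval, _ = sample(2000)                     # held-out batch for Hessian spectra
for t in range(1, steps + 1):
    x, y = sample(1)                         # one fresh sample per online SGD step
    margin = y[0] * (x[0] @ w)
    w += lr * y[0] * x[0] / (1.0 + np.exp(np.clip(margin, -30, 30)))
    if t % 500 == 0:
        _, evecs = np.linalg.eigh(hessian(w, x_eval))
        v1 = evecs[:, -1]                    # top (outlier) eigenvector
        print(f"step {t:5d}  overlap(w, v1)={overlap(w, v1):.3f}  "
              f"overlap(v1, mu)={overlap(v1, mu):.3f}")
```

In this toy setting the outlier eigenspace is one-dimensional, so alignment reduces to a single cosine; the paper's multi-class and two-layer settings involve higher-rank outlier spaces and per-layer alignment.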

Publication date: 4 Oct 2023
Project Page: https://arxiv.org/abs/2310.03010
Paper: https://arxiv.org/pdf/2310.03010