The authors introduce a novel approach to hyper-parameter optimization in machine learning that builds on the relationship between the strong convexity of the loss and its flatness. They aim to find hyper-parameter configurations that improve flatness by minimizing the strong convexity of the loss. Using the structure of the underlying neural network, they derive equations that approximate the strong convexity parameter and then search for hyper-parameters that minimize it. In experiments on 14 classification datasets, the method performs strongly while requiring significantly less runtime.
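
To make the idea of "minimizing the strong convexity of the loss" concrete, here is a minimal sketch on a toy problem. It is not the authors' method: instead of a neural network and their derived approximation, it assumes a ridge-regression loss, whose Hessian is known exactly, and takes the smallest Hessian eigenvalue as the strong convexity parameter. The function names `strong_convexity_param` and `pick_flattest`, the synthetic data, and the candidate values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def strong_convexity_param(X, lam):
    """Smallest Hessian eigenvalue of the ridge loss
    L(w) = ||Xw - y||^2 / (2n) + lam * ||w||^2 / 2.

    The Hessian is X^T X / n + lam * I, independent of w and y.
    Assumption: this toy loss stands in for a neural-network loss.
    """
    n, d = X.shape
    hessian = X.T @ X / n + lam * np.eye(d)
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order
    return float(np.linalg.eigvalsh(hessian)[0])


def pick_flattest(X, candidates):
    """Score each candidate hyper-parameter and return the one whose loss
    has the smallest strong convexity parameter (the flattest loss)."""
    scores = {lam: strong_convexity_param(X, lam) for lam in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical synthetic data: 200 samples, 5 features
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

best_lam, scores = pick_flattest(X, [0.01, 0.1, 1.0])
for lam, mu in scores.items():
    print(f"lam={lam}: strong convexity parameter ~ {mu:.4f}")
print("selected:", best_lam)
```

In this toy setting the parameter simply grows with the regularization strength, so the smallest candidate is selected. The sketch shows only the selection principle, not the network-specific approximation described in the paper.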

Publication date: 7 Feb 2024
Project Page: https://arxiv.org/abs/2402.05025v1
Paper: https://arxiv.org/pdf/2402.05025