The paper presents PAC-tuning, a two-stage method for fine-tuning pretrained language models (PLMs) to improve their generalization performance. In the first stage, PAC-Bayes training directly minimizes a PAC-Bayes generalization bound to learn an improved parameter distribution. In the second stage, the learned noise is injected into the model parameters during training, modifying the gradient and yielding a variant of perturbed gradient descent. Evaluated on 5 GLUE benchmark tasks, the method outperforms baseline fine-tuning approaches, suggesting potential for wider applications where the Adam optimizer is typically used for training.
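
The summary above only sketches the two stages at a high level; the toy PyTorch step below illustrates the noise-injection idea. It is a minimal sketch under assumptions of my own, not the authors' implementation: the helper name `pac_bayes_step`, the per-parameter `log_sigma` noise scales, the `bound_weight` hyperparameter, and the simple KL penalty standing in for the full PAC-Bayes bound are all illustrative.

```python
# Minimal sketch only: names and the KL stand-in are assumptions, not the paper's code.
import torch
import torch.nn.functional as F


def pac_bayes_step(model, log_sigma, batch, optimizer, bound_weight=1e-3):
    """One step: perturb weights with learned Gaussian noise (perturbed
    gradient descent), then minimize task loss plus a KL-style bound term."""
    inputs, labels = batch
    params = [p for p in model.parameters() if p.requires_grad]

    # Sample eps ~ N(0, sigma^2) for each parameter and inject it in place.
    noises = [torch.randn_like(p) * s.exp() for p, s in zip(params, log_sigma)]
    with torch.no_grad():
        for p, n in zip(params, noises):
            p.add_(n)

    # Task loss is evaluated at the perturbed weights.
    loss = F.cross_entropy(model(inputs), labels)

    # Crude PAC-Bayes-style penalty: KL(N(w, sigma^2) || N(0, 1)) per element.
    # (The paper minimizes the full bound; this sketch only routes the noise
    # scale's gradient through the KL term for brevity.)
    kl = sum((0.5 * ((2 * s).exp() + p.detach() ** 2 - 1) - s).sum()
             for p, s in zip(params, log_sigma))
    objective = loss + bound_weight * kl

    optimizer.zero_grad()
    objective.backward()

    # Restore the clean weights before the optimizer applies the
    # perturbed-point gradient.
    with torch.no_grad():
        for p, n in zip(params, noises):
            p.sub_(n)
    optimizer.step()
    return loss.item()


# Hypothetical usage: optimize the noise scales jointly with the model weights.
# log_sigma = [torch.nn.Parameter(torch.full_like(p, -3.0))
#              for p in model.parameters() if p.requires_grad]
# optimizer = torch.optim.Adam(list(model.parameters()) + log_sigma, lr=2e-5)
```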

Publication date: 27 Oct 2023
Project Page: https://github.com/MSU-NLP-CSS/PAC-tuning
Paper: https://arxiv.org/pdf/2310.17588