The paper presents PAC-tuning, a two-stage method for fine-tuning pretrained language models (PLMs) to improve their generalization performance. In the first stage, PAC-Bayes training directly minimizes a PAC-Bayes generalization bound to learn an improved parameter distribution. In the second stage, the learned noise is injected into the model parameters during training, modifying the gradient and yielding a variant of perturbed gradient descent. Evaluated on 5 GLUE benchmark tasks, the method outperforms baseline fine-tuning approaches, suggesting potential for wider applications where the Adam optimizer is typically used for training.
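
The summary above only sketches the two stages at a high level; the toy PyTorch step below illustrates the noise-injection idea. It is a minimal sketch under assumptions of my own, not the authors' implementation: the helper name `pac_bayes_step`, the per-parameter `log_sigma` noise scales, the `bound_weight` hyperparameter, and the simple KL penalty standing in for the full PAC-Bayes bound are all illustrative.

```python
# Minimal sketch only: names and the KL stand-in are assumptions, not the paper's code.
import torch
import torch.nn.functional as F


def pac_bayes_step(model, log_sigma, batch, optimizer, bound_weight=1e-3):
    """One step: perturb weights with learned Gaussian noise (perturbed
    gradient descent), then minimize task loss plus a KL-style bound term."""
    inputs, labels = batch
    params = [p for p in model.parameters() if p.requires_grad]

    # Sample eps ~ N(0, sigma^2) for each parameter and inject it in place.
    noises = [torch.randn_like(p) * s.exp() for p, s in zip(params, log_sigma)]
    with torch.no_grad():
        for p, n in zip(params, noises):
            p.add_(n)

    # Task loss is evaluated at the perturbed weights.
    loss = F.cross_entropy(model(inputs), labels)

    # Crude PAC-Bayes-style penalty: KL(N(w, sigma^2) || N(0, 1)) per element.
    # (The paper minimizes the full bound; this sketch only routes the noise
    # scale's gradient through the KL term for brevity.)
    kl = sum((0.5 * ((2 * s).exp() + p.detach() ** 2 - 1) - s).sum()
             for p, s in zip(params, log_sigma))
    objective = loss + bound_weight * kl

    optimizer.zero_grad()
    objective.backward()

    # Restore the clean weights before the optimizer applies the
    # perturbed-point gradient.
    with torch.no_grad():
        for p, n in zip(params, noises):
            p.sub_(n)
    optimizer.step()
    return loss.item()


# Hypothetical usage: optimize the noise scales jointly with the model weights.
# log_sigma = [torch.nn.Parameter(torch.full_like(p, -3.0))
#              for p in model.parameters() if p.requires_grad]
# optimizer = torch.optim.Adam(list(model.parameters()) + log_sigma, lr=2e-5)
```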

Publication date: 27 Oct 2023
Project Page: https://github.com/MSU-NLP-CSS/PAC-tuning
Paper: https://arxiv.org/pdf/2310.17588