The article presents a new technique called ‘Emulated Fine-Tuning’ (EFT) for large language models. The authors examine the two-stage training pipeline of language models – pre-training and fine-tuning – and propose a method to decouple the knowledge and skills gained in each stage. Framing fine-tuning as KL-constrained reinforcement learning, they derive EFT, a method for sampling from a distribution that emulates the result of pre-training and fine-tuning at different scales. Experiments with EFT show that scaling up fine-tuning improves helpfulness, while scaling up pre-training improves factuality. EFT also allows behavioral traits, such as the trade-off between helpfulness and harmlessness, to be adjusted at test time without additional training. A special case of EFT, ‘LM up-scaling’, avoids resource-intensive fine-tuning of large pre-trained models by combining them with small fine-tuned models, improving the helpfulness and factuality of instruction-following models.
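Concretely, EFT combines per-token log-probabilities from three models at decoding time: a large pre-trained base model, a small pre-trained base model, and a small fine-tuned model. The sketch below illustrates the LM up-scaling case, where the small models' fine-tuning "delta" is added to the large base model's log-probabilities before sampling. It is a minimal sketch, not the authors' implementation: the model names are placeholders, all three models are assumed to share a tokenizer, and the `beta` knob is an illustrative scalar on the delta.

```python
# Minimal sketch of EFT "LM up-scaling" at decoding time.
# Model names are placeholders; a shared tokenizer/vocabulary is assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("base-small")
base_large = AutoModelForCausalLM.from_pretrained("base-large")  # large pre-trained
base_small = AutoModelForCausalLM.from_pretrained("base-small")  # small pre-trained
ft_small = AutoModelForCausalLM.from_pretrained("ft-small")      # small fine-tuned

@torch.no_grad()
def eft_generate(prompt: str, max_new_tokens: int = 64, beta: float = 1.0) -> str:
    ids = tokenizer(prompt, return_tensors="pt").input_ids
    for _ in range(max_new_tokens):
        # Next-token log-probabilities from each model.
        lp_large = torch.log_softmax(base_large(ids).logits[:, -1, :], dim=-1)
        lp_base = torch.log_softmax(base_small(ids).logits[:, -1, :], dim=-1)
        lp_ft = torch.log_softmax(ft_small(ids).logits[:, -1, :], dim=-1)
        # Large-scale pre-training knowledge plus the small-scale
        # behavior delta learned during fine-tuning.
        eft_logits = lp_large + beta * (lp_ft - lp_base)
        next_id = torch.multinomial(torch.softmax(eft_logits, dim=-1), num_samples=1)
        ids = torch.cat([ids, next_id], dim=-1)
        if next_id.item() == tokenizer.eos_token_id:
            break
    return tokenizer.decode(ids[0], skip_special_tokens=True)
```

With `beta = 1.0` this emulates fine-tuning at the large scale; varying the scalar is one way to picture the kind of test-time behavioral adjustment the paper describes.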

Publication date: 20 Oct 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2310.12962