The research paper explores the notion of 'easy-to-hard generalization' in language models. It addresses the problem that hard data is often difficult to label correctly, even though performance on hard data is what we ultimately care about. The authors reach the surprising conclusion that language models often generalize relatively well from easy to hard data, performing on par with models trained directly on hard data. The study demonstrates this generalization using simple training methods across seven different measures of data hardness. The findings imply that even when one cares most about model performance on hard data, it can be more beneficial to collect and train on easy data, since hard data is generally noisier and costlier to label. The study concludes that easy-to-hard generalization in language models is surprisingly strong, and that the scalable oversight problem may be easier than previously thought.
Publication date: 12 Jan 2024
Project Page: https://github.com/allenai/easy-to-hard-generalization
Paper: https://arxiv.org/pdf/2401.06751