The study examines whether training neural language models on developmentally plausible data, such as the corpus from the BabyLM Challenge, improves their alignment with human reading behavior. Models were trained on the BabyLM strict-small dataset and evaluated along two dimensions: their linguistic abilities and their ability to capture aspects of human language processing. The study found that while models trained with the BabyLM data curriculum performed slightly better on linguistic benchmarks, this improved linguistic knowledge did not translate into better alignment with human reading behavior. The findings suggest that training on developmentally plausible datasets alone is likely insufficient to produce language models that accurately predict human language processing.
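As a concrete illustration of the kind of evaluation described here (a sketch of the standard approach in this literature, not necessarily the authors' exact pipeline): alignment with human reading behavior is typically measured by computing per-token surprisal from the language model and checking how well it predicts human reading times. The snippet below assumes a HuggingFace causal LM; "gpt2" is only a stand-in for a checkpoint trained on the BabyLM strict-small corpus.

```python
# Hypothetical sketch: per-token surprisal from a causal LM, the quantity
# commonly regressed against human reading times. "gpt2" is a placeholder
# for a model trained on the BabyLM strict-small dataset.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: swap in a BabyLM-trained checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def token_surprisals(text: str):
    """Return (token, surprisal-in-bits) pairs for each token after the first."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits  # shape: (1, seq_len, vocab_size)
    # Log-probability the model assigns to each *next* token in the sequence.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    next_ids = ids[0, 1:]
    nll = -log_probs[torch.arange(next_ids.size(0)), next_ids]
    surprisal_bits = nll / torch.log(torch.tensor(2.0))  # nats -> bits
    tokens = tokenizer.convert_ids_to_tokens(next_ids.tolist())
    return list(zip(tokens, surprisal_bits.tolist()))

# Garden-path sentences like this one produce surprisal spikes where
# human readers also slow down.
for tok, s in token_surprisals("The horse raced past the barn fell."):
    print(f"{tok:>12s}  {s:6.2f} bits")
```

In an evaluation of this kind, such surprisal estimates are then regressed against eye-tracking or self-paced reading times; the study's finding is that the BabyLM-trained models' gains on linguistic benchmarks did not yield a correspondingly better fit to human reading data.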

Publication date: 1 Dec 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2311.18761