This research investigates how instruction-tuning affects the similarity between large language models (LLMs) and human language processing. The study examines brain alignment (the similarity of an LLM's internal representations to neural activity in the human language system) and behavioral alignment (the similarity of LLM and human behavior on a reading task). The results suggest that instruction-tuning enhances brain alignment by an average of 6% but has no comparable effect on behavioral alignment. Brain alignment also correlates strongly with both model size and performance on tasks requiring world knowledge, indicating that the mechanisms that encode world knowledge in LLMs also improve their representational alignment to the human brain.
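As a rough illustration of what a brain-alignment score measures, the sketch below fits a linear encoding model from (synthetic) LLM representations to (synthetic) per-voxel neural responses and reports the mean held-out Pearson correlation. This is a common way such scores are operationalized in the literature, not necessarily the exact procedure of this paper; all data, dimensions, and the ridge penalty here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: representations of 200 stimuli (e.g. sentences) in a
# 64-d model feature space, and simulated responses for 50 voxels.
# Real studies would use recorded fMRI/ECoG data and actual LLM activations.
n_stim, n_feat, n_vox = 200, 64, 50
X = rng.standard_normal((n_stim, n_feat))               # LLM representations
W = rng.standard_normal((n_feat, n_vox))                # hidden linear map
Y = X @ W + 0.5 * rng.standard_normal((n_stim, n_vox))  # "neural" responses

# Hold out part of the stimuli for evaluation.
Xtr, Xte, Ytr, Yte = X[:150], X[150:], Y[:150], Y[150:]

# Ridge regression (closed form) maps model features to each voxel.
lam = 1.0
B = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(n_feat), Xtr.T @ Ytr)
pred = Xte @ B

# Alignment score: Pearson r between predicted and held-out responses,
# averaged over voxels. Higher = model features better predict neural data.
r = [np.corrcoef(pred[:, v], Yte[:, v])[0, 1] for v in range(n_vox)]
score = float(np.mean(r))
print(f"brain-alignment score: {score:.3f}")
```

Comparing such scores across checkpoints (e.g. a base model vs. its instruction-tuned variant, using the same stimuli) is how an effect like the reported ~6% improvement would typically be quantified.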
Publication date: 1 Dec 2023
Project Page: Not Provided
Paper: https://arxiv.org/pdf/2312.00575