Physics of Language Models: Part 3.1, Knowledge Storage and Extraction

The study investigates how well large language models store and extract knowledge. It asks whether a model’s answers rely on exposure to similar questions during training or on genuine extraction of knowledge from the source text. Using a controlled set of semi-synthetic biography data, the research finds a correlation between the model’s knowledge extraction ability and diversity measures of the training data. The findings suggest that memorizing every sentence in the training data does not ensure that the model can extract or manipulate the factual knowledge in those sentences during inference.

Publication date: 26 Sep 2023
Project Page: https://arxiv.org/abs/2309.14316v1
Paper: https://arxiv.org/pdf/2309.14316
