The study uncovers a surprising failure of generalization in large language models (LLMs). This failure, termed the ‘Reversal Curse’, means that a model trained on a statement of the form ‘A is B’ does not automatically infer ‘B is A’. The study demonstrates the phenomenon both by fine-tuning models such as GPT-3 and Llama-1 on fictitious statements and by querying GPT-4 about real-world facts, and in both settings the models consistently fail to answer reverse queries: GPT-4, for example, reliably names Tom Cruise’s mother (Mary Lee Pfeiffer) but usually cannot name Mary Lee Pfeiffer’s famous son. The failure is not alleviated by data augmentation and persists across model sizes and families.
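The underlying probe is easy to sketch. The snippet below is a minimal illustration (not the authors’ code): using a Hugging Face causal LM as a stand-in, it compares the log-probability the model assigns to the answer under the forward prompt (‘A is B’ order) versus the reversed prompt (‘B is A’ order). The model name (`gpt2`) is a placeholder, and the fictitious fact mimics the style of the paper’s fine-tuning data.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper fine-tuned GPT-3 and Llama-1
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def answer_logprob(prompt: str, answer: str) -> float:
    """Sum of log-probs the model assigns to the answer tokens given the prompt."""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prompt + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    logprobs = torch.log_softmax(logits, dim=-1)
    # The token at position i is predicted by the logits at position i - 1.
    return sum(
        logprobs[0, i - 1, full_ids[0, i]].item()
        for i in range(prompt_len, full_ids.shape[1])
    )

# Fictitious fact in the style of the paper's training set.
fwd = answer_logprob("Daphne Barrington is the director of", ' "A Journey Through Time".')
rev = answer_logprob('The director of "A Journey Through Time" is', " Daphne Barrington.")
print(f"forward log-prob: {fwd:.2f}   reverse log-prob: {rev:.2f}")
```

With an off-the-shelf model this only illustrates the mechanics of the probe; the paper’s finding is that after fine-tuning on such facts, the reverse-order log-probability of the correct name is no higher than that of a random other name.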

Publication date: 21 Sep 2023
Project Page: https://github.com/lukasberglund/reversal_curse
Paper: https://arxiv.org/pdf/2309.12288