The Reversal Curse: LLMs trained on "A is B" fail to learn "B is A"
Summary
The article describes a surprising generalization failure in autoregressive LLMs: a model trained on sentences of the form "A is B" does not automatically generalize to the reverse direction, "B is A". The failure concerns knowledge acquired during training; when "A is B" appears in the context window, models can deduce the reverse relation. The article reports experimental evidence across model sizes and families, including GPT-3, GPT-4, and open models, and highlights the implications for training-data design and evaluation.
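
To illustrate what a two-direction evaluation looks like, here is a minimal Python sketch. It is not the article's code: the facts, the prompt templates, and the helper names (`build_eval_set`, `accuracy`, `make_forward_only_model`) are all hypothetical. The "model" is a plain lookup table that has memorized only the forward direction, a deliberate caricature of the failure rather than a real LLM.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical (name, description) facts, invented purely for illustration.
FACTS: List[Tuple[str, str]] = [
    ("Alice Example", "the author of the fictional novel 'Glass Harbor'"),
    ("Bob Sample", "the founder of the fictional company 'Northwind Kites'"),
]

EvalItem = Tuple[str, str]  # (prompt, expected answer)


def build_eval_set(facts: List[Tuple[str, str]]) -> Dict[str, List[EvalItem]]:
    """Build forward (name -> description) and reverse (description -> name) items."""
    forward = [(f"Who is {name}?", desc) for name, desc in facts]
    reverse = [(f"Who is {desc}?", name) for name, desc in facts]
    return {"forward": forward, "reverse": reverse}


def accuracy(model: Callable[[str], str], items: List[EvalItem]) -> float:
    """Fraction of items whose expected answer appears in the model's reply."""
    if not items:
        return 0.0
    hits = sum(
        1 for prompt, expected in items
        if expected.lower() in model(prompt).lower()
    )
    return hits / len(items)


def make_forward_only_model(facts: List[Tuple[str, str]]) -> Callable[[str], str]:
    """Toy stand-in for a model: it answers only the exact forward prompts."""
    table = {f"Who is {name}?": desc for name, desc in facts}

    def model(prompt: str) -> str:
        return table.get(prompt, "I don't know.")

    return model


eval_set = build_eval_set(FACTS)
toy_model = make_forward_only_model(FACTS)
forward_acc = accuracy(toy_model, eval_set["forward"])
reverse_acc = accuracy(toy_model, eval_set["reverse"])
print(f"forward accuracy: {forward_acc:.2f}")
print(f"reverse accuracy: {reverse_acc:.2f}")
```

To evaluate a real model, one would replace `toy_model` with any callable that maps a prompt string to a reply string. Scoring both directions separately is the point of the design: a single aggregate accuracy would hide the asymmetry.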