The study explores the resilience of Large Language Models (LLMs), particularly GPT-4, to extensive character-level permutations. The researchers propose Scrambled Bench, a suite designed to measure the capacity of LLMs to handle scrambled input. The results indicate that GPT-4 can almost perfectly reconstruct the original sentences from scrambled ones, reducing the edit distance by 95%, even when every word is fully scrambled. This ability to process inputs with unnatural errors is surprising and counter-intuitive, given how severely scrambled text disrupts the model's input tokenization.
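
To make the headline number concrete, the sketch below illustrates one plausible way to fully scramble each word and measure how much of the scrambling-induced edit distance a reconstruction removes. This is only an illustrative sketch, not the authors' implementation; function names such as `scramble_words` and `recovery_rate` are placeholders, and the benchmark's actual code is available at the project page linked below.

```python
import random


def scramble_words(sentence: str, seed: int = 0) -> str:
    """Randomly permute the characters inside every word, keeping word order."""
    rng = random.Random(seed)
    scrambled = []
    for word in sentence.split():
        chars = list(word)
        rng.shuffle(chars)
        scrambled.append("".join(chars))
    return " ".join(scrambled)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def recovery_rate(original: str, scrambled: str, reconstructed: str) -> float:
    """Fraction of the scrambling-induced edit distance removed by the model."""
    before = edit_distance(scrambled, original)
    after = edit_distance(reconstructed, original)
    return 1.0 - after / before if before else 1.0


if __name__ == "__main__":
    original = "large language models are surprisingly robust to scrambled text"
    scrambled = scramble_words(original, seed=42)
    # In the benchmark, `reconstructed` would be the LLM's attempt to restore
    # the sentence; here we pretend the model recovered it perfectly.
    reconstructed = original
    print("scrambled:    ", scrambled)
    print("recovery rate:", recovery_rate(original, scrambled, reconstructed))
```

Under this reading, "decreasing the edit distance by 95%" means the model's reconstruction removes 95% of the character edits that the scrambling introduced; the paper's exact metric may differ in detail.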

Publication date: 1 Dec 2023
Project Page: https://github.com/ccqq77/unnatural-error-correction
Paper: https://arxiv.org/pdf/2311.18805