This research article examines the impact of model editing on Large Language Models (LLMs). The authors show that even a single edit can trigger a severe degradation in performance, a phenomenon termed ‘model collapse’. To detect and prevent such collapses, they propose using perplexity as a surrogate metric. The study also investigates sequential editing across a range of editing methods and LLMs, observing that nearly all methods lead to model collapse after only a few edits. Based on the hard cases that trigger collapse, the authors construct a new dataset, HardCF, with ChatGPT to support future research.
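As a rough illustration of the perplexity-as-surrogate idea (a minimal sketch, not the authors' implementation), the snippet below measures a model's perplexity on a small probe set before and after an edit and flags a sharp spike. The model name, probe texts, and collapse threshold are all illustrative assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption; the paper edits larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Small probe set (illustrative); a real check would sample broader natural text.
reference_texts = [
    "The Eiffel Tower is located in Paris.",
    "Water boils at 100 degrees Celsius at sea level.",
]

@torch.no_grad()
def perplexity(model, tokenizer, texts):
    """Token-weighted mean perplexity of the model over a list of probe texts."""
    total_nll, n_targets = 0.0, 0
    for text in texts:
        enc = tokenizer(text, return_tensors="pt")
        out = model(**enc, labels=enc["input_ids"])
        n = enc["input_ids"].size(1) - 1  # causal LM loss averages over n-1 targets
        total_nll += out.loss.item() * n
        n_targets += n
    return float(torch.exp(torch.tensor(total_nll / n_targets)))

ppl_before = perplexity(model, tokenizer, reference_texts)
# ... apply a knowledge edit to `model` here (e.g., with an editing method) ...
ppl_after = perplexity(model, tokenizer, reference_texts)

# A sharp perplexity spike flags a potentially collapse-inducing edit.
COLLAPSE_RATIO = 10.0  # illustrative threshold, not taken from the paper
if ppl_after / ppl_before > COLLAPSE_RATIO:
    print(f"Warning: perplexity jumped {ppl_before:.1f} -> {ppl_after:.1f}; "
          "the edit may have triggered model collapse.")
```

The point of the surrogate is cheapness: one forward pass over a probe set is far less costly than re-running downstream benchmarks after every edit.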
Publication date: 16 Feb 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2402.09656