This research article discusses the impact of model editing on Large Language Models (LLMs). The authors reveal that even a single edit can trigger severe performance degradation, a phenomenon they term ‘model collapse’. To catch such collapses early, they propose monitoring perplexity as a surrogate metric for model performance. The study also examines sequential editing across a range of editing methods and LLMs, observing that nearly all methods induce model collapse after only a few edits. Based on the edit cases that trigger collapse, the authors construct a new dataset, HardCF, using ChatGPT, to support future research.
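To make the surrogate-metric idea concrete, here is a minimal sketch of how perplexity could be tracked around an edit to flag a possible collapse. It uses GPT-2 as a stand-in model (the paper edits larger LLMs), and the sample text, the edit step, and the ratio threshold are all hypothetical placeholders, not values from the paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(model, tokenizer, text, device="cpu"):
    """Perplexity of `text` under a causal LM: exp of the mean token NLL."""
    enc = tokenizer(text, return_tensors="pt").to(device)
    with torch.no_grad():
        # Passing input_ids as labels makes the model return the
        # (internally shifted) cross-entropy loss over next tokens.
        out = model(**enc, labels=enc["input_ids"])
    return torch.exp(out.loss).item()

model_name = "gpt2"  # stand-in; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

sample = "The capital of France is Paris."  # held-out probe text
ppl_before = perplexity(model, tokenizer, sample)

# ... apply a knowledge edit to `model` here (e.g. via an editing library) ...

ppl_after = perplexity(model, tokenizer, sample)

# A sharp perplexity jump on held-out text signals possible collapse.
THRESHOLD = 10.0  # hypothetical ratio; tune per model and probe set
if ppl_after / ppl_before > THRESHOLD:
    print(f"Warning: perplexity rose {ppl_after / ppl_before:.1f}x "
          "after editing -- possible model collapse")
```

In practice one would average perplexity over a set of probe texts rather than a single sentence, since the appeal of the metric is that it is far cheaper to compute than full downstream benchmark evaluations.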

Publication date: 16 Feb 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2402.09656