CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation
The study presents a new model, CritiqueLLM, designed to evaluate the quality of text generated by large language models (LLMs) such as GPT-4. Traditional evaluation metrics have shown limited effectiveness, and thus…