GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving
The article introduces GeoEval, a comprehensive collection of geometry problems designed to evaluate the geometry problem-solving proficiency of Large Language Models (LLMs) and Multi-Modal Models (MMs). The benchmark includes…