T-Eval: Evaluating the Tool Utilization Capability Step by Step
This research paper presents T-Eval, a new method for evaluating the capabilities of Large Language Models (LLMs) in tool utilization. Unlike previous benchmarks, T-Eval decomposes the evaluation into multiple sub-tasks,…