TaskBench: Benchmarking Large Language Models for Task Automation
The article introduces TaskBench, a benchmark for evaluating the capabilities of large language models (LLMs) in task automation. Task automation, which decomposes complex tasks into sub-tasks and invokes external tools…