The paper introduces TIMEARENA, a textual simulated environment that incorporates complex temporal dynamics and constraints to mimic real-life planning scenarios. Agents in this environment must complete multiple tasks as quickly as possible and can process tasks in parallel to save time. TIMEARENA is grounded in 30 real-world tasks spanning cooking, household activities, and laboratory work. The study finds that even the most powerful models, such as GPT-4, still lag behind humans in effective multitasking, underscoring the need for stronger temporal awareness in language agents.

Publication date: 8 Feb 2024
Project Page: https://time-arena.github.io
Paper: https://arxiv.org/pdf/2402.05733