This research paper introduces O3D (Offline Data-driven Discovery and Distillation), a framework designed to improve the performance of Large Language Model (LLM) agents on sequential decision-making tasks. O3D leverages offline interaction data, such as logs of human interactions, to strengthen the agents' in-context learning without any model fine-tuning: it discovers reusable skills and distills generalizable knowledge from the offline data, enabling LLM agents to produce high-quality solutions for complex tasks. The paper reports that O3D significantly improves LLM decision-making, outperforming existing baselines with both text-based and code-based policies.
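At a high level, the pipeline described above has three stages: discover recurring skills from offline trajectories, distill them into compact textual knowledge, and inject that knowledge into the agent's prompt. The sketch below is a minimal, hypothetical illustration of that flow; the data layout, the verb-based skill grouping, and all function names are assumptions for illustration, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Trajectory:
    """One offline interaction log: (observation, action) pairs plus outcome."""
    steps: list
    success: bool

def discover_skills(trajectories):
    # Hypothetical skill discovery: group actions from successful logs
    # by their leading verb (e.g. "open", "take") as crude reusable skills.
    skills = {}
    for traj in trajectories:
        if not traj.success:
            continue
        for obs, act in traj.steps:
            skills.setdefault(act.split()[0], []).append((obs, act))
    return skills

def distill_knowledge(skills):
    # Compress each skill's examples into a short textual tip
    # suitable for inclusion in an in-context prompt.
    return [f"Skill '{name}': seen {len(examples)} time(s), e.g. '{examples[0][1]}'"
            for name, examples in skills.items()]

def build_prompt(task, tips):
    # Prepend the distilled knowledge to the task description,
    # so the LLM agent can reuse it in-context at inference time.
    return "\n".join(["Distilled knowledge:", *tips, f"Task: {task}"])

logs = [Trajectory(steps=[("in kitchen", "open fridge"),
                          ("fridge open", "take milk")], success=True),
        Trajectory(steps=[("in kitchen", "open oven")], success=False)]
prompt = build_prompt("put milk on the table", distill_knowledge(discover_skills(logs)))
```

In a real system the grouping and distillation steps would themselves be driven by an LLM over far larger logs; the point here is only the shape of the offline-to-prompt pipeline.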
Publication date: 22 Oct 2023
arXiv: https://arxiv.org/abs/2310.14403v1
Paper: https://arxiv.org/pdf/2310.14403