The article discusses HELPER, an embodied agent that uses retrieval-augmented situated prompting of Large Language Models (LLMs) to parse free-form dialogue, instructions, and corrections from humans, along with feedback from vision-language models, into action programs. HELPER's memory expands during deployment with pairs of the user's language and the corresponding action plans, personalizing future inferences to that user's language and routines. The system sets a new state of the art on the TEACh benchmark in both Execution from Dialog History (EDH) and Trajectory from Dialogue (TfD), with a 1.7x improvement over the previous state of the art on TfD.
Publication date: 24 Oct 2023
Project Page: helper-agent-llm.github.io
Paper: https://arxiv.org/pdf/2310.15127
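To make the memory mechanism more concrete, below is a minimal Python sketch of retrieval-augmented prompting over an expanding store of (user language, action plan) pairs. The names (`Memory`, `Example`, `build_prompt`) and the token-overlap similarity are illustrative assumptions, not the HELPER codebase or the paper's learned text embeddings; see the paper and project page for the actual method.

```python
# Illustrative sketch only: a memory of language-plan pairs that grows during
# deployment and is used to retrieve in-context examples for an LLM prompt.
# Token-overlap (Jaccard) similarity is a stand-in for learned text embeddings.
from dataclasses import dataclass, field


@dataclass
class Example:
    utterance: str   # free-form user language (dialogue, instruction, correction)
    plan: str        # action program executed for that utterance


@dataclass
class Memory:
    examples: list = field(default_factory=list)

    def add(self, utterance: str, plan: str) -> None:
        """Expand the memory during deployment with a new language-plan pair."""
        self.examples.append(Example(utterance, plan))

    def retrieve(self, query: str, k: int = 3) -> list:
        """Return the k stored examples most similar to the query utterance."""
        def similarity(a: str, b: str) -> float:
            ta, tb = set(a.lower().split()), set(b.lower().split())
            return len(ta & tb) / max(1, len(ta | tb))
        ranked = sorted(self.examples,
                        key=lambda e: similarity(query, e.utterance),
                        reverse=True)
        return ranked[:k]


def build_prompt(memory: Memory, user_input: str) -> str:
    """Compose an in-context prompt: retrieved examples followed by the new request."""
    shots = memory.retrieve(user_input)
    demos = "\n\n".join(f"User: {e.utterance}\nPlan: {e.plan}" for e in shots)
    return f"{demos}\n\nUser: {user_input}\nPlan:"


if __name__ == "__main__":
    mem = Memory()
    mem.add("make me a cup of coffee",
            "goto(mug); pickup(mug); goto(coffee_machine); place(mug); toggle_on(coffee_machine)")
    mem.add("put the mug in the sink",
            "goto(mug); pickup(mug); goto(sink); place(mug)")
    print(build_prompt(mem, "can you brew some coffee for me"))
    # The prompt would be sent to an LLM; once the generated plan is confirmed,
    # the new (utterance, plan) pair can be added back to memory, personalizing
    # future inferences to this user's phrasing and routines.
```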