Learning to Learn Faster from Human Feedback with Language Model Predictive Control

This paper explores how to improve the teachability of large language models (LLMs) that write robot code. Such models let non-experts direct robot behaviours, modify them based on feedback, and compose them to perform new tasks. Their adaptability, however, is limited to short-term interactions: over longer interactions, the model can forget earlier user feedback. The authors propose fine-tuning these LLMs to remember their in-context interactions so that they adapt more readily to human input. Their framework, Language Model Predictive Control (LMPC), fine-tunes the models to improve teachability across a range of tasks and robot embodiments. The authors report a significant improvement in non-expert teaching success rates and a reduction in the average number of human corrections.

Publication date: 20 Feb 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2402.11450
