The article discusses the potential of Large Language Models (LLMs) to understand and mimic human behaviour in Human-Robot Interaction scenarios. It examines whether LLMs can effectively act as a human proxy when assessing a robot's behaviour, focusing on four behaviour types crucial for synthesizing interpretable robot behaviours: explicable, legible, predictable, and obfuscatory. The authors ran a series of tests on GPT-4 and GPT-3.5-turbo and found that while the models showed promise in certain areas, they failed to respond consistently when trivial or irrelevant perturbations were introduced into the context.

Publication date: 12 Jan 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2401.05302