The study investigates whether incorporating human preferences into Reinforcement Learning (RL) can enhance the transparency of robot behaviours. A shielding mechanism is integrated into the RL algorithm to monitor the learning agent's decisions and enforce human preferences. A user study with 26 participants assessed the robot's transparency in terms of Legibility, Predictability, and Expectability. The results suggest that accounting for human preferences during learning improves Legibility and overall transparency, and that increased transparency in turn enhances the perceived safety, comfort, and reliability of the robot. The findings underscore the importance of transparency in learning and propose a paradigm for robotic applications with a human in the loop.
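
The abstract does not detail the shield's implementation, but the core idea can be illustrated with a minimal sketch: a shield sits between the agent's action selection and the environment and vetoes actions that violate a stated human preference, here shown with tabular Q-learning. The `ToyEnv`, state names, preference table, and all function names below are illustrative assumptions, not the paper's code.

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right", "forward", "wait"]

# Hypothetical human preferences: per-state sets of disallowed actions.
human_preferences = {
    "near_human": {"forward"},  # e.g. prefer that the robot not advance toward a person
}

def shield(state, proposed_action, q):
    """Pass the proposed action through unless a preference vetoes it;
    otherwise fall back to the best-valued permitted action."""
    forbidden = human_preferences.get(state, set())
    if proposed_action not in forbidden:
        return proposed_action
    allowed = [a for a in ACTIONS if a not in forbidden]
    if not allowed:  # nothing permitted: let the proposal stand
        return proposed_action
    return max(allowed, key=lambda a: q[(state, a)])

def epsilon_greedy(state, q, epsilon=0.1):
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])

class ToyEnv:
    """Tiny stand-in environment: two states, fixed-length episodes."""
    def reset(self):
        self.t = 0
        return "near_human"
    def step(self, action):
        self.t += 1
        reward = 1.0 if action == "wait" else 0.0
        state = "clear" if self.t % 2 else "near_human"
        return state, reward, self.t >= 10

def train(env, episodes=500, alpha=0.1, gamma=0.99):
    q = defaultdict(float)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            proposed = epsilon_greedy(state, q)
            action = shield(state, proposed, q)  # preferences applied here
            next_state, reward, done = env.step(action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
    return q

q = train(ToyEnv())
```

Because the shield intervenes before the action is executed, the agent only ever experiences preference-compliant behaviour, which is one plausible way such a mechanism could make the learned policy more legible to observers.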


Publication date: 29 Nov 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2311.16838