The paper examines whether scalar feedback can be more effective than binary feedback in reinforcement learning. Binary feedback has been more widely used because it is simple and less noisy, whereas scalar feedback spans a wider range of values and can therefore carry more detailed information. The drawback is that scalar feedback is generally considered noisier and less stable. To address this, the authors introduce STEADY, a method that stabilizes scalar feedback and improves learning from it. In a robot task, models trained with scalar feedback and STEADY outperform both models trained with binary feedback and models trained with raw scalar feedback.

Publication date: 20 Nov 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2311.10284