Score Models for Offline Goal-Conditioned Reinforcement Learning

The study introduces SMORe, a novel approach to offline goal-conditioned reinforcement learning (GCRL). GCRL matters for building generalist agents that learn diverse skills from existing datasets without hand-engineered reward functions, but existing GCRL approaches often underperform in offline settings. SMORe addresses this by combining the occupancy-matching perspective of GCRL with a convex dual formulation. It learns scores, or unnormalized densities, that represent the importance of taking an action at a state for reaching a particular goal. In the authors’ experiments, SMORe significantly outperforms other methods on robot manipulation and locomotion tasks.
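
To make the idea of a goal-conditioned score concrete, here is a minimal sketch in Python. It is not the SMORe algorithm: the score function `toy_score` is a hand-written, hypothetical stand-in for a learned model, and the grid-world states, the action set `ACTIONS`, and the helper `select_action` are all assumptions made for illustration. It only shows how an unnormalized score over (state, action, goal) triples could be used to rank actions.

```python
import math

# Hypothetical 2D grid world: states and goals are (x, y) tuples,
# actions are unit steps. None of this comes from the paper.
ACTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def toy_score(state, action, goal):
    """Hand-written stand-in for a learned score model.

    Returns a non-negative, unnormalized value that is larger when
    taking `action` in `state` lands closer to `goal`.
    """
    next_state = (state[0] + action[0], state[1] + action[1])
    return math.exp(-math.dist(next_state, goal))


def select_action(state, goal, candidate_actions, score_fn=toy_score):
    """Pick the candidate action with the highest score for this goal."""
    return max(candidate_actions, key=lambda a: score_fn(state, a, goal))


if __name__ == "__main__":
    # The same state yields different actions for different goals.
    print(select_action((0, 0), (3, 0), ACTIONS))
    print(select_action((0, 0), (0, -2), ACTIONS))
```

Because the scores are unnormalized, only their relative order matters here; a learned model would replace `toy_score`, and a real method would extract a policy from the scores rather than enumerate a fixed action list.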

Publication date: 03 Nov 2023
Project Page: https://arxiv.org/abs/2311.02013v1
Paper: https://arxiv.org/pdf/2311.02013
