The article discusses the use of pretrained Vision-Language Models (VLMs) as zero-shot reward models (RMs) for Reinforcement Learning (RL). The authors propose a method, called VLM-RMs, that uses these models to specify tasks via natural language, eliminating the need for manually specified reward functions: the reward is derived from how well the agent's rendered observations match a text description of the task according to the VLM. The method was tested on a simulated MuJoCo humanoid, which was trained to perform complex tasks such as kneeling and doing the splits from single-sentence text prompts. The study found that larger VLMs, trained with more compute and data, make better reward models.
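The core idea can be illustrated with a minimal sketch: use a pretrained CLIP model as the VLM, embed both the rendered frame and the task prompt, and treat their cosine similarity as the reward. This is not the authors' released code; the checkpoint name, function names, and example prompt below are illustrative assumptions.

```python
# Sketch: a CLIP-based zero-shot reward model (illustrative, not the paper's code).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")
model.eval()


@torch.no_grad()
def vlm_reward(frame: Image.Image, prompt: str) -> float:
    """Score how well a rendered observation matches a natural-language task prompt."""
    inputs = processor(text=[prompt], images=frame, return_tensors="pt", padding=True)
    image_emb = model.get_image_features(pixel_values=inputs["pixel_values"])
    text_emb = model.get_text_features(
        input_ids=inputs["input_ids"], attention_mask=inputs["attention_mask"]
    )
    # Normalize embeddings and use cosine similarity as the scalar reward.
    image_emb = image_emb / image_emb.norm(dim=-1, keepdim=True)
    text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)
    return (image_emb * text_emb).sum(dim=-1).item()


# Hypothetical usage: score a rendered humanoid frame against a task prompt.
# frame = Image.fromarray(env.render())
# r = vlm_reward(frame, "a humanoid robot kneeling")
```

In an RL loop, this scalar would replace the hand-written reward at each step (or at the end of each episode), so the only task specification the practitioner provides is the sentence itself.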

Publication date: 19 Oct 2023
Project Page: https://sites.google.com/view/vlm-rm
Paper: https://arxiv.org/pdf/2310.12921