The research focuses on the challenges of offline reinforcement learning (RL), particularly its vulnerability to data corruption. The authors evaluate a range of offline RL algorithms under different types of data corruption and find that Implicit Q-learning (IQL) is relatively resilient overall, yet remains susceptible to corruption of the dynamics. To address this, they propose a robust variant of IQL (RIQL) that uses the Huber loss and quantile estimators to balance the penalty on corrupted data against learning stability. Experiments show that RIQL remains robust across a variety of data corruption scenarios.
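For intuition, below is a minimal PyTorch-style sketch of the two robustness ingredients named in the summary (a Huber loss in place of squared-error value regression, and a quantile over an ensemble of Q estimates). The function names, the combination with IQL's expectile weight, and all hyperparameters (`expectile`, `huber_delta`, `quantile`) are illustrative assumptions, not the authors' implementation.

```python
# Sketch only: assumes a PyTorch setup with an ensemble of Q-networks and an
# IQL-style value network; names and hyperparameters are hypothetical.
import torch
import torch.nn.functional as F


def quantile_q_target(q_values: torch.Tensor, quantile: float = 0.25) -> torch.Tensor:
    """Aggregate an ensemble of Q estimates with a low quantile instead of a mean,
    down-weighting estimates inflated by corrupted transitions.
    q_values: shape (num_ensemble, batch)."""
    return torch.quantile(q_values, quantile, dim=0)


def robust_value_loss(value_net, obs: torch.Tensor, q_target: torch.Tensor,
                      expectile: float = 0.7, huber_delta: float = 1.0) -> torch.Tensor:
    """IQL-style expectile value regression with the squared error replaced by a
    Huber loss, so large (possibly corrupted) errors incur a linear rather than
    quadratic penalty."""
    v = value_net(obs).squeeze(-1)
    diff = q_target - v
    # Asymmetric expectile weight, as in IQL.
    weight = torch.where(diff > 0,
                         torch.full_like(diff, expectile),
                         torch.full_like(diff, 1.0 - expectile))
    # Huber loss: quadratic for small residuals, linear beyond huber_delta.
    huber = F.huber_loss(v, q_target, delta=huber_delta, reduction="none")
    return (weight * huber).mean()
```

In this sketch, using a low quantile of the Q ensemble (rather than a mean or hard minimum) is one way to trade off pessimism toward corrupted data against keeping enough signal for stable learning, which is the balance the summary attributes to RIQL.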
Publication date: 20 Oct 2023
Paper: https://arxiv.org/pdf/2310.12955