The article introduces the Critic-Guided Decision Transformer (CGDT), a novel approach to offline reinforcement learning (RL). Traditional Return-Conditioned Supervised Learning (RCSL) methods struggle in stochastic environments and with diverse future trajectory distributions. CGDT addresses these issues by combining the predictability of long-term returns from value-based methods with the trajectory-modeling capability of the Decision Transformer. It incorporates a learned value function, the critic, into action prediction, ensuring that target returns align with the expected returns of the predicted actions. This resolves the inconsistency between sampled returns within individual trajectories and expected returns across multiple trajectories. Empirical evaluations show CGDT's superiority over traditional RCSL methods both in stochastic environments and on D4RL benchmark datasets.
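
The summary describes the critic-guidance mechanism in prose only; the sketch below is a minimal PyTorch illustration of the general idea, not the paper's exact objective or architecture. The names (`Critic`, `cgdt_actor_loss`, `guidance_weight`) and the specific loss form are hypothetical assumptions: a behavior-cloning term keeps predicted actions close to the data, while a critic term pulls the expected return of the predicted action toward the conditioning target return.

```python
import torch
import torch.nn as nn


class Critic(nn.Module):
    """Learned value function Q(s, a): estimates the expected return
    of taking action a in state s (hypothetical architecture)."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # Concatenate state and action, predict a scalar expected return.
        return self.net(torch.cat([state, action], dim=-1)).squeeze(-1)


def cgdt_actor_loss(
    predicted_actions: torch.Tensor,  # actions output by the transformer policy
    dataset_actions: torch.Tensor,    # actions from the offline dataset
    states: torch.Tensor,
    target_returns: torch.Tensor,     # returns-to-go the policy is conditioned on
    critic: Critic,
    guidance_weight: float = 1.0,     # hypothetical trade-off coefficient
) -> torch.Tensor:
    """Behavior cloning plus critic guidance: the critic term aligns the
    expected return of the predicted action with the target return."""
    bc_loss = ((predicted_actions - dataset_actions) ** 2).mean()
    expected_return = critic(states, predicted_actions)
    guidance_loss = ((expected_return - target_returns) ** 2).mean()
    return bc_loss + guidance_weight * guidance_loss
```

In this reading, the critic supplies exactly what return conditioning alone lacks: an estimate of expected return averaged over environment stochasticity, rather than the single sampled return observed in one trajectory.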

Publication date: 22 Dec 2023
Project Page: Not provided
Paper: https://arxiv.org/pdf/2312.13716