Parallel Q-Learning: Scaling Off-policy Reinforcement Learning under Massively Parallel Simulation

The paper introduces Parallel Q-Learning (PQL), a scheme for scaling off-policy reinforcement learning to massively parallel GPU-based simulation. PQL aims to keep the superior sample efficiency of off-policy learning while outperforming on-policy methods in wall-clock time. It does so by running three processes in parallel: data collection, policy learning, and value learning. This parallel design, built specifically for massively parallel GPU-based simulation, is what distinguishes PQL from previous distributed off-policy learning efforts.
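To make the idea of running data collection, value learning, and policy learning concurrently more concrete, here is a minimal sketch in Python. It is not the authors' implementation (see the project page for that). Everything in it is an assumption made for illustration: the toy chain environment, the tabular Q-function, the use of CPU threads in place of GPU processes, and all names such as `run`, `actor`, `value_learner`, and `policy_learner`.

```python
import random
import threading
import time
from collections import deque

# Hypothetical toy problem: a 5-state chain with a reward at the right end.
N_STATES, N_ACTIONS, GAMMA, LR = 5, 2, 0.9, 0.5

def step(state, action):
    """Action 1 moves right, action 0 moves left; reaching the last state ends the episode."""
    nxt = min(state + 1, N_STATES - 1) if action == 1 else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def run(num_envs=64, actor_steps=200, updates=2000, seed=0):
    buffer = deque(maxlen=10_000)           # shared replay buffer
    lock = threading.Lock()                 # guards the buffer
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    policy = [0] * N_STATES                 # greedy action per state
    counts = {"actor": 0, "value": 0, "policy": 0}
    stop = threading.Event()

    def actor():
        # Steps a batch of environments with the latest policy (epsilon-greedy).
        rng = random.Random(seed)
        states = [0] * num_envs
        for _ in range(actor_steps):
            batch = []
            for i, s in enumerate(states):
                a = rng.randrange(N_ACTIONS) if rng.random() < 0.5 else policy[s]
                s2, r, done = step(s, a)
                batch.append((s, a, r, s2, done))
                states[i] = 0 if done else s2
            with lock:
                buffer.extend(batch)
            counts["actor"] += 1

    def value_learner():
        # Samples replayed transitions and applies Q-learning updates.
        rng = random.Random(seed + 1)
        while counts["value"] < updates:
            with lock:
                data = list(buffer)
            if not data:
                time.sleep(0.001)           # wait for the actor to produce data
                continue
            for s, a, r, s2, done in rng.sample(data, min(32, len(data))):
                target = r + (0.0 if done else GAMMA * max(q[s2]))
                q[s][a] += LR * (target - q[s][a])
            counts["value"] += 1

    def policy_learner():
        # Keeps the policy greedy with respect to the current Q-values.
        while True:
            for s in range(N_STATES):
                policy[s] = max(range(N_ACTIONS), key=q[s].__getitem__)
            counts["policy"] += 1
            if stop.is_set():
                break
            time.sleep(0.001)

    workers = [threading.Thread(target=f) for f in (actor, value_learner)]
    p_thread = threading.Thread(target=policy_learner)
    for t in workers + [p_thread]:
        t.start()
    for t in workers:
        t.join()
    stop.set()                              # let the policy learner do a final sync
    p_thread.join()
    return q, policy, counts, len(buffer)

if __name__ == "__main__":
    q, policy, counts, n = run()
    print("policy:", policy, "updates:", counts, "buffer size:", n)
```

In this sketch no worker waits for the others to complete a step: the actor keeps filling the buffer while both learners update from whatever data and Q-values are currently available. Because the threads interleave differently on each run, the learned values can vary slightly between runs.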

Publication date: July 24, 2023
Project Page: https://github.com/Improbable-AI/pql
Paper: https://arxiv.org/pdf/2307.12983.pdf

Tags: GPU-based Simulation, Machine Learning, Off-policy Learning, Parallel Q-Learning, Reinforcement Learning
