The article discusses ACE (Off-Policy Actor-Critic with Causality-Aware Entropy Regularization), a novel reinforcement learning (RL) algorithm for continuous control. Traditional RL algorithms treat all action dimensions as equally important throughout training, overlooking the varying significance of distinct primitive behaviors during policy learning. ACE addresses this by modeling the causal relationship between individual action dimensions and the reward, yielding per-dimension importance weights that it uses to regularize the policy's entropy, so exploration is concentrated on the primitive behaviors with the highest potential impact at each stage of training. To prevent this weighting from causing excessive focus on specific behaviors, the algorithm also introduces a gradient-dormancy-guided reset mechanism that periodically perturbs the networks toward a fresh initialization. Across diverse continuous control tasks, ACE demonstrates significant performance gains over strong model-free RL baselines, highlighting both its effectiveness and its sample efficiency.
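The causality-aware entropy term can be pictured as a per-dimension reweighting of the usual maximum-entropy bonus. Below is a minimal PyTorch sketch of that idea; the network shapes, the `causal_weights` values, and the 0.2 temperature are illustrative assumptions, and the paper estimates the weights with a learned causal model rather than fixing them by hand.

```python
import torch
import torch.nn as nn

class GaussianPolicy(nn.Module):
    """Minimal diagonal-Gaussian policy head (illustrative only)."""
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.mu = nn.Linear(state_dim, action_dim)
        self.log_std = nn.Parameter(torch.zeros(action_dim))

    def forward(self, states):
        return torch.distributions.Normal(self.mu(states), self.log_std.exp())

def causality_aware_entropy(dist, causal_weights):
    """Per-dimension policy entropies weighted by each action dimension's
    (externally estimated) causal influence on the reward."""
    return (causal_weights * dist.entropy()).sum(dim=-1)  # shape [batch]

# Example: 4-dim actions where dimension 2 currently matters most.
policy = GaussianPolicy(state_dim=8, action_dim=4)
states = torch.randn(32, 8)
causal_weights = torch.tensor([0.1, 0.2, 0.5, 0.2])  # assumed, sums to 1
dist = policy(states)
h_c = causality_aware_entropy(dist, causal_weights)   # shape [32]
actor_bonus = 0.2 * h_c.mean()  # would be added to a SAC-style actor objective
```

Action dimensions with larger causal weight receive a larger entropy bonus, so the agent explores more along the behaviors that currently influence reward the most.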
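The dormancy-guided reset can likewise be sketched as a soft interpolation toward a fresh initialization, scaled by how "dormant" the network's gradients have become. The gradient threshold, the reset schedule, and the restriction to `nn.Linear` layers below are assumptions for illustration, not the paper's exact recipe.

```python
import torch
import torch.nn as nn

def gradient_dormancy_ratio(model, threshold=1e-3):
    """Fraction of parameters whose gradient magnitude falls below a small
    threshold -- a rough proxy for dormant units (threshold is illustrative)."""
    dormant, total = 0, 0
    for p in model.parameters():
        if p.grad is None:
            continue
        dormant += (p.grad.abs() < threshold).sum().item()
        total += p.grad.numel()
    return dormant / max(total, 1)

@torch.no_grad()
def dormancy_guided_reset(model, dormancy_ratio, eta=0.9):
    """Softly interpolate weights toward a fresh random init; the more
    dormant the network, the stronger the perturbation."""
    factor = min(1.0, eta * dormancy_ratio)  # assumed schedule
    for m in model.modules():
        if isinstance(m, nn.Linear):
            fresh = nn.Linear(m.in_features, m.out_features,
                              bias=m.bias is not None)
            m.weight.mul_(1 - factor).add_(factor * fresh.weight)
            if m.bias is not None:
                m.bias.mul_(1 - factor).add_(factor * fresh.bias)
```

Applying such a reset periodically, after gradients have been computed on a training batch, keeps the networks plastic without discarding learned behavior wholesale.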
Publication date: 23 Feb 2024
Project Page: https://ace-rl.github.io/
Paper: https://arxiv.org/pdf/2402.14528