This research paper examines how causal knowledge can improve the interpretability of reinforcement learning (RL) agents’ decision-making. The authors propose a framework that alternates between two steps: using interventions to learn the causal structure during exploration, and using the learned causal structure to guide the policy during exploitation. The approach is evaluated in a simulated fault alarm environment, where it is shown to be effective and robust compared with other methods. The authors attribute the performance improvement to the cycle of causal-guided policy learning and causal structure learning.
Publication date: 7 Feb 2024
Project Page: https://arxiv.org/abs/2402.04869v1
Paper: https://arxiv.org/pdf/2402.04869
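
The alternation described in the summary can be illustrated with a toy sketch. This is not the authors' algorithm: the four-alarm graph, the function names (`intervene`, `guided_action`, `run`), and the greedy "pick the alarm nothing else explains" rule are all assumptions made for illustration. Recording which alarms follow an intervention stands in for causal structure learning, and the greedy rule stands in for causal-guided policy learning.

```python
# Hypothetical ground-truth alarm graph (hidden from the agent): parent -> children.
TRUE_EDGES = {3: [1, 2], 1: [0], 2: [], 0: []}
NODES = sorted(TRUE_EDGES)


def descendants(edges, node):
    """Return every alarm reachable from `node`, excluding `node` itself."""
    seen, stack = set(), list(edges[node])
    while stack:
        n = stack.pop()
        if n not in seen:
            seen.add(n)
            stack.extend(edges[n])
    return seen


def intervene(node):
    """Toy intervention: trigger one alarm and observe which alarms follow."""
    return descendants(TRUE_EDGES, node)


def guided_action(learned, active):
    """Pick the lowest-numbered active alarm that no other active alarm is known to cause."""
    candidates = [
        a for a in sorted(active)
        if not any(a in learned.get(b, set()) for b in active if b != a)
    ]
    return candidates[0]


def run(rounds, fault_root=3):
    """Alternate one exploration step and one exploitation step per round."""
    learned, rewards = {}, []
    for t in range(rounds):
        # Exploration: intervene on one alarm and update the learned structure.
        probe = NODES[t % len(NODES)]
        learned[probe] = intervene(probe)
        # Exploitation: a fault fires; use the learned structure to localize it.
        active = {fault_root} | descendants(TRUE_EDGES, fault_root)
        rewards.append(int(guided_action(learned, active) == fault_root))
    return learned, rewards


if __name__ == "__main__":
    learned, rewards = run(5)
    print(rewards)
```

With this graph and probe order, `run(5)` returns the rewards `[0, 0, 0, 1, 1]`: the guided choice becomes correct only once the intervention on alarm 3 has revealed that it causes the other alarms. This shows, in miniature, how better structure knowledge improves the policy.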