This paper presents ACTER (Actionable CounTerfactual Sequences for Explaining Reinforcement Learning Outcomes), an algorithm for generating counterfactual action sequences that help users understand and prevent failures of reinforcement learning agents. ACTER uses NSGA-II, a multi-objective evolutionary algorithm, to find sequences of actions that avert failure with minimal changes to the original, failing sequence, and it remains effective even in stochastic environments. Because it returns multiple diverse counterfactual sequences, users can choose a correction that suits their preferences; the authors also introduce three metrics for quantifying this diversity. ACTER is evaluated in two reinforcement learning environments, with both discrete and continuous actions.
Publication date: 12 Feb 2024
Project Page: Not provided
Paper: https://arxiv.org/pdf/2402.06503
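To make the NSGA-II idea concrete, here is a minimal, self-contained sketch (not the authors' implementation) of the core selection step: candidate counterfactual action sequences are scored on two hypothetical ACTER-style objectives, proximity (how few actions differ from the original failing sequence) and outcome (whether the edited sequence still fails under a toy failure model), and the Pareto-optimal, mutually non-dominated candidates are kept. The `ORIGINAL` sequence and the failure criterion are invented for illustration only.

```python
import random

# Toy failing action sequence (discrete actions); purely illustrative.
ORIGINAL = [0, 1, 0, 0, 1, 0]

def objectives(candidate):
    """Two objectives to minimize: (edits from original, failure indicator)."""
    changes = sum(a != b for a, b in zip(candidate, ORIGINAL))
    # Toy failure model: the sequence "fails" unless it contains >= 3 ones.
    fails = 0 if sum(candidate) >= 3 else 1
    return (changes, fails)

def dominates(p, q):
    """p dominates q if p is no worse in every objective and better in one."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def pareto_front(population):
    """Return the rank-0 (non-dominated) front, as in NSGA-II sorting."""
    scored = [(c, objectives(c)) for c in population]
    return [c for c, f in scored
            if not any(dominates(g, f) for _, g in scored)]

random.seed(0)
population = [[random.randint(0, 1) for _ in ORIGINAL] for _ in range(30)]
front = pareto_front(population)
```

A full NSGA-II run would additionally apply crossover, mutation, and crowding-distance selection over many generations; libraries such as pymoo provide complete implementations. The sketch shows why multiple counterfactuals survive: a candidate with fewer edits that still fails and a candidate with more edits that succeeds do not dominate each other, so both remain available for the user to choose between.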