This paper addresses the challenge of enforcing safety in Reinforcement Learning (RL), which is crucial for applying RL in real-world scenarios. The authors propose to overcome the non-contractiveness of safety critic operators by treating safety as a binary property. Because a non-contractive operator can have more than one fixed point, they characterize the properties of the binary safety critic for a deterministic system that must avoid unsafe regions, and they provide an algorithm that uses knowledge of safe data to avoid spurious fixed points. The study contributes to the development of safety-critical systems and safe reinforcement learning.
Publication date: 24 Jan 2024
Project Page: Not Provided
Paper: https://arxiv.org/pdf/2401.12849
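
To make the idea of a spurious fixed point concrete, here is a minimal toy sketch. It is not the paper's algorithm: the five-state chain, the function names, and the exact form of the operator (a state-action pair is flagged unsafe if its successor is unsafe, or if every action from the successor is flagged) are assumptions chosen for illustration only.

```python
# Toy deterministic chain: states 0..4, state 4 is unsafe and absorbing.
# All names and the operator form are hypothetical, for illustration only.
N_STATES = 5
UNSAFE = {4}
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(s, a):
    """Deterministic transition; the unsafe state is absorbing."""
    if s in UNSAFE:
        return s
    return max(s - 1, 0) if a == 0 else min(s + 1, N_STATES - 1)


def binary_bellman(b):
    """Apply the assumed binary operator once.

    b[(s, a)] is 1 if taking action a in state s is flagged unsafe, else 0.
    """
    new_b = {}
    for s in range(N_STATES):
        for a in ACTIONS:
            s_next = step(s, a)
            # Best case over next actions: 0 if any next action is unflagged.
            best_next = min(b[(s_next, a2)] for a2 in ACTIONS)
            new_b[(s, a)] = max(int(s_next in UNSAFE), best_next)
    return new_b


def iterate(b, max_iters=100):
    """Repeat the operator until the table stops changing."""
    for _ in range(max_iters):
        nb = binary_bellman(b)
        if nb == b:
            return b
        b = nb
    return b


pairs = [(s, a) for s in range(N_STATES) for a in ACTIONS]
all_zero = {p: 0 for p in pairs}
all_one = {p: 1 for p in pairs}

b_from_zero = iterate(all_zero)  # flags only pairs that truly lead to state 4
b_from_one = iterate(all_one)    # stays at "everything is unsafe"
```

In this toy, both tables are fixed points of the same operator, so the operator cannot be a contraction. The all-ones table is the spurious one: it flags `(0, 0)` as unsafe even though that pair keeps the system at state 0 forever, which is the kind of known-safe information the summary says the algorithm exploits.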