This article covers a paper on shielding for reinforcement learning (RL) in continuous environments. Classical shielding approaches typically require a discrete abstraction or an exact model of the environment, which makes them hard to apply in complex continuous settings. The authors extend the approximate model-based shielding (AMBS) framework to such settings, using Safety Gym as the test-bed, and provide probabilistic safety guarantees for the shielded agent. They also propose two new penalty techniques that directly modify the policy gradient, which their experiments show leads to more stable convergence. Overall, the paper offers a practical route to enforcing safety in complex, continuous RL environments.
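To make the penalty idea concrete, here is a minimal sketch of a shield-penalized policy gradient. This is not the authors' implementation: it assumes a learned shield that estimates the probability of a safety violation for a state-action pair, and all names (`policy`, `shield`, `penalty_coef`) are hypothetical.

```python
# Hypothetical sketch of a shield-penalized policy gradient (not the paper's code).
# Assumes: `policy(states)` returns a torch.distributions object over actions, and
# `shield(states, actions)` estimates the probability of a safety violation in [0, 1].
import torch

def penalized_pg_loss(policy, shield, states, actions, advantages, penalty_coef=1.0):
    """REINFORCE-style loss with an additive safety penalty on the advantage.

    The shield's estimated violation probability, scaled by penalty_coef, is
    subtracted from the task advantage, so actions the shield deems unsafe
    receive a reduced (or negative) effective advantage.
    """
    dist = policy(states)
    log_probs = dist.log_prob(actions)
    with torch.no_grad():
        violation_prob = shield(states, actions)  # treated as a fixed penalty signal
    effective_adv = advantages - penalty_coef * violation_prob
    # Negative sign: minimizing this loss maximizes expected return while
    # steering the gradient away from actions with high estimated risk.
    return -(log_probs * effective_adv).mean()
```

The appeal of folding the penalty into the gradient, rather than overriding actions at execution time, is that it requires no discrete enumeration of safe actions, which is what makes this style of shielding compatible with continuous action spaces.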


Publication date: 2 Feb 2024
Project Page: N/A
Paper: https://arxiv.org/pdf/2402.00816