The paper addresses the challenge of tuning the cost-function parameters of Model Predictive Control (MPC) for autonomous vehicles. The authors propose using Reinforcement Learning (RL) to learn context-dependent parameter sets and to adapt them dynamically during operation. The approach is motivated by safety: learning from scratch in a continuous action space can drive the system into unsafe operating states. Instead of acting in a continuous space, the RL agent anticipates the upcoming control task and selects the best option from a discrete set of parameter sets, so that even an untrained agent remains safe and performs well. Experimental results show that the untrained RL-MPC already exhibits Pareto-optimal behavior, and training further improves performance.
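The discrete-action idea can be illustrated with a minimal sketch, assuming a tabular Q-learning agent that picks one of a few pre-validated MPC cost-weight sets per driving context. All names (`PARAM_SETS`, `w_track`, `w_comfort`, the context labels) are illustrative assumptions, not the paper's actual parameterization; the point is only that every selectable action is a safe, hand-checked weight set, so exploration cannot produce an unsafe configuration.

```python
import random

# Hypothetical pre-validated MPC cost-weight sets (the discrete actions).
# Each set trades tracking accuracy against comfort; all are assumed safe.
PARAM_SETS = [
    {"w_track": 10.0, "w_comfort": 0.1},  # aggressive tracking
    {"w_track": 5.0,  "w_comfort": 1.0},  # balanced
    {"w_track": 1.0,  "w_comfort": 5.0},  # comfort-oriented
]


class DiscreteParamAgent:
    """Tabular Q-learning over discrete contexts (e.g. the anticipated
    upcoming maneuver) and discrete parameter-set indices."""

    def __init__(self, contexts, n_actions, alpha=0.1, gamma=0.9, eps=0.1):
        self.q = {(c, a): 0.0 for c in contexts for a in range(n_actions)}
        self.n_actions = n_actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def act(self, context):
        # Epsilon-greedy exploration: even a random pick is a safe,
        # pre-validated parameter set.
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        return max(range(self.n_actions), key=lambda a: self.q[(context, a)])

    def update(self, context, action, reward, next_context):
        # Standard one-step Q-learning update on the tabular values.
        best_next = max(self.q[(next_context, a)] for a in range(self.n_actions))
        td = reward + self.gamma * best_next - self.q[(context, action)]
        self.q[(context, action)] += self.alpha * td


agent = DiscreteParamAgent(contexts=["straight", "curve"],
                           n_actions=len(PARAM_SETS))
a = agent.act("curve")
weights = PARAM_SETS[a]  # hand these weights to the MPC cost function
agent.update("curve", a, reward=1.0, next_context="straight")
```

Because the action space is a small discrete set, the untrained agent's behavior is bounded by the worst of the hand-tuned sets, which is what allows the paper's safety claim for the untrained RL-MPC.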
Publication date: 6 Feb 2024
Project Page: Not Provided
Paper: https://arxiv.org/pdf/2402.02624