Model-Free $δ$-Policy Iteration Based on Damped Newton Method for Nonlinear Continuous-Time H$_\infty$ Tracking Control

This article introduces a new algorithm, based on the damped Newton method, for the H$_\infty$ tracking control problem of unknown continuous-time nonlinear systems. The algorithm, named $δ$-Policy Iteration ($δ$-PI), is derived from a generalized tracking Bellman equation and can find the optimal solution of the tracking Hamilton-Jacobi-Isaacs (HJI) equation. Two $δ$-PI reinforcement learning methods, on-policy and off-policy, are detailed. The off-policy $δ$-PI algorithm is model-free: it does not require prior knowledge of the system dynamics. The effectiveness of the algorithm is demonstrated with a nonlinear system simulation.
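To give a feel for the numerical idea behind the name, here is a minimal sketch of a damped Newton iteration on a scalar equation. It is not the paper's $δ$-PI algorithm, which operates on value functions of the tracking HJI equation. The test function `atan`, the starting point `x0 = 3.0`, and the damping factor `delta = 0.3` are assumptions chosen purely for illustration.

```python
import math


def damped_newton(f, df, x0, delta, iters):
    """Iterate x <- x - delta * f(x) / df(x).

    delta = 1 gives the plain Newton step; 0 < delta < 1 shortens it.
    Returns the list of all iterates, starting with x0.
    """
    xs = [x0]
    for _ in range(iters):
        x = xs[-1]
        xs.append(x - delta * f(x) / df(x))
    return xs


# Illustrative problem (an assumption, not from the paper): solve atan(x) = 0.
f = math.atan


def df(x):
    return 1.0 / (1.0 + x * x)


x0 = 3.0  # deliberately far from the root at x = 0

plain = damped_newton(f, df, x0, delta=1.0, iters=4)    # full Newton steps
damped = damped_newton(f, df, x0, delta=0.3, iters=60)  # damped steps

print("plain Newton, |x| after 4 steps:", abs(plain[-1]))
print("damped Newton, |x| after 60 steps:", abs(damped[-1]))
```

From this starting point the full Newton step overshoots and the iterates move away from the root, while the shortened step converges, at the cost of a slower (linear) rate near the solution. That trade-off between robustness to the initial guess and speed is what a damping factor controls.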

Publication date: 2023-12-10
Project Page: http://ieeexplore.ieee.org
Paper: https://arxiv.org/pdf/2401.12882
