Abstract

This article presents a novel efficient experience-replay-based adaptive dynamic programming (ADP) for the optimal control problem of a class of nonlinear dynamical systems within the Hamiltonian-driven framework. The quasi-Hamiltonian is presented for the policy evaluation problem with an admissible policy. With the quasi-Hamiltonian, a novel composite critic learning mechanism is developed to combine the instantaneous data with the historical data. In addition, the pseudo-Hamiltonian is defined to deal with the performance optimization problem. Based on the pseudo-Hamiltonian, the conventional Hamilton–Jacobi–Bellman (HJB) equation can be represented in a filtered form, which can be implemented online. Theoretical analysis is investigated in terms of the convergence of the adaptive critic design and the stability of the closed-loop systems, where parameter convergence can be achieved under a weakened excitation condition. Simulation studies are investigated to verify the efficacy of the presented design scheme.

Department(s)

Electrical and Computer Engineering

Keywords and Phrases

Convergence; Dynamic programming; Hamiltonian-driven adaptive dynamic programming (ADP); Hamilton–Jacobi–Bellman (HJB) equation; Iterative algorithms; Learning systems; Mathematical models; Optimal control; Optimization; pseudo-Hamiltonian; quasi-Hamiltonian; relaxed excitation condition

International Standard Serial Number (ISSN)

2162-2388; 2162-237X

Document Type

Article - Journal

Document Version

Final Version

File Type

text

Language(s)

English

Rights

© 2023 Institute of Electrical and Electronics Engineers, All rights reserved.

Publication Date

01 Jan 2022

Share

 
COinS