Hamiltonian-Driven Adaptive Dynamic Programming for Continuous Nonlinear Dynamical Systems
Abstract
This paper presents a Hamiltonian-driven framework of adaptive dynamic programming (ADP) for continuous-time nonlinear systems. The framework consists of three steps: evaluating an admissible control, comparing two different admissible policies with respect to their corresponding performance functions, and improving the performance of an admissible control. It is shown that the Hamiltonian can serve as the temporal difference for continuous-time systems. In the Hamiltonian-driven ADP, the critic network is trained to output the value gradient. The inner product between the critic output and the system dynamics then produces the value derivative. Under some conditions, minimizing the Hamiltonian functional is equivalent to approximating the value function. An iterative algorithm that starts from an arbitrary admissible control is presented for approximating the optimal control, together with its convergence proof. The algorithm is implemented using neural network approximation. Two simulation studies demonstrate the effectiveness of Hamiltonian-driven ADP.
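The role of the Hamiltonian as a temporal difference can be illustrated with a minimal Python sketch. It is not taken from the paper: the scalar linear system, the quadratic running cost, the gain K, and all function names are hypothetical choices made so that the exact value function is known in closed form. The Hamiltonian is computed as the running cost plus the product of the value gradient and the system dynamics. It vanishes when the critic returns the exact value gradient of the evaluated policy and is nonzero otherwise, so it can act as an error signal for the critic.

```python
# Hypothetical scalar example (an assumption, not the paper's simulation):
# dynamics dx/dt = A*x + B*u, running cost Q*x^2 + R*u^2.
A, B, Q, R = -1.0, 1.0, 1.0, 1.0
K = 0.5  # linear policy u = -K*x; admissible here because A - B*K < 0


def policy(x):
    return -K * x


def dynamics(x, u):
    return A * x + B * u


def running_cost(x, u):
    return Q * x * x + R * u * u


# For this policy the value function is V(x) = P*x^2, where P solves
# 2*P*(A - B*K) + Q + R*K^2 = 0.
P = (Q + R * K * K) / (2.0 * (B * K - A))


def exact_value_gradient(x):
    # dV/dx for V(x) = P*x^2
    return 2.0 * P * x


def imperfect_critic(x):
    # A deliberately wrong critic: overestimates the gradient by 20 percent.
    return 1.2 * exact_value_gradient(x)


def hamiltonian(x, value_gradient):
    # Running cost plus (value gradient) * (dynamics); in the scalar case the
    # inner product reduces to an ordinary product.
    u = policy(x)
    return running_cost(x, u) + value_gradient(x) * dynamics(x, u)


if __name__ == "__main__":
    x = 2.0
    print("Hamiltonian with exact gradient:  ", hamiltonian(x, exact_value_gradient))
    print("Hamiltonian with imperfect critic:", hamiltonian(x, imperfect_critic))
```

In this sketch, driving the Hamiltonian toward zero over sampled states would push the imperfect critic toward the exact value gradient, which is the sense in which the residual plays the part of a temporal difference.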
Recommended Citation
Y. Yang et al., "Hamiltonian-Driven Adaptive Dynamic Programming for Continuous Nonlinear Dynamical Systems," IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 8, pp. 1929-1940, Institute of Electrical and Electronics Engineers (IEEE), Aug 2017.
The definitive version is available at https://doi.org/10.1109/TNNLS.2017.2654324
Department(s)
Electrical and Computer Engineering
Research Center/Lab(s)
Intelligent Systems Center
Second Research Center/Lab
Center for High Performance Computing Research
Keywords and Phrases
Adaptive control systems; Approximation algorithms; Continuous time systems; Dynamical systems; Hamiltonians; Iterative methods; Nonlinear control systems; Nonlinear dynamical systems; Adaptive dynamic programming; Continuous time nonlinear systems; Hamiltonian functional; Iterative algorithm; Neural network approximation; Performance functions; Temporal differences; Value function approximation; Dynamic programming; Adaptive dynamic programming (ADP); Convergence proof; Hamiltonian-driven framework; Neural network (NN) approximation; Value function
International Standard Serial Number (ISSN)
2162-237X; 2162-2388
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2017 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
01 Aug 2017
Comments
This work was supported in part by the Mary K. Finley Missouri Endowment, in part by the Missouri S&T Intelligent Systems Center, in part by the National Science Foundation, in part by the National Natural Science Foundation of China under Grant 61333002, and in part by the China Scholarship Council under Grant 201406460057.