Adaptive Dynamic Programming in the Hamiltonian-Driven Framework
This chapter presents a Hamiltonian-driven framework of adaptive dynamic programming (ADP) for continuous-time nonlinear systems. Three fundamental problems in solving the optimal control problem are addressed: the evaluation of a given admissible policy, the comparison of two different admissible policies with respect to performance, and the performance improvement of a given admissible control. It is shown that the Hamiltonian functional can be viewed as the temporal difference for dynamical systems in continuous time. Therefore, the minimization of the Hamiltonian functional is equivalent to value function approximation. An iterative algorithm that starts from an arbitrary admissible control is presented for approximating the optimal control, together with its convergence proof. The Hamiltonian-driven ADP algorithm can be implemented using a critic-only structure, which is trained to approximate the optimal value gradient. A simulation example is conducted to verify the effectiveness of Hamiltonian-driven ADP.
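The three steps named in the abstract (policy evaluation, policy comparison, policy improvement) can be illustrated on a toy problem. The following is a minimal sketch, not the chapter's algorithm or simulation: it assumes a scalar linear system dx/dt = a*x + b*u with running cost q*x^2 + r*u^2, a linear policy u = -k*x, and a quadratic value function V(x) = p*x^2. All names and parameter values are hypothetical, and the closed-form evaluation step stands in for the critic network described in the chapter.

```python
import math


def hamiltonian(x, u, p, a, b, q, r):
    """H(u; x, V) = running cost + (dV/dx) * dynamics, with V(x) = p * x**2."""
    return q * x**2 + r * u**2 + 2.0 * p * x * (a * x + b * u)


def evaluate_policy(k, a, b, q, r):
    """Policy evaluation: solve H(-k*x; x, V) = 0 for p.

    Only valid when the closed loop a - b*k is stable (admissible policy).
    """
    if a - b * k >= 0:
        raise ValueError("gain k is not stabilizing, so the policy is not admissible")
    return (q + r * k**2) / (2.0 * (b * k - a))


def improve_policy(p, b, r):
    """Policy improvement: minimizing H over u gives u = -(b*p/r)*x."""
    return b * p / r


def policy_iteration(k0, a, b, q, r, tol=1e-12, max_iter=100):
    """Alternate evaluation and improvement, starting from an admissible gain k0."""
    k = k0
    history = []  # value coefficient p of each evaluated policy
    for _ in range(max_iter):
        p = evaluate_policy(k, a, b, q, r)
        history.append(p)
        k_new = improve_policy(p, b, r)
        converged = abs(k_new - k) < tol
        k = k_new
        if converged:
            break
    return p, k, history


def riccati_solution(a, b, q, r):
    """Closed-form optimal p for the scalar problem, used only as a reference."""
    return r * (a + math.sqrt(a**2 + b**2 * q / r)) / b**2


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 1.0, 1.0  # hypothetical plant and cost weights
    p, k, history = policy_iteration(3.0, a, b, q, r)
    print("value coefficients per iteration:", history)
    print("converged p:", p, "reference p:", riccati_solution(a, b, q, r))
```

In this sketch, policy comparison reduces to comparing the coefficients p of successive policies: a smaller p means a lower cost from every initial state. At convergence the Hamiltonian evaluated under the final policy is numerically zero, which mirrors the role of the Hamiltonian as a continuous-time temporal difference.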
Y. Yang et al., "Adaptive Dynamic Programming in the Hamiltonian-Driven Framework," Handbook of Reinforcement Learning and Control, vol. 325, pp. 189-214, Springer, Jan 2021.
The definitive version is available at https://doi.org/10.1007/978-3-030-60990-0_7
Electrical and Computer Engineering
Keywords and Phrases
Adaptive Dynamic Programming; Hamiltonian-Driven Framework; Temporal Difference; Value Gradient Learning
International Standard Serial Number (ISSN)
Book - Chapter
© 2021 Springer, All rights reserved.
01 Jan 2021