We discuss a variety of adaptive critic designs (ACDs) for neurocontrol. These are suitable for learning in noisy, nonlinear, and nonstationary environments. They have common roots as generalizations of dynamic programming for neural reinforcement learning approaches. Our discussion of these origins leads to an explanation of three design families: heuristic dynamic programming (HDP), dual heuristic programming (DHP), and globalized dual heuristic programming (GDHP). The main emphasis is on DHP and GDHP as advanced ACDs. We suggest two new modifications of the original GDHP design that are currently the only working implementations of GDHP. They promise to be useful for many engineering applications in the areas of optimization and optimal control. Based on one of these modifications, we present a unified approach to all ACDs. This leads to a generalized training procedure for ACDs.
D. C. Wunsch and D. V. Prokhorov, "Adaptive Critic Designs," IEEE Transactions on Neural Networks, Institute of Electrical and Electronics Engineers (IEEE), Jan 1997.
The definitive version is available at http://dx.doi.org/10.1109/72.623201
Electrical and Computer Engineering
Keywords and Phrases
Adaptive Control; Adaptive Critic Designs; Backpropagation; Duality (Mathematics); Dynamic Programming; Generalisation (Artificial Intelligence); Generalizations; Generalized Training Procedure; Globalized Dual Heuristic Programming; Heuristic Programming; Neural Nets; Neurocontrol; Neurocontrollers; Optimal Control; Optimization; Reinforcement Learning
Article - Conference proceedings
© 1997 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.