Abstract
This paper develops an integral value iteration (VI) method that efficiently finds, online, the Nash equilibrium solution of two-player non-zero-sum (NZS) differential games for linear systems with partially unknown dynamics. To guarantee closed-loop stability at the Nash equilibrium, an explicit upper bound on the discount factor is given. To show the efficacy of the presented online model-free solution, the integral VI method is compared with the model-based offline policy iteration method. Moreover, a detailed theoretical analysis of the integral VI algorithm is provided, covering three aspects: the positive definiteness of the updated cost functions, the stability of the closed-loop system, and the conditions that guarantee monotone convergence. Finally, simulation results demonstrate the efficacy of the presented algorithms.
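For orientation, the setting named in the abstract can be sketched with the textbook discounted linear-quadratic formulation. This is an illustrative sketch only: the symbols A, B_i, Q_i, R_ij, P_i, K_i, the discount factor gamma, and the interval length T are assumptions introduced here, and the article's own notation and exact update rule may differ.

```latex
% Assumed two-player linear dynamics and discounted quadratic costs (i = 1, 2)
\dot{x} = A x + B_1 u_1 + B_2 u_2, \qquad
J_i = \int_0^{\infty} e^{-\gamma t}
      \Big( x^\top Q_i x + \sum_{j=1}^{2} u_j^\top R_{ij} u_j \Big)\, dt

% Quadratic value functions V_i(x) = x^\top P_i x give linear feedback policies
u_i = -K_i x, \qquad K_i = R_{ii}^{-1} B_i^\top P_i, \qquad
A_c = A - B_1 K_1 - B_2 K_2

% Coupled Riccati equations characterizing the Nash equilibrium
0 = \Big(A_c - \tfrac{\gamma}{2} I\Big)^{\!\top} P_i
  + P_i \Big(A_c - \tfrac{\gamma}{2} I\Big)
  + Q_i + \sum_{j=1}^{2} K_j^\top R_{ij} K_j

% A typical integral VI step over [t, t+T], with u_j = -K_j^{k} x applied
x(t)^\top P_i^{k+1} x(t) =
  \int_t^{t+T} e^{-\gamma(\tau - t)}
  \Big( x^\top Q_i x + \sum_{j=1}^{2} u_j^\top R_{ij} u_j \Big)\, d\tau
  + e^{-\gamma T}\, x(t+T)^\top P_i^{k}\, x(t+T)
```

In this form, the integral step uses only measured state and input trajectories over each interval, so the drift matrix A does not appear in the update. The stability condition mentioned in the abstract concerns how large gamma may be before the discounted solution stops guaranteeing a stable closed loop.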
Recommended Citation
Y. Yang et al., "Data-Driven Integral Reinforcement Learning for Continuous-Time Non-Zero-Sum Games," IEEE Access, vol. 7, pp. 82901-82912, Institute of Electrical and Electronics Engineers (IEEE), Jun 2019.
The definitive version is available at https://doi.org/10.1109/ACCESS.2019.2923845
Department(s)
Electrical and Computer Engineering
Research Center/Lab(s)
Center for High Performance Computing Research
Keywords and Phrases
Closed loop systems; Computation theory; Continuous time systems; Cost functions; Game theory; Integral equations; Iterative methods; Linear systems; Machine learning; Online systems; Riccati equations; Closed loop stability; Coupled Riccati equations; Differential games; Monotone convergence; Optimal controls; Policy iteration; Positive definiteness; Zero-sum game; Reinforcement learning; Integral reinforcement learning; Non-zero-sum games
International Standard Serial Number (ISSN)
2169-3536
Document Type
Article - Journal
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2019 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Creative Commons Licensing
This work is licensed under a Creative Commons Attribution 3.0 License.
Publication Date
01 Jun 2019
Comments
This work was supported in part by the National Natural Science Foundation of China under Grants 61873028 and 61333002, in part by the China Postdoctoral Science Foundation under Grant 2018M641197, in part by the Fundamental Research Funds for the China Central Universities of USTB under Grants FRF-TP-18-031A1, FRF-BD-17-002A, and FRF-GF-17-B48, in part by the Mary K. Finley Endowment, in part by the Missouri S&T Intelligent Systems Center, in part by the Lifelong Learning Machines Program from the DARPA/Microsystems Technology Office, and in part by the Army Research Laboratory through a Cooperative Agreement under Grant W911NF-18-2-0260.