This paper develops an integral value iteration (VI) method that efficiently finds, online, the Nash equilibrium solution of two-player non-zero-sum (NZS) differential games for linear systems with partially unknown dynamics. To guarantee closed-loop stability at the Nash equilibrium, an explicit upper bound on the discount factor is given. To show the efficacy of the presented online model-free solution, the integral VI method is compared with the model-based off-line policy iteration method. The integral VI algorithm is also analyzed in detail with respect to three aspects: the positive definiteness of the updated cost functions, the stability of the closed-loop systems, and the conditions that guarantee monotone convergence. Finally, simulation results demonstrate the efficacy of the presented algorithms.
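For orientation, the model-based off-line policy iteration baseline mentioned in the abstract can be sketched for a scalar two-player game. This is a minimal illustration, not the paper's algorithm: it assumes an undiscounted cost J_i = ∫ (q_i x² + r_ii u_i² + r_ij u_j²) dt, fully known dynamics ẋ = a x + b1 u1 + b2 u2, and linear feedback u_i = -k_i x. The function name and all numerical values are hypothetical.

```python
def policy_iteration_scalar(a, b1, b2, q1, q2, r11, r22,
                            r12=0.0, r21=0.0, k1=0.0, k2=0.0, iters=50):
    """Model-based policy iteration for a scalar two-player NZS LQ game.

    Assumed (hypothetical) setup: x' = a*x + b1*u1 + b2*u2, u_i = -k_i*x,
    undiscounted quadratic costs. The initial gains must be stabilizing.
    """
    for _ in range(iters):
        a_c = a - b1 * k1 - b2 * k2          # closed-loop pole
        if a_c >= 0:
            raise ValueError("current gains are not stabilizing")
        # Policy evaluation: scalar Lyapunov equations
        #   2*a_c*p_i + q_i + r_ii*k_i**2 + r_ij*k_j**2 = 0
        p1 = (q1 + r11 * k1**2 + r12 * k2**2) / (-2.0 * a_c)
        p2 = (q2 + r22 * k2**2 + r21 * k1**2) / (-2.0 * a_c)
        # Policy improvement: k_i = b_i * p_i / r_ii
        k1, k2 = b1 * p1 / r11, b2 * p2 / r22
    return p1, p2, k1, k2


# Symmetric example: the coupled Riccati equations reduce to
# 3*p**2 + 2*p - 1 = 0, whose positive root is p = 1/3.
p1, p2, k1, k2 = policy_iteration_scalar(a=-1.0, b1=1.0, b2=1.0,
                                         q1=1.0, q2=1.0, r11=1.0, r22=1.0)
print(p1, p2)
```

Convergence of this iteration is not guaranteed for general non-zero-sum games; the example is chosen so that the update is a contraction near the equilibrium.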
Y. Yang et al., "Data-Driven Integral Reinforcement Learning for Continuous-Time Non-Zero-Sum Games," IEEE Access, vol. 7, pp. 82901-82912, Institute of Electrical and Electronics Engineers (IEEE), Jun 2019.
The definitive version is available at https://doi.org/10.1109/ACCESS.2019.2923845
Electrical and Computer Engineering
Keywords and Phrases
Closed loop systems; Computation theory; Continuous time systems; Cost functions; Game theory; Integral equations; Iterative methods; Linear systems; Machine learning; Online systems; Riccati equations; Closed loop stability; Coupled Riccati equations; Differential games; Monotone convergence; Optimal controls; Policy iteration; Positive definiteness; Zero-sum game; Reinforcement learning; Integral reinforcement learning; Non-zero-sum games
Article - Journal
© 2019 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Creative Commons Licensing
This work is licensed under a Creative Commons Attribution 3.0 License.