Abstract

This paper develops a novel off-policy game Q-learning algorithm to solve the anti-interference control problem for discrete-time linear multi-player systems using only data, without requiring the system matrices to be known. The primary contribution of this paper is that the Q-learning strategy in the proposed algorithm is implemented via off-policy policy iteration rather than on-policy learning, owing to the well-known advantages of off-policy Q-learning over on-policy Q-learning. All players cooperate to minimize their common performance index while counteracting the disturbance, which attempts to maximize that index; the game ultimately reaches a Nash equilibrium that satisfies the disturbance attenuation condition. To find the Nash equilibrium solution, the anti-interference control problem is first transformed into an optimal control problem. Then, an off-policy Q-learning algorithm is proposed within the adaptive dynamic programming (ADP) and game-theoretic framework, so that the control policies of all players can be learned using only measured data. Comparative simulation results are provided to verify the effectiveness of the proposed method.
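For intuition, the sketch below illustrates the general flavor of model-free, off-policy Q-learning policy iteration for a linear-quadratic game between a controller and a disturbance. It is an illustrative simplification (a single control player versus a single disturbance), not the paper's multi-player algorithm; all matrices, gains, and dimensions are hypothetical and chosen only for demonstration.

```python
# Minimal sketch of off-policy Q-learning policy iteration for an LQ game
# (one controller vs. one disturbance). Hypothetical setup, not the paper's method.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical plant x_{k+1} = A x_k + B u_k + D w_k, used only to generate data;
# the learner never accesses A, B, or D.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
D = np.array([[0.1], [0.0]])
n, m, q = 2, 1, 1
Qx, Ru, gamma = np.eye(n), np.eye(m), 5.0   # stage cost: x'Qx + u'Ru - gamma^2 w'w

# Collect a batch of data once, under an exploratory behavior policy (off-policy).
T = 400
data = []
x = rng.normal(size=n)
for _ in range(T):
    u = 0.5 * rng.normal(size=m)             # exploratory control input
    w = 0.1 * rng.normal(size=q)             # exploratory disturbance
    x_next = A @ x + B @ u + D @ w
    data.append((x, u, w, x_next))
    x = x_next

K = np.zeros((m, n))   # control gain, target policy u = -K x
L = np.zeros((q, n))   # disturbance gain, target policy w =  L x

for _ in range(20):
    # Policy evaluation: fit the quadratic Q-function Q(x,u,w) = z' H z, z = [x;u;w],
    # from the Bellman equation  z_k' H z_k - z_{k+1}^t' H z_{k+1}^t = r_k,
    # where z_{k+1}^t re-evaluates the target policies on the measured next state.
    Phi, Y = [], []
    for xk, uk, wk, xk1 in data:
        z = np.concatenate([xk, uk, wk])
        zt = np.concatenate([xk1, -K @ xk1, L @ xk1])
        Phi.append(np.kron(z, z) - np.kron(zt, zt))
        Y.append(xk @ Qx @ xk + uk @ Ru @ uk - gamma**2 * (wk @ wk))
    h, *_ = np.linalg.lstsq(np.asarray(Phi), np.asarray(Y), rcond=None)
    H = h.reshape(n + m + q, n + m + q)
    H = 0.5 * (H + H.T)                       # enforce symmetry

    # Policy improvement: saddle-point (Nash) conditions on the u- and w-blocks of H.
    Hux, Huu, Huw = H[n:n+m, :n], H[n:n+m, n:n+m], H[n:n+m, n+m:]
    Hwx, Hwu, Hww = H[n+m:, :n], H[n+m:, n:n+m], H[n+m:, n+m:]
    M = np.block([[Huu, Huw], [Hwu, Hww]])
    G = -np.linalg.solve(M, np.vstack([Hux, Hwx]))   # [u; w] = G x at the saddle point
    K, L = -G[:m], G[m:]

print("learned control gain K:", K)
print("learned disturbance gain L:", L)
```

Because the Bellman equation holds for every measured tuple (x, u, w, x'), the regression can reuse data generated by an arbitrary exploratory behavior policy, which is what makes the iteration off-policy.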

Department(s)

Electrical and Computer Engineering

Second Department

Computer Science

Publication Status

Full Text Access

Comments

National Natural Science Foundation of China, Grant 2019-KF-03-06

Keywords and Phrases

Game theory; H∞ control; Nash equilibrium; Off-policy Q-learning

International Standard Serial Number (ISSN)

2405-8963

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2024 Elsevier; International Federation of Automatic Control (IFAC), All rights reserved.

Publication Date

01 Jan 2020
