Electrical and Computer Engineering Faculty Research & Creative Works

Reinforcement Q-Learning for Optimal Tracking Control of Linear Discrete-Time Systems with Unknown Dynamics

Bahare Kiumarsi
Frank L. Lewis
Hamidreza Modares, Missouri University of Science and Technology
Ali Karimpour
Mohammad-Bagher Naghibi-Sistani

Abstract

In this paper, a novel approach based on the Q-learning algorithm is proposed to solve the infinite-horizon linear quadratic tracker (LQT) for unknown discrete-time systems in a causal manner. It is assumed that the reference trajectory is generated by a linear command generator system. An augmented system composed of the original system and the command generator is constructed, and it is shown that the value function for the LQT is quadratic in the state of the augmented system. Using the quadratic structure of the value function, a Bellman equation and an augmented algebraic Riccati equation (ARE) for solving the LQT are derived. In contrast to the standard solution of the LQT, which requires solving an ARE and a noncausal difference equation simultaneously, the proposed method obtains the optimal control input by solving only an augmented ARE. A Q-learning algorithm is developed to solve the augmented ARE online without any knowledge of the system dynamics or the command generator. Convergence to the optimal solution is shown. A simulation example is used to verify the effectiveness of the proposed control scheme.
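The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a discounted quadratic tracking cost with discount factor gamma, a hypothetical scalar plant and constant-reference generator, and a batch least-squares policy iteration over randomly sampled transitions rather than an online update along a single trajectory. All names (q_learning_lqt, quad_features, unpack, step) and all numerical values are illustrative assumptions. The learner sees only state, input, and next-state samples of the augmented system; the matrices T and B1 are used solely to simulate data.

```python
import numpy as np

def quad_features(z):
    """Unique quadratic terms of z, scaled so z @ H @ z == quad_features(z) @ h."""
    i, j = np.triu_indices(len(z))
    return np.where(i == j, 1.0, 2.0) * z[i] * z[j]

def unpack(h, p):
    """Rebuild the symmetric p-by-p matrix H from its upper-triangular entries h."""
    H = np.zeros((p, p))
    i, j = np.triu_indices(p)
    H[i, j] = h
    H[j, i] = h
    return H

def q_learning_lqt(step, Q1, R, gamma, n, m, iters=15, samples=200, seed=0):
    """Policy-iteration Q-learning; `step` is a black-box simulator (X, u) -> X_next."""
    rng = np.random.default_rng(seed)
    K = np.zeros((m, n))  # initial policy u = -K X (assumed admissible)
    for _ in range(iters):
        Phi, cost = [], []
        for _ in range(samples):
            X, u = rng.normal(size=n), rng.normal(size=m)  # exploratory sample
            Xn = step(X, u)
            z = np.concatenate([X, u])
            zn = np.concatenate([Xn, -K @ Xn])  # next input from the current policy
            Phi.append(quad_features(z) - gamma * quad_features(zn))
            cost.append(X @ Q1 @ X + u @ R @ u)
        # Policy evaluation: least-squares solve of the Q-function Bellman equation
        h, *_ = np.linalg.lstsq(np.array(Phi), np.array(cost), rcond=None)
        H = unpack(h, n + m)
        # Policy improvement: u = -inv(H_uu) H_uX X
        K = np.linalg.solve(H[n:, n:], H[n:, :n])
    return K, H

# Hypothetical scalar plant and constant-reference generator (illustration only)
A, B, C, F = np.array([[0.9]]), np.array([[0.5]]), np.array([[1.0]]), np.array([[1.0]])
Q, R, gamma = np.array([[10.0]]), np.array([[1.0]]), 0.8

# Augmented system: X = [x; r], X_next = T X + B1 u, tracking error e = C1 X
T = np.block([[A, np.zeros((1, 1))], [np.zeros((1, 1)), F]])
B1 = np.vstack([B, np.zeros((1, 1))])
C1 = np.hstack([C, -np.eye(1)])
Q1 = C1.T @ Q @ C1

# The learner only calls `step`; it never reads T or B1 directly.
K, H = q_learning_lqt(lambda X, u: T @ X + B1 @ u, Q1, R, gamma, n=2, m=1)
```

Under these assumptions, the learned gain K in u = -K X can be checked against the gain obtained by iterating the discounted augmented ARE with the true T and B1; the two should agree to numerical precision because the sampled data are noise-free.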

Recommended Citation

B. Kiumarsi et al., "Reinforcement Q-Learning for Optimal Tracking Control of Linear Discrete-Time Systems with Unknown Dynamics," Automatica, vol. 50, no. 4, pp. 1167-1175, Elsevier, Apr 2014.

The definitive version is available at https://doi.org/10.1016/j.automatica.2014.02.015

Department(s)

Electrical and Computer Engineering

Keywords and Phrases

Algebra; Difference Equations; Digital Control Systems; Discrete Time Control Systems; Iterative Methods; Navigation; Reinforcement Learning; Riccati Equations; Algebraic Riccati Equations; Discrete Time System; Linear Discrete-Time Systems; Linear Quadratic Trackers; Optimal Tracking Control; Policy Iteration; Q-Learning Algorithms; Reference Trajectories; Learning Algorithms; Algebraic Riccati Equation; Linear Quadratic Tracker

International Standard Serial Number (ISSN)

0005-1098

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

Publication Date

01 Apr 2014
