Electrical and Computer Engineering Faculty Research & Creative Works

Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems using Off-Policy Reinforcement Learning

Hamidreza Modares, Missouri University of Science and Technology
Frank L. Lewis
Zhong-Ping Jiang

Abstract

A model-free off-policy reinforcement learning (RL) algorithm is developed to learn the optimal output-feedback (OPFB) solution for linear continuous-time systems. The proposed algorithm applies to the design of optimal OPFB controllers for both regulation and tracking problems. To provide a unified framework for both, a discounted performance function is employed and a discounted algebraic Riccati equation (ARE) is derived whose solution solves the problem. Conditions for the existence of a solution to the discounted ARE are provided, and an upper bound on the discount factor is found that assures the stability of the optimal control solution. To develop an optimal OPFB controller, it is first shown that the system state can be constructed from a limited number of observations of the system output over a period of the system's history. A Bellman equation is then developed to evaluate a control policy and find an improved policy simultaneously, using only these limited output observations. Using this Bellman equation, a model-free off-policy RL-based OPFB controller is developed that requires knowledge of neither the system state nor the system dynamics. It is shown that the proposed OPFB method is more powerful than static OPFB because it is equivalent to a state-feedback control policy. The proposed method is successfully used to solve a regulation problem and a tracking problem.
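The role of the discount factor mentioned in the abstract can be illustrated with a small scalar sketch. This is not the paper's model-free algorithm: it assumes a known plant dx/dt = a x + b u, a cost with integrand e^{-gamma t}(q x^2 + r u^2), and the commonly used discounted ARE form 2aP - gamma P + q - (b^2/r) P^2 = 0, none of which is stated in this record. The function names and the numeric plant are hypothetical, and the bound computed here is derived only for this scalar case.

```python
import math


def discounted_are_scalar(a, b, q, r, gamma):
    """Positive root P of 2*a*P - gamma*P + q - (b**2 / r) * P**2 = 0."""
    s = b * b / r                      # quadratic coefficient b^2 / r
    shifted = 2.0 * a - gamma          # discounting shifts 2a to 2a - gamma
    return (shifted + math.sqrt(shifted * shifted + 4.0 * s * q)) / (2.0 * s)


def closed_loop_pole(a, b, q, r, gamma):
    """Pole of dx/dt = (a - b*K) x under the gain K = b*P/r."""
    p = discounted_are_scalar(a, b, q, r, gamma)
    return a - b * (b * p / r)


def discount_upper_bound(a, b, q, r):
    """Largest gamma keeping the closed loop stable in this scalar sketch."""
    # Assuming q > 0, stability holds iff a*gamma < a**2 + b**2*q/r,
    # so there is no bound when a <= 0.
    return math.inf if a <= 0 else a + b * b * q / (a * r)


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 1.0, 1.0   # hypothetical unstable plant
    print("bound on gamma:", discount_upper_bound(a, b, q, r))
    for gamma in (0.0, 1.0, 3.0):
        print(gamma, closed_loop_pole(a, b, q, r, gamma))
```

In this sketch the closed-loop pole is negative for discount factors below the bound and positive above it, which is the kind of behavior that motivates bounding the discount factor.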

Recommended Citation

H. Modares et al., "Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems using Off-Policy Reinforcement Learning," IEEE Transactions on Cybernetics, vol. 46, no. 11, pp. 2401-2410, Institute of Electrical and Electronics Engineers (IEEE), Nov 2016.

The definitive version is available at https://doi.org/10.1109/TCYB.2015.2477810

Department(s)

Electrical and Computer Engineering

Keywords and Phrases

Continuous Time Systems; Controllers; Dynamic Programming; Feedback Control; Learning Algorithms; Linear Systems; Nonlinear Control Systems; Reinforcement Learning; Riccati Equations; State Feedback; Algebraic Riccati Equations; Continuous-Time Linear Systems; Linear Continuous-Time System; Optimal Control Solution; Optimal Controls; Optimal Output Feedback Control; Output Data; Output Feedback; Feedback; Measured Output Data; Off-Policy Reinforcement Learning; Optimal Control; Output Feedback (OPFB)

International Standard Serial Number (ISSN)

2168-2267

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Publication Date

01 Nov 2016
