Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems using Off-Policy Reinforcement Learning
A model-free off-policy reinforcement learning (RL) algorithm is developed to learn the optimal output-feedback (OPFB) solution for linear continuous-time systems. The proposed algorithm applies to the design of optimal OPFB controllers for both regulation and tracking problems. To provide a unified framework for optimal regulation and tracking, a discounted performance function is employed, and a discounted algebraic Riccati equation (ARE) is derived whose solution gives the optimal control. Conditions for the existence of a solution to the discounted ARE are provided, and an upper bound on the discount factor is found that ensures the stability of the optimal control solution. To develop an optimal OPFB controller, it is first shown that the system state can be reconstructed from a limited number of observations of the system output over a period of the system's history. A Bellman equation is then developed that evaluates a control policy and finds an improved policy simultaneously, using only these limited output observations. Using this Bellman equation, a model-free off-policy RL-based OPFB controller is developed that requires knowledge of neither the system state nor the system dynamics. It is shown that the proposed OPFB method is more powerful than static OPFB, as it is equivalent to a state-feedback control policy. The proposed method is successfully used to solve a regulation problem and a tracking problem.
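The role of the discounted ARE can be illustrated with a small model-based sketch. The abstract does not state the equation, so the form used here is an assumption: for a cost of the form integral of e^{-gamma (tau - t)} (x'Qx + u'Ru) d tau, the discounted ARE is commonly written as A'P + PA - gamma P + Q - P B R^{-1} B'P = 0, which is a standard ARE for the shifted matrix A - (gamma/2) I. The system matrices, weights, and discount factor below are hypothetical values chosen only for illustration, and the function name solve_discounted_are is not from the paper. Unlike the model-free algorithm described above, this sketch uses full knowledge of A and B.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def solve_discounted_are(A, B, Q, R, gamma):
    """Solve A'P + PA - gamma*P + Q - P B R^{-1} B'P = 0 (assumed form).

    The discount term is absorbed by shifting A to A - (gamma/2) I,
    which turns the problem into a standard continuous-time ARE.
    """
    n = A.shape[0]
    A_shift = A - 0.5 * gamma * np.eye(n)
    P = solve_continuous_are(A_shift, B, Q, R)
    K = np.linalg.solve(R, B.T @ P)  # state-feedback gain, u = -K x
    return P, K


# Hypothetical second-order example (not taken from the paper).
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
gamma = 0.1  # hypothetical discount factor

P, K = solve_discounted_are(A, B, Q, R, gamma)

# Residual of the assumed discounted ARE; should be close to zero.
residual = A.T @ P + P @ A - gamma * P + Q - P @ B @ np.linalg.solve(R, B.T @ P)

# The standard ARE guarantees that A - (gamma/2) I - B K is stable, so the
# eigenvalues of A - B K only have real parts below gamma/2. Whether they are
# actually negative depends on the discount factor and the system.
closed_loop_eigs = np.linalg.eigvals(A - B @ K)

print("P =", P)
print("K =", K)
print("max |residual| =", np.abs(residual).max())
print("closed-loop eigenvalues =", closed_loop_eigs)
```

The shift makes the stability question visible: the discounted solution only guarantees closed-loop real parts below gamma/2, which is why a bound on the discount factor is needed for the optimal controller to stabilize the original system.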
H. Modares et al., "Optimal Output-Feedback Control of Unknown Continuous-Time Linear Systems using Off-Policy Reinforcement Learning," IEEE Transactions on Cybernetics, vol. 46, no. 11, pp. 2401-2410, Institute of Electrical and Electronics Engineers (IEEE), Nov 2016.
The definitive version is available at https://doi.org/10.1109/TCYB.2015.2477810
Electrical and Computer Engineering
Keywords and Phrases
Continuous Time Systems; Controllers; Dynamic Programming; Feedback Control; Learning Algorithms; Linear Systems; Nonlinear Control Systems; Reinforcement Learning; Riccati Equations; State Feedback; Algebraic Riccati Equations; Continuous-Time Linear Systems; Linear Continuous-Time System; Optimal Control Solution; Optimal Controls; Optimal Output Feedback Control; Output Data; Output Feedback; Feedback; Measured Output Data; Off-Policy Reinforcement Learning; Optimal Control; Output Feedback (OPFB)
International Standard Serial Number (ISSN)
Article - Journal
© 2016 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
01 Nov 2016