Electrical and Computer Engineering Faculty Research & Creative Works

A Policy Iteration Approach to Online Optimal Control of Continuous-Time Constrained-Input Systems

Hamidreza Modares, Missouri University of Science and Technology
Mohammad-Bagher Naghibi-Sistani
Frank L. Lewis

Abstract

This paper is an effort towards developing an online learning algorithm to find the optimal control solution for continuous-time (CT) systems subject to input constraints. The proposed method is based on the policy iteration (PI) technique, which has recently evolved as a major technique for solving optimal control problems. Although a number of online PI algorithms have been developed for CT systems, none of them takes into account the input constraints caused by actuator saturation. In practice, however, ignoring these constraints leads to performance degradation or even system instability. In this paper, to deal with the input constraints, a suitable nonquadratic functional is employed to encode the constraints into the optimization formulation. Then, the proposed PI algorithm is implemented on an actor-critic structure to solve the Hamilton-Jacobi-Bellman (HJB) equation associated with this nonquadratic cost functional in an online fashion. That is, two coupled neural network (NN) approximators, namely an actor and a critic, are tuned online and simultaneously to approximate the associated HJB solution and compute the optimal control policy. The critic is used to evaluate the cost associated with the current policy, while the actor is used to find an improved policy based on information provided by the critic. Convergence to a close approximation of the HJB solution, as well as stability of the proposed feedback control law, is shown. Simulation results of the proposed method on a nonlinear CT system illustrate the effectiveness of the proposed approach.
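The abstract does not state which nonquadratic functional is used. As an illustration only, the sketch below assumes a form that is common in the constrained-input literature: for a scalar input bounded by |u| < lam, the penalty U(u) = 2 * r * lam * integral from 0 to u of atanh(v / lam) dv. For dynamics of the assumed form dx/dt = f(x) + g(x) u, minimizing the Hamiltonian with this penalty gives a policy that is bounded by construction. The names `input_penalty`, `saturated_policy`, `lam`, `r`, `g`, and `grad_v` are hypothetical and do not come from the paper.

```python
import math


def input_penalty(u, lam, r=1.0):
    """Assumed nonquadratic input penalty for a scalar input.

    Closed form of U(u) = 2*r*lam * integral_0^u atanh(v/lam) dv,
    which is finite only for |u| < lam.
    """
    if abs(u) >= lam:
        raise ValueError("penalty is only finite for |u| < lam")
    return (2.0 * lam * r * u * math.atanh(u / lam)
            + lam ** 2 * r * math.log(1.0 - (u / lam) ** 2))


def saturated_policy(grad_v, g, lam, r=1.0):
    """Policy that minimizes the Hamiltonian under the assumed penalty.

    Setting dU/du + g * grad_v = 0 gives
    u = -lam * tanh(g * grad_v / (2 * lam * r)), so |u| never exceeds lam.
    """
    return -lam * math.tanh(g * grad_v / (2.0 * lam * r))


if __name__ == "__main__":
    lam = 0.5  # assumed actuator bound
    for grad_v in (0.1, 1.0, 100.0):
        u = saturated_policy(grad_v, g=1.0, lam=lam)
        print(f"grad_v={grad_v:7.1f}  u={u:+.4f}")
```

Under this assumption the penalty behaves like the quadratic cost r * u^2 for small inputs and grows without bound as |u| approaches lam, which is how the saturation limit is encoded in the cost rather than imposed as a separate hard constraint.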

Recommended Citation

H. Modares et al., "A Policy Iteration Approach to Online Optimal Control of Continuous-Time Constrained-Input Systems," ISA Transactions, vol. 52, no. 5, pp. 611-621, Elsevier, Sep 2013.

The definitive version is available at https://doi.org/10.1016/j.isatra.2013.04.004

Department(s)

Electrical and Computer Engineering

Keywords and Phrases

Hamilton-Jacobi-Bellman Equations; Input Constraints; Online Learning Algorithms; Optimal Control Solution; Optimal Controls; Optimization Formulations; Performance Degradation; Policy Iteration; Algorithms; Continuous Time Systems; Control; Iterative Methods; Neural Networks; Online Systems; Reinforcement Learning; System Stability; Optimal Control Systems; Optimal Control

International Standard Serial Number (ISSN)

0019-0578

Document Type

Article - Journal

Document Version

Citation

File Type

text

Language(s)

English

Rights

Publication Date

01 Sep 2013
