Safety-Aware Reinforcement Learning Framework with an Actor-Critic-Barrier Structure
Abstract
This paper considers the control problem in which the full state and the control input are constrained simultaneously. First, a novel barrier-function-based system transformation is developed to guarantee that the full-state constraints are satisfied. To handle input saturation, a hyperbolic-type penalty function is imposed on the control input. An actor-critic reinforcement learning technique is then combined with the barrier transformation to learn the optimal control policy subject to both the full-state constraints and the input saturation. Finally, a numerical simulation illustrates the efficacy of the approach.
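The abstract does not state the functional forms of the barrier transformation or the penalty, so the following Python sketch is only an illustration under assumptions. It assumes a logarithmic barrier that maps a state bounded in (a, A), with a < 0 < A, onto the whole real line, and a tanh-type (inverse hyperbolic tangent) integral penalty for a scalar input bounded by u_max. Both are forms commonly used in the constrained reinforcement learning literature; the function names and the weight r are hypothetical and are not taken from the paper.

```python
import math


def barrier(z, a, A):
    """Assumed log barrier: maps z in (a, A), a < 0 < A, onto the real line."""
    if not (a < 0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    if not (a < z < A):
        raise ValueError("z must lie strictly inside (a, A)")
    # Equals 0 at z = 0 and grows without bound as z approaches a or A.
    return math.log((A * (a - z)) / (a * (A - z)))


def barrier_inverse(y, a, A):
    """Inverse of the assumed barrier: maps a real y back into (a, A)."""
    e = math.exp(y)
    return a * A * (1.0 - e) / (A - a * e)


def saturation_penalty(u, u_max, r=1.0):
    """Assumed hyperbolic-type penalty for a scalar input with |u| < u_max.

    Closed form of 2 * r * integral from 0 to u of u_max * atanh(v / u_max) dv.
    """
    if abs(u) >= u_max:
        raise ValueError("u must satisfy |u| < u_max")
    ratio = u / u_max
    return 2.0 * r * u_max * (
        u * math.atanh(ratio) + 0.5 * u_max * math.log(1.0 - ratio ** 2)
    )


if __name__ == "__main__":
    a, A = -1.0, 2.0
    s = barrier(1.5, a, A)             # transformed (unconstrained) coordinate
    print(s, barrier_inverse(s, a, A))  # the inverse recovers 1.5
    print(saturation_penalty(0.5, u_max=1.0))
```

Under these assumptions, a learner can work in the unconstrained transformed coordinate while the inverse map keeps the original state inside its bounds, and the penalty grows steeply as the input approaches its limit.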
Recommended Citation
Y. Yang et al., "Safety-Aware Reinforcement Learning Framework with an Actor-Critic-Barrier Structure," Proceedings of the American Control Conference (2019, Philadelphia, PA), pp. 2352-2358, Institute of Electrical and Electronics Engineers (IEEE), Jul 2019.
The definitive version is available at https://doi.org/10.23919/ACC.2019.8815335
Meeting Name
2019 American Control Conference, ACC 2019 (2019: Jul. 10-12, Philadelphia, PA)
Department(s)
Electrical and Computer Engineering
Research Center/Lab(s)
Intelligent Systems Center
Second Research Center/Lab
Center for High Performance Computing Research
Keywords and Phrases
Full-State Constraints; Input Saturation; Reinforcement Learning; Safe Control
International Standard Book Number (ISBN)
978-153867926-5
International Standard Serial Number (ISSN)
0743-1619
Document Type
Article - Conference proceedings
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2019 American Automatic Control Council, All rights reserved.
Publication Date
01 Jul 2019
Comments
This work was supported in part by the Fundamental Research Funds for the China Central Universities of USTB under grant No. FRF-TP-18-031A1 and No. FRF-BD-17-002A, in part by the China Post-Doctoral Science Foundation under Grant 2018M641197, in part by the National Science Foundation under grant NSF CAREER CPS-1851588, in part by NATO under grant No. SPS G5176, in part by ONR Minerva under grant No. N00014-18-1-2160, in part by the Mary K. Finley Endowment, in part by the Missouri S&T Intelligent Systems Center and in part by the Army Research Laboratory under Cooperative Agreement Number W911NF-18-2-0260.