Optimal Controller Design for Control-Affine Stochastic Systems using Neural Networks and Path Integrals
Abstract
For deterministic nonlinear dynamical systems, approximate dynamic programming based on Pontryagin's maximum principle provides a systematic way to solve optimal control problems. In the presence of noise, however, this approach becomes cumbersome. Hence, current optimal control solution methodologies typically ignore the effect of noise in the adjoint equations. Alternatively, in the Hamilton-Jacobi-Bellman (HJB) framework, the presence of noise results in the second-order stochastic HJB equation. Furthermore, through a unique exponential transformation, the stochastic HJB equation of control-affine nonlinear stochastic systems with a quadratic control cost function can be transformed into a path integral. In this paper, an offline approximate dynamic programming approach using neural networks and path integrals is proposed for solving this class of finite-horizon stochastic optimal control problems. Simulation results using a Van der Pol oscillator model are presented to demonstrate the potential of the proposed approach.
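The exponential transformation mentioned in the abstract can be sketched as follows. This is the standard form found in the path-integral control literature, not necessarily the notation or exact formulation used in the paper. All symbols here are assumptions introduced for illustration: state x, control u, drift f, control matrix G, state cost q, terminal cost \phi, control weight R, noise covariance \Sigma_w, value function V, desirability function \Psi, and scalar \lambda. The sketch also assumes that the noise enters through the control channel.

```latex
% Assumed control-affine stochastic dynamics and finite-horizon cost
dx = f(x,t)\,dt + G(x,t)\,\bigl(u\,dt + dw\bigr), \qquad dw \sim \mathcal{N}(0,\ \Sigma_w\,dt)

J = \mathbb{E}\!\left[\phi(x_T) + \int_t^T \Bigl(q(x,s) + \tfrac{1}{2}\,u^\top R\,u\Bigr)\,ds\right]

% Second-order stochastic HJB equation
-\partial_t V = \min_u \Bigl[\, q + \tfrac{1}{2}\,u^\top R\,u + (\nabla_x V)^\top (f + G u)
                + \tfrac{1}{2}\,\mathrm{tr}\!\bigl(G \Sigma_w G^\top \nabla_x^2 V\bigr) \Bigr],
\qquad V(x,T) = \phi(x)

% Minimizing control and the resulting nonlinear PDE
u^* = -R^{-1} G^\top \nabla_x V

-\partial_t V = q + (\nabla_x V)^\top f
                - \tfrac{1}{2}\,(\nabla_x V)^\top G R^{-1} G^\top \nabla_x V
                + \tfrac{1}{2}\,\mathrm{tr}\!\bigl(G \Sigma_w G^\top \nabla_x^2 V\bigr)

% Exponential transformation; the quadratic terms in \nabla_x \Psi cancel
% when \lambda R^{-1} = \Sigma_w
V = -\lambda \log \Psi

-\partial_t \Psi = -\tfrac{1}{\lambda}\,q\,\Psi + f^\top \nabla_x \Psi
                   + \tfrac{1}{2}\,\mathrm{tr}\!\bigl(G \Sigma_w G^\top \nabla_x^2 \Psi\bigr),
\qquad \Psi(x,T) = \exp\!\bigl(-\phi(x)/\lambda\bigr)

% Feynman-Kac representation: expectation over uncontrolled paths
% dx = f\,dt + G\,dw started at x_t = x
\Psi(x,t) = \mathbb{E}\!\left[\exp\!\Bigl(-\tfrac{1}{\lambda}\Bigl(\phi(x_T)
            + \int_t^T q(x_s,s)\,ds\Bigr)\Bigr) \,\Big|\, x_t = x\right]
```

Under these assumptions the transformed equation is linear in \Psi, so its solution can be written as an expectation over sample paths of the uncontrolled system, which is the path integral referred to above.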
Recommended Citation
K. Rajagopal et al., "Optimal Controller Design for Control-Affine Stochastic Systems using Neural Networks and Path Integrals," Proceedings of the 2016 Annual American Control Conference (2016, Boston, MA), pp. 3032-3037, Institute of Electrical and Electronics Engineers (IEEE), Jul 2016.
The definitive version is available at https://doi.org/10.1109/ACC.2016.7525381
Meeting Name
2016 Annual American Control Conference, ACC (2016: Jul. 6-8, Boston, MA)
Department(s)
Mechanical and Aerospace Engineering
International Standard Book Number (ISBN)
978-1-4673-8682-1
International Standard Serial Number (ISSN)
2378-5861
Document Type
Article - Conference proceedings
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2016 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
01 Jul 2016
Comments
This research was partially supported by the Air Force Office of Scientific Research under Grant No. AFOSR FA9550-12-1-0397.