The overwhelming computational requirements of classical dynamic programming algorithms render them inapplicable to most practical stochastic problems. To overcome this limitation, this study describes a neural-network-based dynamic programming (DP) approach. The cost function, which is critical in a dynamic programming formulation, is approximated by a neural network whose weights are adjusted by an update rule based on temporal difference (TD) learning. A Lyapunov-based theory is developed to guarantee an upper bound on the error between the output of the cost neural network and the true cost. We illustrate the approach with a retailer inventory problem.
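The idea of learning a cost function from temporal differences can be illustrated with a minimal sketch. This is not the paper's method: a linear approximator stands in for the cost neural network, the update is the standard TD(0) rule rather than the authors' Lyapunov-based weight update, and the inventory model (capacity, unit costs, uniform demand, fixed order-up-to policy) is entirely hypothetical. All names and constants below are assumptions made for illustration.

```python
import random

# Hypothetical toy inventory model; none of these values come from the paper.
CAPACITY = 10                        # normalizing constant for the stock level
GAMMA = 0.9                          # discount factor
HOLD, SHORT, ORDER = 1.0, 4.0, 0.5   # unit holding, shortage, and ordering costs


def features(stock):
    """Polynomial features of the normalized stock level."""
    s = stock / CAPACITY
    return [1.0, s, s * s]


def cost_estimate(weights, stock):
    """Approximate cost-to-go as a weighted sum of the features."""
    return sum(w * f for w, f in zip(weights, features(stock)))


def step(stock, rng, order_up_to=6):
    """Simulate one period under a fixed order-up-to policy (assumed)."""
    order = max(0, order_up_to - stock)
    demand = rng.randint(0, 8)       # assumed uniform integer demand
    available = stock + order
    next_stock = max(0, available - demand)
    shortage = max(0, demand - available)
    cost = ORDER * order + HOLD * next_stock + SHORT * shortage
    return cost, next_stock


def td0_train(steps=20000, alpha=0.01, seed=0):
    """Fit the cost weights along one simulated trajectory with TD(0)."""
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0]
    stock = 0
    for _ in range(steps):
        cost, next_stock = step(stock, rng)
        # Temporal difference: one-step cost plus discounted next estimate,
        # minus the current estimate.
        td_error = (cost
                    + GAMMA * cost_estimate(weights, next_stock)
                    - cost_estimate(weights, stock))
        phi = features(stock)
        weights = [w + alpha * td_error * f for w, f in zip(weights, phi)]
        stock = next_stock
    return weights
```

Calling `td0_train()` returns three weights, and `cost_estimate(weights, stock)` then gives the approximate discounted cost of starting from a given stock level under the assumed policy. Replacing `features` and `cost_estimate` with a small neural network and its gradient would bring the sketch closer in spirit to the approach described above, though still without the error bound the paper develops.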
Z. Huang et al., "Stochastic Optimal Control with Neural Networks and Application to a Retailer Inventory Problem," Proceedings of the 44th IEEE Conference on Decision and Control, Institute of Electrical and Electronics Engineers (IEEE), Jan 2005.
The definitive version is available at https://doi.org/10.1109/CDC.2005.1582874
44th IEEE Conference on Decision and Control
Mechanical and Aerospace Engineering
Article - Conference proceedings
© 2005 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.