Heuristic Dynamic Programming for Mobile Robot Path Planning Based on Dyna Approach
This paper presents direct heuristic dynamic programming (HDP) based on Dyna planning (Dyna-HDP) for online model learning in a Markov decision process. The technique uses HDP policy learning to construct the Dyna agent, which shortens the learning time. We evaluate Dyna-HDP on a differential-drive wheeled mobile robot navigation problem in a 2D maze. Simulations compare Dyna-HDP with traditional reinforcement learning algorithms, namely one-step Q-learning, Sarsa(λ), and Dyna-Q, under the same benchmark conditions. We demonstrate that Dyna-HDP finds a near-optimal path faster than the other algorithms, with high stability. We also confirm that Dyna-HDP can be applied to a multi-robot path planning problem: a virtual common environment model is learned from the robots' shared experiences, which significantly reduces the learning time.
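For readers unfamiliar with the Dyna idea referred to in the abstract, the following is a minimal sketch of tabular Dyna-Q, one of the baseline algorithms named above, on a small grid maze. It is not the paper's Dyna-HDP method and does not model a differential-drive robot. The maze layout, reward scheme, hyperparameters, and the names `dyna_q`, `greedy_path`, and `step` are all illustrative assumptions. The sketch only shows how a learned environment model supplies extra "planning" updates between real steps, which is the mechanism Dyna uses to reduce learning time.

```python
import random

# Hypothetical 4x5 maze: '#' is a wall, every other cell is free.
MAZE = ["S..#.",
        ".#...",
        ".#.#.",
        "...#G"]
START, GOAL = (0, 0), (3, 4)
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(maze, state, action, goal):
    """Deterministic transition: blocked moves leave the state unchanged."""
    r, c = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    if not (0 <= r < len(maze) and 0 <= c < len(maze[0])) or maze[r][c] == "#":
        r, c = state
    nxt = (r, c)
    return (1.0 if nxt == goal else 0.0), nxt  # assumed reward: 1 at the goal


def dyna_q(maze, start, goal, episodes=50, planning_steps=20,
           alpha=0.5, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Dyna-Q; returns the learned Q-table as a dict."""
    rng = random.Random(seed)
    q = {}      # (state, action) -> estimated value
    model = {}  # (state, action) -> (reward, next_state), learned online

    def update(s, a, reward, nxt):
        # One-step Q-learning backup; the goal is terminal.
        future = 0.0 if nxt == goal else max(q.get((nxt, b), 0.0) for b in range(4))
        old = q.get((s, a), 0.0)
        q[(s, a)] = old + alpha * (reward + gamma * future - old)

    for _ in range(episodes):
        state = start
        for _ in range(10_000):  # safety cap on episode length
            if state == goal:
                break
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                values = [q.get((state, a), 0.0) for a in range(4)]
                best = max(values)
                action = rng.choice([a for a in range(4) if values[a] == best])
            reward, nxt = step(maze, state, action, goal)
            update(state, action, reward, nxt)       # learn from real experience
            model[(state, action)] = (reward, nxt)   # update the learned model
            seen = list(model)
            for _ in range(planning_steps):          # learn from simulated experience
                s, a = rng.choice(seen)
                update(s, a, *model[(s, a)])
            state = nxt
    return q


def greedy_path(q, maze, start, goal, max_len=50):
    """Follow the greedy policy from start; stops at the goal or after max_len moves."""
    path = [start]
    while path[-1] != goal and len(path) <= max_len:
        s = path[-1]
        a = max(range(4), key=lambda b: q.get((s, b), 0.0))
        path.append(step(maze, s, a, goal)[1])
    return path


q_table = dyna_q(MAZE, START, GOAL)
path = greedy_path(q_table, MAZE, START, GOAL)
print(len(path) - 1, "moves:", path)
```

Setting `planning_steps=0` in this sketch reduces it to plain one-step Q-learning, which gives a simple way to see the effect of the model-based updates on the number of episodes needed.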
D. S. Al and D. C. Wunsch, "Heuristic Dynamic Programming for Mobile Robot Path Planning Based on Dyna Approach," Proceedings of the International Joint Conference on Neural Networks (2016, Vancouver, Canada), Institute of Electrical and Electronics Engineers (IEEE), Jul 2016.
The definitive version is available at https://doi.org/10.1109/IJCNN.2016.7727679
International Joint Conference on Neural Networks (2016: Jul. 24-29, Vancouver, Canada)
Electrical and Computer Engineering
Center for High Performance Computing Research
Keywords and Phrases
Heuristic Programming; Learning Algorithms; Markov Processes; Mobile Robots; Motion Planning; Reinforcement Learning; Robot Programming; Robots; Direct Heuristic Dynamic Programming; Dyna; Heuristic Dynamic Programming; Mobile Robotic; Multi-Robot Path Planning; Q-Learning; Sarsa; Traditional Reinforcements; Dynamic Programming
Article - Conference proceedings
© 2016 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.