Heuristic Dynamic Programming for Mobile Robot Path Planning Based on Dyna Approach


This paper presents a direct heuristic dynamic programming (HDP) method based on Dyna planning (Dyna-HDP) for online model learning in a Markov decision process. The technique uses HDP policy learning to construct the Dyna agent, which shortens the learning time. We evaluate Dyna-HDP on a differential-drive wheeled mobile robot navigation problem in a 2D maze. Simulations compare Dyna-HDP with traditional reinforcement learning algorithms, namely one-step Q-learning, Sarsa(λ), and Dyna-Q, under the same benchmark conditions. We demonstrate that Dyna-HDP reaches a near-optimal path faster than the other algorithms, with high stability. In addition, we confirm that the Dyna-HDP method can be applied to a multi-robot path planning problem. A virtual common environment model is learned from the robots' shared experiences, which significantly reduces the learning time.

Meeting Name

International Joint Conference on Neural Networks (2016: Jul. 24-29, Vancouver, Canada)


Electrical and Computer Engineering

Research Center/Lab(s)

Center for High Performance Computing Research

Second Research Center/Lab

Intelligent Systems Center

Keywords and Phrases

Heuristic Programming; Learning Algorithms; Markov Processes; Mobile Robots; Motion Planning; Reinforcement Learning; Robot Programming; Robots; Direct Heuristic Dynamic Programming; Dyna; Heuristic Dynamic Programming; Mobile Robotic; Multi-Robot Path Planning; Q-Learning; Sarsa; Traditional Reinforcements; Dynamic Programming

Document Type

Article - Conference proceedings

© 2016 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.

Publication Date

01 Jul 2016