A new reinforcement learning algorithm with fixed exploration for semi-Markov decision processes
Keywords and Phrases
Artificial Intelligence; iSMART; Q-Learning; Reinforcement Learning; RSMART
"Artificial intelligence or machine learning techniques are currently being widely applied for solving problems within the field of data analytics. This work presents and demonstrates the use of a new machine learning algorithm for solving semi-Markov decision processes (SMDPs). SMDPs are encountered in the domain of Reinforcement Learning to solve control problems in discrete-event systems. The new algorithm developed here is called iSMART, an acronym for imaging Semi-Markov Average Reward Technique. The algorithm uses a constant exploration rate, unlike its precursor R-SMART, which required exploration decay. The major difference between R-SMART and iSMART is that the latter uses, in addition to the regular iterates of R-SMART, a set of so-called imaging iterates, which form an image of the regular iterates and allow iSMART to avoid exploration decay. The new algorithm is tested extensively on small-scale SMDPs and on large-scale problems from the domain of Total Productive Maintenance (TPM). The algorithm shows encouraging performance on all the cases studied"--Abstract, page iii.
Enke, David Lee, 1965-
Engineering Management and Systems Engineering
M.S. in Systems Engineering
Missouri University of Science and Technology
x, 41 pages
Note about bibliography
Includes bibliographical references (pages 38-40).
© 2017 Angelo Michael Encapera, All rights reserved.
Thesis - Open Access
Encapera, Angelo Michael, "A new reinforcement learning algorithm with fixed exploration for semi-Markov decision processes" (2017). Masters Theses. 7736.
Artificial Intelligence and Robotics Commons; Operations Research, Systems Engineering and Industrial Engineering Commons