Abstract
Reinforcement learning (RL) is a simulation-based technique for solving Markov decision processes whose transition probabilities are not easily obtainable or that have a very large number of states. We present an empirical study of (i) the effect of step sizes (learning rules) on the convergence of RL algorithms, (ii) the use of stochastic shortest paths in solving average-reward problems via RL, and (iii) the notion of survival probabilities (downside risk) in RL. We also study the impact of step sizes when function approximation is combined with RL. Our experiments yield some interesting insights that will be useful in practice when RL algorithms are implemented within simulators.
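The role of the step size is easiest to see in a standard tabular Q-learning update. The following is a minimal sketch (not the paper's experimental code) contrasting three commonly studied rules: a constant step size, the classic 1/k rule, and a generalized a/(b + k) rule. The function and parameter names (sample_transition, a, b) are placeholders for illustration; any simulator that returns a next state and reward for a state-action pair could be plugged in.

    import random

    def step_size_constant(k, alpha=0.1):
        """Constant step size: simple, but the iterates may keep oscillating."""
        return alpha

    def step_size_harmonic(k):
        """Classic 1/k rule, satisfying the Robbins-Monro conditions."""
        return 1.0 / k

    def step_size_generalized(k, a=150.0, b=300.0):
        """a/(b + k) rule: decays more slowly than 1/k in early iterations."""
        return a / (b + k)

    def q_learning(n_states, n_actions, sample_transition, step_size,
                   gamma=0.95, iterations=10000, epsilon=0.1):
        """Tabular Q-learning driven by a simulator.

        sample_transition(s, a) returns (next_state, reward); step_size(k)
        maps the visit count of a state-action pair to a learning rate.
        """
        Q = [[0.0] * n_actions for _ in range(n_states)]
        visits = [[0] * n_actions for _ in range(n_states)]
        s = 0
        for _ in range(iterations):
            # Epsilon-greedy action selection.
            if random.random() < epsilon:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[s][x])
            s_next, r = sample_transition(s, a)
            visits[s][a] += 1
            alpha = step_size(visits[s][a])
            # Standard Q-learning update; the step size governs how strongly
            # the sampled target overrides the current estimate.
            Q[s][a] += alpha * (r + gamma * max(Q[s_next]) - Q[s][a])
            s = s_next
        return Q

For instance, passing step_size=step_size_harmonic gives the textbook Robbins-Monro scheme, while the a/(b + k) rule keeps the step size larger for longer, a trade-off the paper examines empirically within simulators.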
Recommended Citation
A. Gosavi, "On Step Sizes, Stochastic Shortest Paths, and Survival Probabilities in Reinforcement Learning," Winter Simulation Conference, Institute of Electrical and Electronics Engineers (IEEE), Dec 2008.
The definitive version is available at https://doi.org/10.1109/WSC.2008.4736109
Department(s)
Engineering Management and Systems Engineering
Keywords and Phrases
Markov Processes; Function Approximation; Learning (Artificial Intelligence)
Document Type
Article - Conference proceedings
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2008 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
01 Dec 2008