Keywords and Phrases
Actor-Critic; Adaptive-Critic; Airline Revenue Management; Reinforcement Learning
"This thesis presents a new actor-critic algorithm from the domain of reinforcement learning to solve Markov and semi-Markov decision processes (or problems) in the field of airline revenue management (ARM). The ARM problem is one of control optimization in which a decision-maker must accept or reject a customer based on a requested fare. This thesis focuses on the so-called single-leg version of the ARM problem, which can be cast as a semi-Markov decision process (SMDP). Large-scale Markov decision processes (MDPs) and SMDPs suffer from the curses of dimensionality and modeling, making it difficult to create the transition probability matrices (TPMs) necessary to solve them using traditional methods such as dynamic and linear programming. This thesis seeks to employ an actor-critic algorithm to overcome the challenges found in developing TPMs for large-scale real-world problems. Unlike traditional actor-critic algorithms, where the values of the so-called actor can either become very large or very small, the algorithm developed in this thesis has an updating mechanism that keeps the values of the actor"s iterates bounded in the limit and significantly smaller in magnitude than previous actor-critic algorithms. This allows the algorithm to explore the state space fully and perform better than its traditional counterpart. Numerical experiments conducted show encouraging results with the new algorithm by delivering optimal results on small case MDPs and SMDPs and consistently outperforming an airline industry heuristic, namely EMSR-b, on large-scale ARM problems"--Abstract, page iii.
Murray, Susan L.
Engineering Management and Systems Engineering
M.S. in Systems Engineering
Missouri University of Science and Technology. Intelligent Systems Center
Intelligent Systems Center
Missouri University of Science and Technology
vii, 49 pages
© 2017 Ryan Jacob Lawhead, All rights reserved.
Thesis - Open Access
Electronic OCLC #
Lawhead, Ryan Jacob, "A bounded actor-critic algorithm for reinforcement learning" (2017). Masters Theses. 7740.