Masters Theses
Keywords and Phrases
Actor-Critic; Adaptive-Critic; Airline Revenue Management; Reinforcement Learning
Abstract
"This thesis presents a new actor-critic algorithm from the domain of reinforcement learning to solve Markov and semi-Markov decision processes (or problems) in the field of airline revenue management (ARM). The ARM problem is one of control optimization in which a decision-maker must accept or reject a customer based on a requested fare. This thesis focuses on the so-called single-leg version of the ARM problem, which can be cast as a semi-Markov decision process (SMDP). Large-scale Markov decision processes (MDPs) and SMDPs suffer from the curses of dimensionality and modeling, making it difficult to create the transition probability matrices (TPMs) necessary to solve them using traditional methods such as dynamic and linear programming. This thesis seeks to employ an actor-critic algorithm to overcome the challenges found in developing TPMs for large-scale real-world problems. Unlike traditional actor-critic algorithms, where the values of the so-called actor can either become very large or very small, the algorithm developed in this thesis has an updating mechanism that keeps the values of the actor's iterates bounded in the limit and significantly smaller in magnitude than previous actor-critic algorithms. This allows the algorithm to explore the state space fully and perform better than its traditional counterpart. Numerical experiments conducted show encouraging results with the new algorithm by delivering optimal results on small case MDPs and SMDPs and consistently outperforming an airline industry heuristic, namely EMSR-b, on large-scale ARM problems"--Abstract, page iii.
Advisor(s)
Gosavi, Abhijit
Murray, Susan L.
Committee Member(s)
Sun, Zeyi
Department(s)
Engineering Management and Systems Engineering
Degree Name
M.S. in Systems Engineering
Sponsor(s)
Missouri University of Science and Technology. Intelligent Systems Center
Research Center/Lab(s)
Intelligent Systems Center
Publisher
Missouri University of Science and Technology
Publication Date
Summer 2017
Pagination
vii, 49 pages
Note about bibliography
Includes bibliographical references (pages 46-48).
Rights
© 2017 Ryan Jacob Lawhead, All rights reserved.
Document Type
Thesis - Open Access
File Type
text
Language
English
Thesis Number
T 11347
Electronic OCLC #
1041856644
Recommended Citation
Lawhead, Ryan Jacob, "A bounded actor-critic algorithm for reinforcement learning" (2017). Masters Theses. 7740.
https://scholarsmine.mst.edu/masters_theses/7740
Included in
Artificial Intelligence and Robotics Commons, Operations Research, Systems Engineering and Industrial Engineering Commons