Keywords and Phrases

Actor-Critic; Adaptive-Critic; Airline Revenue Management; Reinforcement Learning

Abstract

"This thesis presents a new actor-critic algorithm from the domain of reinforcement learning to solve Markov and semi-Markov decision processes (or problems) in the field of airline revenue management (ARM). The ARM problem is one of control optimization in which a decision-maker must accept or reject a customer based on a requested fare. This thesis focuses on the so-called single-leg version of the ARM problem, which can be cast as a semi-Markov decision process (SMDP). Large-scale Markov decision processes (MDPs) and SMDPs suffer from the curses of dimensionality and modeling, making it difficult to create the transition probability matrices (TPMs) necessary to solve them using traditional methods such as dynamic and linear programming. This thesis seeks to employ an actor-critic algorithm to overcome the challenges found in developing TPMs for large-scale real-world problems. Unlike traditional actor-critic algorithms, where the values of the so-called actor can either become very large or very small, the algorithm developed in this thesis has an updating mechanism that keeps the values of the actor's iterates bounded in the limit and significantly smaller in magnitude than previous actor-critic algorithms. This allows the algorithm to explore the state space fully and perform better than its traditional counterpart. Numerical experiments conducted show encouraging results with the new algorithm by delivering optimal results on small case MDPs and SMDPs and consistently outperforming an airline industry heuristic, namely EMSR-b, on large-scale ARM problems"--Abstract, page iii.

Advisor(s)

Gosavi, Abhijit
Murray, Susan L.

Committee Member(s)

Sun, Zeyi

Department(s)

Engineering Management and Systems Engineering

Degree Name

M.S. in Systems Engineering

Sponsor(s)

Missouri University of Science and Technology. Intelligent Systems Center

Research Center/Lab(s)

Intelligent Systems Center

Publisher

Missouri University of Science and Technology

Publication Date

Summer 2017

Pagination

vii, 49 pages

Note about bibliography

Includes bibliographical references (pages 46-48).

Document Type

Thesis - Open Access

File Type

text

Language

English

Thesis Number

T 11347

Electronic OCLC #

1041856644

Recommended Citation

Lawhead, Ryan Jacob, "A bounded actor-critic algorithm for reinforcement learning" (2017). Masters Theses. 7740.
https://scholarsmine.mst.edu/masters_theses/7740

Included in

Artificial Intelligence and Robotics Commons, Operations Research, Systems Engineering and Industrial Engineering Commons
