A Bounded Actor-Critic Reinforcement Learning Algorithm Applied to Airline Revenue Management
Abstract
Reinforcement Learning (RL) is an artificial intelligence technique used to solve Markov and semi-Markov decision processes. Actor critics form a major class of RL algorithms that suffer from a critical deficiency: the values of the so-called actor can become very large, causing computer overflow. Hence, in practice, one has to artificially constrain these values via a projection and, at times, also use temperature-reduction tuning parameters in the popular Boltzmann action-selection scheme to make the algorithm deliver acceptable results. This artificial bounding and temperature reduction, however, do not allow for full exploration of the state space, which often leads to sub-optimal solutions on large-scale problems. We propose a new actor-critic algorithm in which (i) the actor's values remain bounded without any projection and (ii) no temperature-reduction tuning parameter is needed. The algorithm also represents a significant improvement over a recent version in the literature, where, although the values remain bounded, they usually become very large in magnitude, necessitating the use of a temperature-reduction parameter. Our new algorithm is tested on an important problem in an area of management science known as airline revenue management, where the state space is very large. The algorithm delivers encouraging computational behavior, outperforming a well-known industrial heuristic called EMSR-b on industrial data.
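To make the deficiency concrete, the sketch below (not the paper's algorithm; an illustrative assumption) shows standard Boltzmann action selection: naively computing exp(value/T) overflows once actor values grow large, which is why conventional actor critics must project the values or anneal the temperature. The max-subtraction trick used here is a common numerical safeguard, not the bounding mechanism proposed in the paper.

```python
import math
import random

def boltzmann_probs(actor_values, temperature=1.0):
    """Boltzmann (softmax) probabilities over actions.

    A naive math.exp(v / temperature) raises OverflowError once
    actor values grow large; subtracting the maximum scaled value
    first keeps every exponent <= 0 and thus finite.
    """
    scaled = [v / temperature for v in actor_values]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def select_action(actor_values, temperature=1.0, rng=random):
    """Sample an action index according to Boltzmann probabilities."""
    probs = boltzmann_probs(actor_values, temperature)
    r = rng.random()
    cum = 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1  # guard against floating-point round-off
```

For example, `boltzmann_probs([1000.0, 999.0])` succeeds here, whereas the naive `math.exp(1000.0)` overflows; lowering the temperature sharpens the distribution toward the greedy action, which is exactly the tuning the proposed algorithm seeks to avoid.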
Recommended Citation
R. J. Lawhead and A. Gosavi, "A Bounded Actor-Critic Reinforcement Learning Algorithm Applied to Airline Revenue Management," Engineering Applications of Artificial Intelligence, vol. 82, pp. 252-262, Elsevier Ltd, Jun 2019.
The definitive version is available at https://doi.org/10.1016/j.engappai.2019.04.008
Department(s)
Engineering Management and Systems Engineering
Keywords and Phrases
Actor critics; Airline revenue management; Reinforcement learning
International Standard Serial Number (ISSN)
0952-1976
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2019 Elsevier Ltd. All rights reserved.
Publication Date
01 Jun 2019
Comments
The authors would like to gratefully acknowledge the Intelligent Systems Cluster at Missouri University of Science and Technology, United States for partially funding this research.