Abstract

Actor-critic algorithms are among the most well-studied reinforcement learning algorithms that can be used to solve Markov decision processes (MDPs) via simulation. Unfortunately, the parameters of the so-called "actor" in the classical actor-critic algorithm exhibit great volatility, often becoming unbounded in practice, so they must be artificially constrained to obtain solutions. The algorithm is often used in conjunction with Boltzmann action selection, where a temperature parameter may be needed to make the algorithm work, but its convergence has only been proved when the temperature equals 1. We propose a new actor-critic algorithm whose actor parameters remain bounded. We present a mathematical proof of the boundedness and test our algorithm on small-scale MDPs under the infinite-horizon discounted-reward criterion. Our algorithm produces encouraging numerical results.
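
For readers unfamiliar with the setting the abstract describes, the sketch below shows a generic tabular actor-critic with Boltzmann (softmax) action selection on a toy discounted MDP. It is not the bounded algorithm proposed in the paper; the two-state MDP, step sizes, and temperature value are all hypothetical and chosen only to illustrate the classical scheme in which the actor parameters can drift without bound.

import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 2, 2
gamma = 0.95          # discount factor
tau = 1.0             # Boltzmann temperature (convergence proofs typically assume 1)
alpha_actor = 0.01    # actor step size
alpha_critic = 0.05   # critic step size

# Hypothetical transition probabilities P[s, a, s'] and rewards R[s, a]
P = np.array([[[0.7, 0.3], [0.4, 0.6]],
              [[0.2, 0.8], [0.9, 0.1]]])
R = np.array([[1.0, 0.5],
              [0.0, 2.0]])

theta = np.zeros((n_states, n_actions))  # actor parameters (unbounded in the classical scheme)
V = np.zeros(n_states)                   # critic's state-value estimates

def boltzmann(prefs, temperature):
    """Softmax distribution over actions at the given temperature."""
    z = prefs / temperature
    z -= z.max()                          # shift for numerical stability
    p = np.exp(z)
    return p / p.sum()

s = 0
for step in range(50_000):
    probs = boltzmann(theta[s], tau)
    a = rng.choice(n_actions, p=probs)
    s_next = rng.choice(n_states, p=P[s, a])
    r = R[s, a]

    # Critic: TD(0) update of the state-value estimate
    delta = r + gamma * V[s_next] - V[s]
    V[s] += alpha_critic * delta

    # Actor: policy-gradient-style update of the action preferences;
    # without projection or bounding, theta can grow without limit
    grad = -probs
    grad[a] += 1.0
    theta[s] += alpha_actor * delta * grad

    s = s_next

print("Learned action preferences (theta):\n", theta)
print("Critic values (V):", V)

The classical remedy, as the abstract notes, is to project theta back onto an artificially chosen compact set; the paper's contribution is an algorithm that keeps the actor parameters bounded without such a projection.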

Department(s)

Engineering Management and Systems Engineering

Publication Status

Open Access

Keywords and Phrases

Actor critics; Adaptive critics; Boundedness; Reinforcement learning; Stochastic policy search

International Standard Serial Number (ISSN)

1877-0509

Document Type

Article - Conference proceedings

Document Version

Final Version

File Type

text

Language(s)

English

Rights

© 2024 Elsevier. All rights reserved.

Creative Commons Licensing

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Publication Date

01 Jan 2014
