Abstract

The variance-penalized metric in Markov decision processes (MDPs) seeks to maximize the average reward minus a scalar times the variance of rewards. In this paper, our goal is to study the same metric in the context of the semi-Markov decision process (SMDP). In the SMDP, unlike the MDP, the time spent in each transition is not identical and may in fact be a random variable. We first develop an expression for the variance of rewards in SMDPs, and then formulate the variance-penalized SMDP (VP-SMDP). Our interest here is in solving the problem without generating the underlying transition probabilities of the Markov chains. We propose the use of two stochastic search techniques, namely simultaneous perturbation and learning automata, to solve the problem; these techniques use stochastic policies and can be used within simulators, thereby avoiding the generation of the transition probabilities. © 2011 IEEE.
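The simultaneous-perturbation idea mentioned in the abstract can be illustrated with a minimal sketch. The toy two-action model below, the penalty weight `KAPPA`, the sigmoid policy parameterization, and all numeric values are illustrative assumptions, not taken from the paper; a closed-form objective stands in for what would, in the paper's setting, be estimated by simulating the SMDP without its transition probabilities.

```python
import math
import random

# Toy two-action model (assumed for illustration):
#   action A: reward 1 with zero variance
#   action B: reward with mean 2 and variance 4
# A stochastic policy chooses A with probability p = sigmoid(theta).
KAPPA = 0.5  # variance-penalty weight (assumed value)

def sigmoid(theta):
    return 1.0 / (1.0 + math.exp(-theta))

def objective(theta):
    """Variance-penalized objective J(theta) = E[r] - KAPPA * Var[r]."""
    p = sigmoid(theta)
    mean_r = p * 1.0 + (1.0 - p) * 2.0             # E[r]
    second = p * 1.0**2 + (1.0 - p) * (4.0 + 2.0**2)  # E[r^2] = Var + mean^2 per action
    var_r = second - mean_r**2
    return mean_r - KAPPA * var_r

def spsa_maximize(iters=1000, a=2.0, c=0.1, seed=0):
    """One-parameter SPSA ascent: only two objective evaluations per step,
    regardless of the number of parameters."""
    rng = random.Random(seed)
    theta = 0.0
    for _ in range(iters):
        delta = rng.choice([-1.0, 1.0])  # Rademacher perturbation direction
        g = (objective(theta + c * delta)
             - objective(theta - c * delta)) / (2.0 * c * delta)
        theta += a * g                    # gradient-ascent step
    return theta

theta = spsa_maximize()
print(sigmoid(theta))  # probability assigned to the low-variance action
```

With this penalty weight, the optimizer drifts toward the deterministic, low-variance action even though the risky action has a higher mean reward, which is the trade-off the variance-penalized metric encodes.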

Department(s)

Engineering Management and Systems Engineering

International Standard Book Number (ISBN)

978-145772108-3

International Standard Serial Number (ISSN)

0891-7736

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2024 Institute of Electrical and Electronics Engineers, All rights reserved.

Publication Date

01 Dec 2011
