Abstract

"The semi-Markov decision process (SMDP) is a variant of the Markov decision process (MDP). This dissertation work focuses on the application of SMDPs to disaster response management and to maintenance management. Average and discounted reward are two popular performance metrics for MDPs/SMDPs. While both dynamic programming (DP) methods, i.e., value iteration and policy iteration, are commonly used to solve MDPs/SMDPs, value iteration is easier to apply than policy iteration. The existing value iteration algorithms for average reward SMDPs have some noteworthy limitations, which are sought to be overcome in this work. Reinforcement learning (RL) techniques, which are also studied in this work, are used when DP methods break down due to the curse of dimensionality. The work in this dissertation is divided into two essays.

The first essay is on disaster response management. A comprehensive risk-based emergency model for a post-earthquake scenario, which includes domino-effect phenomena and is based on SMDPs, is developed. The goal is to minimize the rate of risk posed to the people affected after an earthquake. A value iteration algorithm for SMDPs, based on the stochastic shortest path approach, is developed as a solution technique. The proposed algorithm overcomes the limitations of the existing value iteration algorithms. Numerical results generated by the proposed algorithm are very encouraging. Convergence for the algorithm also has been established.

In the second essay, a new DP algorithm based on value iteration and two new RL algorithms (i-SMART and a model-building adaptive critic) are proposed. The new algorithms are used to solve a variety of preventive maintenance (PM) problems and generate encouraging computational results. Scheduling the time interval for PM is very crucial in a total productive maintenance program. Further, the proposed DP algorithm overcomes the limitations of the existing value iteration algorithms"--Abstract, page iii.

Advisor(s)

Gosavi, Abhijit

Committee Member(s)

Murray, Susan L.
Qin, Ruwen
Guardiola, Ivan
Le, Vy Khoi

Department(s)

Engineering Management and Systems Engineering

Degree Name

Ph. D. in Engineering Management

Publisher

Missouri University of Science and Technology

Publication Date

2013

Pagination

xi, 101 pages

Note about bibliography

Includes bibliographical references (pages 94-100).

Document Type

Dissertation - Open Access

File Type

text

Language

English

Subject Headings

Dynamic programming
Reinforcement learning
Emergency management -- Mathematical models
Maintenance -- Mathematical models

Thesis Number

T 10651

Print OCLC #

922574026

Electronic OCLC #

922574385

Recommended Citation

Ghosh, Shuva, "Two essays on dynamic programming and reinforcement learning" (2013). Doctoral Dissertations. 2431.
https://scholarsmine.mst.edu/doctoral_dissertations/2431

Included in

Operations Research, Systems Engineering and Industrial Engineering Commons
