Convergence of Recurrent Neuro-Fuzzy Value-Gradient Learning with and Without an Actor
In recent years, a gradient form of $n$-step temporal-difference [TD(λ)] learning has been developed into an advanced adaptive dynamic programming (ADP) algorithm called value-gradient learning [VGL(λ)]. In this paper, we improve on the VGL(λ) architecture with a variant called the 'single adaptive critic network [SNVGL(λ)]' because it has only a single approximator network (a critic) instead of the dual networks (critic and actor) used in VGL(λ). SNVGL(λ) therefore has lower computational requirements than VGL(λ). Moreover, a recurrent hybrid neuro-fuzzy (RNF) network and a first-order Takagi-Sugeno RNF (TSRNF) network are derived and implemented to build the critic and actor networks. Furthermore, we develop novel theoretical convergence proofs for both VGL(λ) and SNVGL(λ) under certain conditions. A mobile-robot simulation model (model based) is used to solve the optimal control problem for affine nonlinear discrete-time systems. The mobile robot is exposed to various noise levels to verify performance and to validate the theoretical analysis.
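For readers unfamiliar with the eligibility-trace parameter λ that VGL(λ) inherits from TD(λ), the following is a minimal tabular TD(λ) sketch on a classical random-walk chain. This is background only, not the paper's neuro-fuzzy value-gradient algorithm; the chain size, step size, and λ value are illustrative choices, not taken from the paper.

```python
import numpy as np

# Minimal tabular TD(lambda) with accumulating eligibility traces on a
# 5-state random-walk chain (background for the lambda in VGL(lambda);
# NOT the paper's neuro-fuzzy value-gradient algorithm).

rng = np.random.default_rng(0)
n_states = 5                       # interior states 1..5
V = np.zeros(n_states + 2)         # value table; V[0] and V[6] are terminal
alpha, gamma, lam = 0.1, 1.0, 0.8  # illustrative hyperparameters

for episode in range(2000):
    e = np.zeros_like(V)           # eligibility trace vector, reset per episode
    s = n_states // 2 + 1          # start in the middle of the chain
    while 0 < s < n_states + 1:
        s_next = s + rng.choice([-1, 1])          # unbiased random walk
        r = 1.0 if s_next == n_states + 1 else 0.0  # reward only at right end
        delta = r + gamma * V[s_next] - V[s]      # one-step TD error
        e[s] += 1.0                # accumulating trace for the visited state
        V += alpha * delta * e     # every state updated in proportion to trace
        e *= gamma * lam           # traces decay by gamma * lambda
        s = s_next

# True values for this chain are i/6 for interior states i = 1..5.
print(np.round(V[1:-1], 2))
```

The trace decay `gamma * lam` is what interpolates between one-step TD (λ = 0) and Monte Carlo returns (λ = 1); VGL(λ) applies the same interpolation idea to learned value gradients rather than values.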
S. Al-Dabooni and D. C. Wunsch, "Convergence of Recurrent Neuro-Fuzzy Value-Gradient Learning with and Without an Actor," IEEE Transactions on Fuzzy Systems, vol. 28, no. 4, pp. 658-672, Institute of Electrical and Electronics Engineers (IEEE), Apr. 2020.
The definitive version is available at https://doi.org/10.1109/TFUZZ.2019.2912349
Electrical and Computer Engineering
Center for High Performance Computing Research
Keywords and Phrases
Adaptive Dynamic Programming (ADP); Convergence Analysis; Eligibility Traces; Mobile Robot; Recurrent Neuro-Fuzzy (RNF); Takagi-Sugeno (T-S) Neuro-Fuzzy
International Standard Serial Number (ISSN)
Article - Journal
© 2020 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
01 Apr 2020