Enhancing Supervisory Training Signals with Environmental Reinforcement Learning Using Adaptive Dynamic Programming
Department
Mechanical and Aerospace Engineering
Major
Aerospace Engineering
Research Advisor
Wunsch, Donald C.
Advisor's Department
Electrical and Computer Engineering
Abstract
A method for hybridizing supervised learning with adaptive dynamic programming was developed to increase the speed, quality, and robustness of on-line neural network learning from an imperfect teacher. Reinforcement learning is used to modify and enhance the original supervisory signal before learning occurs. This research describes the method of hybridization and presents a model problem in which a human supervisor teaches a simulated car to drive around a race track. Simulation results show successful learning and improvements in convergence time, error rate, and stability over either component method alone.
Biography
Niklas Melton is a senior aerospace engineering major from Kansas City, MO. Since coming to MS&T, he has continued to develop and refine his technical interests and skills with a focus on biologically inspired technologies. He is a Student Council representative for the iGEM student design team and is an active member of the Applied Computational Intelligence Lab, where he researches controls applications of neural networks.
Research Category
Engineering
Presentation Type
Oral Presentation
Document Type
Presentation
Award
Engineering oral presentation, First place
Location
Turner Room
Presentation Date
11 Apr 2016, 10:40 am - 11:00 am