Doctoral Dissertations

Keywords and Phrases

Big Data; Classification; Deep Learning

Abstract

"In this digital age, big-data sets are commonly found in the field of healthcare, manufacturing and others where sustainable analysis is necessary to create useful information. Big-data sets are often characterized by high-dimensionality and massive sample size. High dimensionality refers to the presence of unwanted dimensions in the data where challenges such as noise, spurious correlation and incidental endogeneity are observed. Massive sample size, on the other hand, introduces the problem of heterogeneity because complex and unstructured data types must analyzed. To mitigate the impact of these challenges while considering the application of classification, a two step analysis approach is introduced where the first step is that of dimension-reduction and the second step is classification using deep neural networks (DNN).

First, multi-step dimension-reduction approaches are developed where both linear and nonlinear relationships among dimensions are considered. The dimensions are first grouped, and each group is then transformed. These two stages are repeated for multiple steps to achieve sufficient dimension-reduction. Novel singular value-based criteria are defined to create groupings, determine the number of steps, and control information loss. Mitigation of noisy dimensions with reduced computational complexity is demonstrated.

Subsequently, in the second part of the dissertation, it is shown that heterogeneity can increase generalization error in DNN-based classification. To mitigate this effect, novel learning frameworks are developed where generalization cost is approximated and then minimized during learning. A direct error-driven learning scheme and an additional variables-based distributed learning regime are introduced to train the DNN in the presence of big-data challenges. Efficient DNN learning is demonstrated while mitigating vanishing gradients and challenges due to sparse non-convex optimization. Simulation results using several big-data sets validate theoretical results"--Abstract, page iv.

Advisor(s)

Sarangapani, Jagannathan, 1965-

Committee Member(s)

Wunsch, Donald C.
Zawodniok, Maciej Jan, 1975-
Stanley, R. Joe
Samaranayake, V. A.

Department(s)

Electrical and Computer Engineering

Degree Name

Ph. D. in Computer Engineering

Sponsor(s)

National Science Foundation (U.S.). Industry/University Cooperative Research Centers Program
Missouri University of Science and Technology. Intelligent Systems Center

Comments

This research was supported in part by NSF I/UCRC award IIP 1134721 and the Intelligent Systems Center.

Research Center/Lab(s)

Intelligent Systems Center

Publisher

Missouri University of Science and Technology

Publication Date

Spring 2019

Journal article titles appearing in thesis/dissertation

  • A hierarchical dimension reduction approach for big data with application to fault diagnostics
  • A multi-step nonlinear dimension-reduction approach with applications to big data
  • Direct error driven learning for deep neural networks with applications to big data
  • A game theoretic approach for addressing domain-shift with applications to big-data classification
  • Distributed minimax learning with deep sparse neural network with applications to high dimensional classification

Pagination

xv, 206 pages

Note about bibliography

Includes bibliographic references.

Rights

© 2019 Krishnan Raghavan, All rights reserved.

Document Type

Dissertation - Open Access

File Type

text

Language

English

Thesis Number

T 11722

Electronic OCLC #

1164805575
