Keywords and Phrases
Big Data; Classification; Deep Learning
"In this digital age, big-data sets are commonly found in the field of healthcare, manufacturing and others where sustainable analysis is necessary to create useful information. Big-data sets are often characterized by high-dimensionality and massive sample size. High dimensionality refers to the presence of unwanted dimensions in the data where challenges such as noise, spurious correlation and incidental endogeneity are observed. Massive sample size, on the other hand, introduces the problem of heterogeneity because complex and unstructured data types must analyzed. To mitigate the impact of these challenges while considering the application of classification, a two step analysis approach is introduced where the first step is that of dimension-reduction and the second step is classification using deep neural networks (DNN).
First, multi-step dimension-reduction approaches are developed where both linear and nonlinear relationships among dimensions are considered. The dimensions are first grouped, and each group is then transformed. These two stages are repeated for multiple steps to achieve sufficient dimension-reduction. Novel singular value-based criteria are defined to create groupings, determine the number of steps, and control information loss. Mitigation of noisy dimensions with reduced computational complexity is demonstrated.
Subsequently, in the second part of the dissertation, it is shown that heterogeneity can increase generalization error in DNN-based classification. To mitigate this effect, novel learning frameworks are developed where generalization cost is approximated and then minimized during learning. A direct error-driven learning scheme and an additional variables-based distributed learning regime are introduced to train the DNN in the presence of big-data challenges. Efficient DNN learning is demonstrated while mitigating vanishing gradients and challenges due to sparse non-convex optimization. Simulation results using several big-data sets validate the theoretical results"--Abstract, page iv.
Sarangapani, Jagannathan, 1965-
Wunsch, Donald C.
Zawodniok, Maciej Jan, 1975-
Stanley, R. Joe
Samaranayake, V. A.
Electrical and Computer Engineering
Ph. D. in Computer Engineering
National Science Foundation (U.S.). Industry/University Cooperative Research Centers Program
Missouri University of Science and Technology. Intelligent Systems Center
Intelligent Systems Center
Missouri University of Science and Technology
Journal article titles appearing in thesis/dissertation
- A hierarchical dimension reduction approach for big data with application to fault diagnostics
- A multi-step nonlinear dimension-reduction approach with applications to big data
- Direct error driven learning for deep neural networks with applications to big data
- A game theoretic approach for addressing domain-shift with applications to big-data classification
- Distributed minimax learning with deep sparse neural network with applications to high dimensional classification
xv, 206 pages
© 2019 Krishnan Raghavan, All rights reserved.
Dissertation - Open Access
Raghavan, Krishnan, "Deep neural network learning-based classifier design for big-data analytics" (2019). Doctoral Dissertations. 2895.