A Multi-Step Nonlinear Dimension-Reduction Approach with Applications to Big Data
Abstract
In this paper, a novel dimension-reduction approach is presented to overcome challenges such as nonlinear relationships, heterogeneity, and noisy dimensions. First, the p attributes in the data are organized into random groups. Next, to systematically remove redundant and noisy dimensions from the data, each group is independently mapped into a low-dimensional space via a parametric mapping. The group-wise transformation parameters are estimated using a low-rank approximation of distance covariance. The transformed attributes are then reorganized into groups based on the magnitude of their respective eigenvalues. This group-wise organization and reduction process is repeated until a user-defined criterion on the eigenvalues is satisfied. In addition, novel procedures are introduced to aggregate the transformation parameters when the data is available in batches. Overall performance is demonstrated through extensive simulation analysis on classification using 10 datasets.
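For readers unfamiliar with distance covariance, the quantity underlying the parameter estimation described above, the following is a minimal sketch of the standard sample (V-statistic) estimator built from double-centered pairwise distance matrices. It is an illustration only: it is not the low-rank approximation or the grouping procedure used in the paper, and the function names `distance_covariance_sq` and `_double_centered_distances` are hypothetical.

```python
import numpy as np


def _double_centered_distances(x):
    """Pairwise Euclidean distance matrix with row, column, and grand means removed."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D input as n samples of one attribute
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    return (
        d
        - d.mean(axis=0, keepdims=True)
        - d.mean(axis=1, keepdims=True)
        + d.mean()
    )


def distance_covariance_sq(x, y):
    """Squared sample distance covariance between samples x and y (same n)."""
    a = _double_centered_distances(x)
    b = _double_centered_distances(y)
    if a.shape != b.shape:
        raise ValueError("x and y must contain the same number of samples")
    # Average of the element-wise products, i.e. the sum divided by n^2.
    return float((a * b).mean())


if __name__ == "__main__":
    # Illustrative data (an assumption, not from the paper): y depends on x
    # only through a nonlinear relationship.
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=200)
    y = x ** 2
    print("Pearson correlation:", np.corrcoef(x, y)[0, 1])
    print("Squared distance covariance:", distance_covariance_sq(x, y))
```

Unlike Pearson correlation, the population distance covariance is zero only when the two variables are independent, which is why it is suited to the nonlinear relationships mentioned in the abstract.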
Recommended Citation
R. Krishnan et al., "A Multi-Step Nonlinear Dimension-Reduction Approach with Applications to Big Data," IEEE Transactions on Knowledge and Data Engineering, vol. 31, no. 12, pp. 2249-2261, Institute of Electrical and Electronics Engineers (IEEE), Dec 2019.
The definitive version is available at https://doi.org/10.1109/TKDE.2018.2876848
Department(s)
Mathematics and Statistics
Research Center/Lab(s)
Center for High Performance Computing Research
Keywords and Phrases
big-data; classification; dimension-reduction; distance covariance
International Standard Serial Number (ISSN)
1041-4347
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2021 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
01 Dec 2019