A Hierarchical Dimension Reduction Approach for Big Data with Application to Fault Diagnostics


About four zettabytes of data, which falls into the category of big data, is generated by complex manufacturing systems annually. Big data can be utilized to improve the efficiency of an aging manufacturing system, provided that several challenges are handled. In this paper, a novel methodology is presented to detect faults in manufacturing systems while overcoming some of these challenges. Specifically, a generalized distance measure is proposed in conjunction with a novel hierarchical dimension reduction (HDR) approach. It is shown that HDR can tackle challenges that are frequently observed during distance calculation in big data scenarios, such as norm concentration, redundant dimensions, and non-invertible correlation matrices. Subsequently, a probabilistic methodology is developed for the isolation and detection of faults. Here, Edgeworth expansion-based expressions are derived to approximate the density function of the data. Simulation results on big data sets demonstrate that the dimension reduction methodology is efficient. It is shown that HDR is able to explain almost 90% of the total information. Furthermore, the proposed dimension reduction methodology is seen to outperform standard dimension reduction approaches and to improve the performance of standard classification methodologies in high-dimensional scenarios.
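Two of the challenges named in the abstract, norm concentration and a non-invertible correlation (or covariance) matrix, can be reproduced on synthetic data. The Python sketch below is only an illustration under stated assumptions: it is not the HDR approach or the generalized distance measure from the paper. The function names, the Gaussian test data, and the use of a pseudo-inverse as a workaround are all assumptions made for this example.

```python
import numpy as np


def relative_contrast(n_points, dim, rng):
    """Return (max - min) / min of Euclidean distances from a random query
    to n_points random Gaussian points in `dim` dimensions."""
    points = rng.standard_normal((n_points, dim))
    query = rng.standard_normal(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()


def mahalanobis_pinv(x, data):
    """Mahalanobis-type distance of x from the sample mean of `data`.

    Assumption for this sketch: the sample covariance may be singular,
    so a Moore-Penrose pseudo-inverse replaces the ordinary inverse.
    """
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)  # rows are samples, columns are dimensions
    cov_pinv = np.linalg.pinv(cov, rcond=1e-10, hermitian=True)
    diff = x - mean
    # Clip at zero to guard against tiny negative values from round-off.
    return float(np.sqrt(max(diff @ cov_pinv @ diff, 0.0)))


rng = np.random.default_rng(0)

# Norm concentration: the contrast between the nearest and farthest
# point shrinks as the dimension grows.
contrast_low = relative_contrast(500, 2, rng)
contrast_high = relative_contrast(500, 2000, rng)

# Non-invertible covariance: with fewer samples (20) than dimensions (50),
# the sample covariance matrix cannot have full rank.
data = rng.standard_normal((20, 50))
cov_rank = np.linalg.matrix_rank(np.cov(data, rowvar=False))
dist = mahalanobis_pinv(data[0], data)

print(f"relative contrast, 2 dims:    {contrast_low:.3f}")
print(f"relative contrast, 2000 dims: {contrast_high:.3f}")
print(f"covariance rank: {cov_rank} of 50")
print(f"pseudo-inverse Mahalanobis distance: {dist:.3f}")
```

The pseudo-inverse is one common workaround for a singular covariance matrix; the abstract indicates that the paper instead handles this case through dimension reduction.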


Department

Mathematics and Statistics

Second Department

Electrical and Computer Engineering

Research Center/Lab(s)

Intelligent Systems Center

Second Research Center/Lab

Center for High Performance Computing Research


This research was supported in part by NSF I/UCRC award IIP 1134721 and the Intelligent Systems Center.

Keywords and Phrases

Big Data; Classification; Distance Measure; Edgeworth Expansion; Fault Diagnosis; Mahalanobis Distance

Document Type

Article - Journal

© 2019 Elsevier Inc., All rights reserved.

Publication Date

01 Dec 2019