Principal Component Analysis as an Integral Part of Data Mining in Health Informatics

Abstract

Linear and logistic regression are well-known data mining techniques; however, their ability to deal with interdependent variables is limited. Principal component analysis (PCA) is a prevalent data reduction tool that both transforms the data orthogonally and reduces its dimensionality. In this paper, we explore an adaptive hybrid approach in which PCA is used in conjunction with logistic regression to yield models that have both a better fit and a smaller set of factors than those produced by regression analysis alone. We use an example dataset from HealthData.gov to demonstrate the simplicity, applicability, and usability of our approach.
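
The general idea of pairing PCA with logistic regression can be illustrated with a minimal sketch. This is not the paper's method or data: the synthetic dataset, the choice of two components, the plain gradient-descent fit, and every function name below (pca_fit, pca_transform, fit_logistic, predict_proba) are assumptions made only for illustration. The sketch projects six interdependent variables onto two orthogonal principal components and then fits a logistic regression on those components.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the column means and the top principal axes of X (hypothetical helper)."""
    mean = X.mean(axis=0)
    # Rows of Vt are orthonormal principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def pca_transform(X, mean, components):
    """Project centered data onto the retained principal axes."""
    return (X - mean) @ components.T


def _with_intercept(Z):
    """Prepend a column of ones so the first weight acts as the intercept."""
    return np.hstack([np.ones((Z.shape[0], 1)), Z])


def fit_logistic(Z, y, lr=0.1, n_iter=2000):
    """Fit logistic regression by batch gradient descent on the mean log-loss."""
    A = _with_intercept(Z)
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ w)))
        w -= lr * A.T @ (p - y) / len(y)
    return w


def predict_proba(Z, w):
    """Predicted probability of the positive class."""
    return 1.0 / (1.0 + np.exp(-(_with_intercept(Z) @ w)))


# Synthetic stand-in data (an assumption, not the HealthData.gov dataset):
# six observed variables that are noisy linear mixes of two latent factors,
# so the observed variables are strongly interdependent.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 6))
y = (latent[:, 0] + 0.5 * latent[:, 1] > 0).astype(float)

# Step 1: reduce six correlated variables to two orthogonal components.
mean, components = pca_fit(X, n_components=2)
Z = pca_transform(X, mean, components)

# Step 2: fit the regression on the components instead of the raw variables.
w = fit_logistic(Z, y)
accuracy = np.mean((predict_proba(Z, w) > 0.5) == y)
print(f"components kept: {Z.shape[1]}, training accuracy: {accuracy:.2f}")
```

Because the component scores are mutually uncorrelated, the regression in this sketch sees no collinearity among its inputs, which is the limitation of plain regression that the abstract points to. How many components to keep, and how fit is compared against regression alone, are choices the sketch does not address.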

Meeting Name

31st International Conference on Computers and Their Applications, CATA 2016 (2016: Apr. 4-6, Las Vegas, NV)

Department(s)

Computer Science

Keywords and Phrases

Big data; Data handling; Data mining; Data reduction; Regression analysis; Data analytics; Health informatics; Hybrid approach; Integral part; Logistic regressions; Yield models; Principal component analysis; Big data analytics; Healthcare analytics

International Standard Book Number (ISBN)

978-1-5108-2252-8

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2016 International Society for Computers and Their Applications (ISCA), All rights reserved.

Publication Date

01 Apr 2016
