CRLEDD: Regularized Causalities Learning for Early Detection of Diseases using Electronic Health Record (EHR) Data
Abstract
The availability of Electronic Health Records (EHR) in health care settings has created substantial opportunities for early disease detection. While many supervised learning models have been adopted for EHR-based early disease detection, parameter learning in these models is an ill-posed inverse problem, which makes it difficult to improve their accuracy. In this paper, we propose CRLEDD (Causality-Regularized Learning for Early Detection of Disease), an algorithm that improves the performance of Linear Discriminant Analysis (LDA) on the diagnosis-frequency vector data representation. While most existing regularization methods exploit sparsity to improve detection performance, CRLEDD takes a different approach: it ensures that the sparsified precision matrix used in LDA is positive semi-definite, which distinguishes it from conventional regularization methods (e.g., L2 regularization). To achieve this goal, CRLEDD employs Graphical Lasso to estimate the precision matrix in ill-posed settings, enhancing the accuracy of LDA classifiers. We extensively evaluate CRLEDD on a large-scale real-world EHR dataset, predicting mental health disorders (e.g., depression and anxiety) among college students from 10 universities in the U.S. We compare CRLEDD with other regularized LDA and downstream classifiers. The results show that CRLEDD outperforms all baselines in terms of accuracy and F1 score.
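The core idea in the abstract, using a Graphical Lasso estimate of the precision matrix inside an LDA decision rule, can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic Poisson "diagnosis-frequency" data, the dimensions, the penalty `alpha=0.5`, the equal class priors, and the helper names `sample` and `predict` are all assumptions made for illustration. It relies on scikit-learn's `GraphicalLasso` estimator.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso

rng = np.random.default_rng(0)
# Hypothetical sizes: 50 diagnosis codes, only 20 training patients per class,
# so the pooled sample covariance is singular (the ill-posed setting).
p, n_train, n_test = 50, 20, 200

def sample(n, positive):
    """Draw synthetic diagnosis-frequency vectors (Poisson counts per code)."""
    rates = np.full(p, 2.0)
    if positive:
        rates[:5] = 5.0  # assumed: five codes occur more often in positive cases
    return rng.poisson(rates, size=(n, p)).astype(float)

X0, X1 = sample(n_train, False), sample(n_train, True)
mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)

# Pool the within-class scatter by centering each class on its own mean.
centered = np.vstack([X0 - mu0, X1 - mu1])

# Graphical Lasso: L1-penalized maximum-likelihood estimate of the precision matrix.
glasso = GraphicalLasso(alpha=0.5, assume_centered=True, max_iter=200).fit(centered)
theta = (glasso.precision_ + glasso.precision_.T) / 2  # remove numerical asymmetry

# Two-class LDA rule with equal priors: predict 1 when x @ w + b > 0.
w = theta @ (mu1 - mu0)
b = -0.5 * (mu1 + mu0) @ w

def predict(X):
    return (X @ w + b > 0).astype(int)

X_test = np.vstack([sample(n_test, False), sample(n_test, True)])
y_test = np.repeat([0, 1], n_test)
accuracy = (predict(X_test) == y_test).mean()
min_eig = np.linalg.eigvalsh(theta).min()
print(f"held-out accuracy: {accuracy:.3f}, smallest eigenvalue of precision: {min_eig:.4f}")
```

Printing the smallest eigenvalue gives a quick check that the estimated precision matrix is usable in the discriminant even though the sample covariance cannot be inverted directly.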
Recommended Citation
J. Bian et al., "CRLEDD: Regularized Causalities Learning for Early Detection of Diseases using Electronic Health Record (EHR) Data," IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 5, no. 4, pp. 541-553, article no. 9163317, Institute of Electrical and Electronics Engineers (IEEE), Aug 2021.
The definitive version is available at https://doi.org/10.1109/TETCI.2020.3010017
Department(s)
Engineering Management and Systems Engineering
Keywords and Phrases
Classification Algorithms; Detection Algorithms; Linear Discriminant Analysis
International Standard Serial Number (ISSN)
2471-285X
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2021 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
01 Aug 2021