Abstract

Cardiovascular disease (CVD) is a leading cause of global mortality, highlighting the need for accurate diagnostic methods. This study benchmarks centralized and federated learning (FL) algorithms for binary classification of heart disease using the UCI dataset, which includes 920 patient records from four hospitals in the USA, Hungary, and Switzerland. The benchmark is supported by Shapley-value and Local Interpretable Model-agnostic Explanations (LIME) interpretability analyses, which quantify the importance of each feature for classification. In the centralized setup, various classification algorithms are trained on pooled data, with the Naive Bayes classifier achieving the highest test accuracy of 81.1%. In the federated setup, FL algorithms with four clients (one per hospital) and various aggregation mechanisms are explored, leveraging the dataset's natural partition to enhance privacy without compromising accuracy. Among these, federated logistic regression achieves the highest test accuracy of 78.2%. The interpretability analysis aligns with existing medical knowledge of heart disease indicators. Overall, this study establishes a benchmark for efficient and interpretable prescreening tools for heart disease that preserve patients' privacy.
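To illustrate the federated setup described in the abstract, the following is a minimal sketch of federated logistic regression with four clients and sample-size-weighted averaging (FedAvg-style aggregation). It is not the authors' implementation: the abstract does not specify which aggregation mechanisms, hyperparameters, or preprocessing were used. The data are synthetic, the client sizes are arbitrary values chosen only so that they sum to 920, and all function names (`local_update`, `fedavg`, `make_synthetic_clients`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_update(w, X, y, lr=0.1, epochs=5):
    """Run a few full-batch gradient steps of logistic regression on one client."""
    w = w.copy()
    for _ in range(epochs):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w

def fedavg(clients, dim, rounds=50):
    """Aggregate client weights, weighting each client by its sample count."""
    w = np.zeros(dim)
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    for _ in range(rounds):
        # Each client trains locally; only weight vectors leave the client.
        local_ws = np.stack([local_update(w, X, y) for X, y in clients])
        w = (sizes[:, None] * local_ws).sum(axis=0) / sizes.sum()
    return w

def make_synthetic_clients(seed=0, sizes=(300, 290, 200, 130), dim=5):
    """Synthetic stand-in for four hospitals (assumption: NOT the UCI data)."""
    rng = np.random.default_rng(seed)
    true_w = rng.normal(size=dim)
    clients = []
    for n in sizes:
        X = rng.normal(size=(n, dim))
        y = (sigmoid(X @ true_w) > rng.uniform(size=n)).astype(float)
        clients.append((X, y))
    return clients, true_w

clients, true_w = make_synthetic_clients()
w = fedavg(clients, dim=5)

# Evaluate on the pooled synthetic data (for illustration only).
X_all = np.vstack([X for X, _ in clients])
y_all = np.concatenate([y for _, y in clients])
accuracy = ((sigmoid(X_all @ w) > 0.5) == y_all).mean()
print(f"pooled training accuracy: {accuracy:.3f}")
```

In this sketch the server only ever sees the clients' weight vectors, never their records, which is the privacy property the abstract refers to. Other aggregation rules would replace the weighted mean inside `fedavg`.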

Department(s)

Electrical and Computer Engineering

Keywords and Phrases

Centralized learning; Federated learning; Heart disease classification; Interpretability; Shapley values learning

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2025 Institute of Electrical and Electronics Engineers, All rights reserved.

Publication Date

01 Jan 2025
