The rise of on-demand healthcare and the unprecedented growth of electronic health records have created opportunities for big data analysis using machine learning. Managing massive and disparate data with conventional databases is challenging and expensive, and it often requires specialized analytical tools to develop advanced data-driven capabilities and perform data analytics. This paper explores the capability of Apache Spark, an open-source framework that processes large amounts of data on clusters of nodes, to analyze big data and integrate technologies that provide decision support systems in healthcare settings. We then propose machine learning models on top of Apache Spark to expedite decision-making in organ allocation, such as selecting a kidney for the right candidate, thus increasing donor utilization by locating a recipient within the allotted time. The proposed models help identify waitlisted candidates willing to accept kidneys that might otherwise be discarded.


Department(s)

Computer Science

Second Department

Engineering Management and Systems Engineering

Research Center/Lab(s)

Intelligent Systems Center


This work was supported by the Missouri University of Science and Technology and Saint Louis University.

Keywords and Phrases

Apache Spark; Big Data; Healthcare; Machine Learning; Organ Procurement

Document Type

Article - Journal

Document Version

Final Version

© 2021 The Authors, All rights reserved.

Creative Commons Licensing

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.

Publication Date

18 Jun 2021