Abstract

"The problem of missing data and imputation has been widely discussed amongst specialists. However, many data scientists and applied statisticians fail to appropriately consider this issue. Often, it seems intuitive to discard observations containing missing data or simply to substitute means. This can lead to disastrous consequences, particularly in an era of exponentially increasing data volumes. In the following, we show how inappropriate handling of missing data and an insufficient analysis of the censoring mechanism can lead to bias and overconfidence in the estimation of parameters, challenge the reproducibility of obtained results, and distort the structure of the dataset."--Background.
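
The two shortcuts named in the abstract, discarding incomplete observations and substituting means, can be illustrated with a small simulation. The sketch below is not taken from the poster: the simulated variables `x` and `y`, the rule that hides `y` whenever `x > 0`, and the function name `simulate` are all assumptions chosen for illustration. Under those assumptions it compares the mean and standard deviation of the complete data with the values obtained after deletion and after mean substitution.

```python
import random
import statistics


def simulate(n=20000, seed=0):
    """Compare complete-data statistics with deletion and mean substitution.

    Hypothetical setup (not from the poster): x is always observed,
    y = x + noise, and y is missing whenever x > 0.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [xi + rng.gauss(0.0, 1.0) for xi in x]

    # Deletion: keep only the rows where y was observed (x <= 0).
    observed = [yi for xi, yi in zip(x, y) if xi <= 0]

    # Mean substitution: fill every missing y with the observed mean.
    fill = statistics.fmean(observed)
    imputed = [yi if xi <= 0 else fill for xi, yi in zip(x, y)]

    return {
        "true_mean": statistics.fmean(y),
        "true_sd": statistics.pstdev(y),
        "deleted_mean": statistics.fmean(observed),
        "imputed_mean": statistics.fmean(imputed),
        "imputed_sd": statistics.pstdev(imputed),
    }


result = simulate()
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In this setup the mean after deletion is pulled well below the complete-data mean, because the rows that survive are not a random subset. Mean substitution inherits the same shifted mean and also shrinks the standard deviation, since half of the values are replaced by a single constant. A smaller spread of this kind is one way an analysis can become overconfident about its estimates.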

Meeting Name

Military Health System Research Symposium, MHSRS 2020

Department(s)

Mathematics and Statistics

Second Department

Electrical and Computer Engineering

Research Center/Lab(s)

Intelligent Systems Center

Second Research Center/Lab

Center for High Performance Computing Research

Comments

This research was sponsored by the Missouri University of Science and Technology Mary K. Finley Endowment and Intelligent Systems Center; the Army Research Laboratory (ARL) and the Lifelong Learning Machines program from DARPA/MTO, and it was accomplished under Cooperative Agreement Number W911NF-18-2-0260.

This abstract was accepted for a poster presentation at the Military Health System Research Symposium in August 2020.

Document Type

Poster

Document Version

Citation

File Type

text

Language(s)

English
