Estimating human biogeographical ancestry from genomic information is an important problem with applications in population stratification, admixture mapping, forensic ancestry inference, and healthcare. Various studies have proposed panels of ancestry-informative single nucleotide polymorphisms (SNPs) for distinguishing between widely separated continental populations. There has been limited investigation into identifying SNP panels for sub-continental ancestry prediction, especially given the difficulty of identifying SNP markers that distinguish closely associated sub-populations, for instance within a single continent. In this study, we propose an ancestry-informative SNP selection algorithm that exploits the concept of random subspace projection using supervised learning. The proposed approach identifies small panels of useful SNPs for sub-continental-level ancestry classification. We show sub-continental-level classification results for all five continents in our dataset.
T. Toma et al., "Random Subspace Projection For Predicting Biogeographical Ancestry," Proceedings - 2018 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2018, pp. 1719 - 1725, article no. 8621222, Institute of Electrical and Electronics Engineers, Jan 2019.
The definitive version is available at https://doi.org/10.1109/BIBM.2018.8621222
Electrical and Computer Engineering
Keywords and Phrases
Ancestry Classification; DNA; Random Subspace Projection; Single Chromosome; SNP; SNP Selection
Article - Conference proceedings
© 2023 Institute of Electrical and Electronics Engineers, All rights reserved.
21 Jan 2019