This paper investigates a computational model that uses machine learning techniques to systematically compare phenotype data with genotype data, specifically single nucleotide polymorphisms (SNPs), and to identify discriminant genotype markers associated with phenotypic subgroups. The proposed discriminant SNP identifier model is evaluated empirically on an Autism Spectrum Disorder (ASD) simplex sample. Six phenotype markers were selected to cluster the sample in a hexagonal lattice format, yielding five multidimensional subgroups based on the extremities of the phenotype markers. The SNP selection model combines random subspace selection of SNPs with feature selection algorithms to determine which SNPs discriminate among these five subgroups. This yielded a set of SNPs that attained a mean ROC performance of 95% using a Support Vector Machine prediction model. A biological analysis of these SNPs and their associated genes across the subgroups is presented to examine their potential clinical significance.
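The random subspace step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the feature selection algorithms, so a simple F-like between-group score stands in for them, and the Support Vector Machine and ROC evaluation stages are omitted. The function names, parameter values, and synthetic genotype data below are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def between_group_score(values, labels):
    """F-like score (assumed stand-in): between-group over within-group sum of squares."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = sum(values) / len(values)
    between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    within = sum(
        (value - sum(g) / len(g)) ** 2 for g in groups.values() for value in g
    )
    return between / (within + 1e-12)  # small constant avoids division by zero


def random_subspace_select(genotypes, labels, n_rounds=200, subspace_size=10,
                           top_k=2, seed=0):
    """Return (votes, appearances) per SNP index over random subspaces."""
    rng = random.Random(seed)
    n_snps = len(genotypes[0])
    columns = [[row[j] for row in genotypes] for j in range(n_snps)]
    # The score used here is univariate, so it is computed once per SNP.
    # A multivariate selector would have to be rerun inside each subspace.
    scores = [between_group_score(col, labels) for col in columns]
    votes, appearances = Counter(), Counter()
    for _ in range(n_rounds):
        subspace = rng.sample(range(n_snps), subspace_size)
        appearances.update(subspace)
        ranked = sorted(subspace, key=lambda j: scores[j], reverse=True)
        votes.update(ranked[:top_k])  # keep the top_k SNPs of this subspace
    return votes, appearances


def make_toy_data(n_groups=5, per_group=20, n_snps=30, seed=1):
    """Synthetic genotypes coded 0/1/2; only SNPs 0 and 1 depend on the subgroup."""
    rng = random.Random(seed)
    genotypes, labels = [], []
    for group in range(n_groups):
        for _ in range(per_group):
            row = [rng.choice([0, 1, 2]) for _ in range(n_snps)]
            row[0] = 2 if group == 0 else 0
            row[1] = 2 if group in (1, 2) else 0
            genotypes.append(row)
            labels.append(group)
    return genotypes, labels


if __name__ == "__main__":
    X, y = make_toy_data()
    votes, appearances = random_subspace_select(X, y)
    frequency = {j: votes[j] / appearances[j] for j in appearances}
    for j in sorted(frequency, key=frequency.get, reverse=True)[:5]:
        print(f"SNP {j}: selected in {frequency[j]:.0%} of its subspaces")
```

SNPs that are selected in a high fraction of the subspaces in which they appear are the candidates that would then be passed to a classifier for evaluation. Normalizing votes by appearances keeps the ranking from favoring SNPs that simply happened to be sampled more often.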
J. Zhao et al., "Genotype Combinations Linked To Phenotype Subgroups In Autism Spectrum Disorders," 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology, CIBCB 2019, article no. 8791461, Institute of Electrical and Electronics Engineers, Jul 2019.
The definitive version is available at https://doi.org/10.1109/CIBCB.2019.8791461
Electrical and Computer Engineering
Keywords and Phrases
autism spectrum disorder; clustering; feature selection; SNP analysis
Article - Conference proceedings
© 2023 Institute of Electrical and Electronics Engineers, All rights reserved.
01 Jul 2019