Abstract

This paper investigates a computational model that uses machine learning techniques to systematically compare phenotype data with genotype data, specifically Single Nucleotide Polymorphisms (SNPs), in order to identify discriminant genotype markers associated with phenotypic subgroups. The proposed discriminant SNP identifier model is empirically evaluated on an Autism Spectrum Disorder (ASD) simplex sample. Six phenotype markers were selected to cluster the sample in a hexagonal lattice format, yielding five multidimensional subgroups based on extremities of the phenotype markers. The SNP selection model combines random subspace selection of SNPs with feature selection algorithms to determine which set of SNPs was discriminant among these five subgroups. This yielded a set of SNPs that attained a mean ROC performance of 95% using a Support Vector Machine prediction model. Biological analysis of these SNPs and their associated genes across the subgroups is presented to examine their potential clinical significance.
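The random subspace selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the ANOVA-style score, the function names (`anova_style_score`, `random_subspace_selection`, `make_toy_data`), all parameter values, and the synthetic genotype data are assumptions chosen for illustration, and the Support Vector Machine and ROC evaluation stages are omitted. The sketch repeatedly draws a random subset of SNPs, ranks the SNPs within each subset by how well they separate the subgroups, and reports how often each SNP lands among the top-ranked ones.

```python
import random
from collections import Counter
from statistics import mean


def anova_style_score(values, labels):
    """Between-group over within-group sum of squares for one SNP column."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    grand_mean = mean(values)
    between = 0.0
    within = 0.0
    for members in groups.values():
        group_mean = mean(members)
        between += len(members) * (group_mean - grand_mean) ** 2
        within += sum((v - group_mean) ** 2 for v in members)
    return between / (within + 1e-9)  # small constant avoids division by zero


def random_subspace_selection(X, y, n_rounds=200, subspace_size=25, top_k=2, seed=0):
    """Return {snp_index: fraction of its draws in which it ranked in the top_k}."""
    rng = random.Random(seed)
    n_snps = len(X[0])
    # The score is univariate, so each column is scored once up front.
    scores = [anova_style_score([row[j] for row in X], y) for j in range(n_snps)]
    drawn = Counter()
    selected = Counter()
    for _ in range(n_rounds):
        subspace = rng.sample(range(n_snps), subspace_size)  # random SNP subset
        drawn.update(subspace)
        ranked = sorted(subspace, key=lambda j: scores[j], reverse=True)
        selected.update(ranked[:top_k])
    return {j: selected[j] / drawn[j] for j in drawn}


def make_toy_data(n_per_group=20, n_groups=5, n_snps=50, seed=1):
    """Synthetic genotypes coded 0/1/2; only SNPs 0 and 1 depend on the subgroup."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_groups):
        for _ in range(n_per_group):
            row = [rng.choice([0, 1, 2]) for _ in range(n_snps)]
            if rng.random() > 0.1:  # roughly 10% of samples keep a random genotype
                row[0] = 2 if label < 2 else 0
            if rng.random() > 0.1:
                row[1] = label % 3
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
rates = random_subspace_selection(X, y)
top_snps = sorted(rates, key=rates.get, reverse=True)[:2]
print(top_snps)
```

In a fuller pipeline, the SNPs with the highest selection rates would presumably be passed to a classifier and assessed with ROC analysis; a multivariate feature selection method could also replace the univariate score used here.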

Department(s)

Electrical and Computer Engineering

Comments

Simons Foundation Autism Research Initiative

Keywords and Phrases

autism spectrum disorder; clustering; feature selection; SNP analysis

International Standard Book Number (ISBN)

978-172811462-0

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Rights

© 2023 Institute of Electrical and Electronics Engineers, All rights reserved.

Publication Date

01 Jul 2019
