Abstract
The importance of gene expression data in cancer diagnosis and treatment has been widely recognized by cancer researchers in recent years. However, one of the major challenges in the computational analysis of such data is the curse of dimensionality: the number of gene expression measurements far exceeds the number of samples. Here, we use a two-step method to reduce the dimension of gene expression data. First, we extract a subset of genes based on the statistical characteristics of their corresponding gene expression measurements. For further dimensionality reduction, we then apply diffusion maps to the reduced data; diffusion maps interpret the eigenfunctions of Markov matrices as a system of coordinates on the original data set in order to obtain an efficient representation of the data's geometric descriptions. A neural network clustering theory, Fuzzy ART, is then applied to the resulting data to generate clusters of cancer samples. Experimental results on the small round blue-cell tumor (SRBCT) data set, compared with those of other widely used clustering algorithms, demonstrate the effectiveness of our proposed method in addressing multidimensional gene expression data.
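As an illustration of the two reduction steps described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes variance as the filtering statistic and a Gaussian kernel with a hand-picked bandwidth `epsilon`; the abstract specifies neither. The function names and the toy data are hypothetical. The sketch stops at the Markov matrix: a full diffusion map would take the leading non-trivial eigenvectors of that matrix as the new coordinates, and the clustering step (Fuzzy ART) is omitted.

```python
import math
from statistics import pvariance


def filter_genes_by_variance(samples, keep):
    """Return the indices of the `keep` genes with the largest variance.

    `samples` is a list of rows, one row of expression values per sample.
    Variance is an assumed filtering statistic, chosen for illustration.
    """
    n_genes = len(samples[0])
    variances = [pvariance([row[g] for row in samples]) for g in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:keep])


def markov_matrix(points, epsilon):
    """Row-normalize a Gaussian kernel into a Markov transition matrix.

    Entry (i, j) is the probability of stepping from sample i to sample j
    in a random walk over the data; every row sums to 1.
    """
    matrix = []
    for x in points:
        kernel_row = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / epsilon)
            for y in points
        ]
        total = sum(kernel_row)
        matrix.append([value / total for value in kernel_row])
    return matrix


# Hypothetical toy data: 4 samples (rows) by 3 genes (columns).
samples = [
    [1.0, 5.0, 0.2],
    [1.1, 9.0, 0.1],
    [0.9, 1.0, 3.0],
    [1.0, 4.0, 2.9],
]

# Step 1: keep the two most variable genes.
kept = filter_genes_by_variance(samples, keep=2)
reduced = [[row[g] for g in kept] for row in samples]

# Step 2 (partial): build the Markov matrix that a diffusion map decomposes.
P = markov_matrix(reduced, epsilon=10.0)
```

In practice the eigendecomposition of `P` would typically be delegated to a numerical library, and the bandwidth `epsilon` would be tuned to the scale of the data rather than fixed by hand.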
Recommended Citation
R. Xu et al., "Clustering of High-Dimensional Gene Expression Data with Feature Filtering Methods and Diffusion Maps," Proceedings of the International Conference on Biomedical Engineering and Informatics, 2008, Institute of Electrical and Electronics Engineers (IEEE), May 2008.
The definitive version is available at https://doi.org/10.1109/BMEI.2008.256
Meeting Name
International Conference on Biomedical Engineering and Informatics, 2008
Department(s)
Electrical and Computer Engineering
Second Department
Computer Science
Sponsor(s)
Mary K. Finley Missouri Endowment
National Science Foundation (U.S.)
Keywords and Phrases
Eigenvalues and Eigenfunctions; Medical Computing; Patient Diagnosis; Pattern Clustering; Cancer; Fuzzy systems; Genetics; Markov processes; Tumors
Document Type
Article - Conference proceedings
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2008 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
01 May 2008