An Effective Dimension Reduction Approach to Chinese Document Classification using Genetic Algorithm
Abstract
Many methods have been proposed for Chinese document classification, but the high dimensionality of the feature vector remains one of their most significant limitations. This paper points out an important difference between Chinese and English document classification, and then proposes an efficient approach that uses a Genetic Algorithm to reduce the dimension of the feature vector in Chinese document classification. By selecting only the set of more "important" features, the proposed method significantly reduces the number of Chinese feature words. Experiments, combined with several related studies, show that the proposed method achieves a large reduction in dimension with little loss in classification accuracy.
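The general idea of selecting a feature subset with a Genetic Algorithm can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the chromosome encoding, fitness function, or operators, so everything below is an assumption. Each individual is taken to be a bit mask over features, the fitness rewards accuracy and penalizes the number of selected features, and a nearest-centroid classifier on synthetic data stands in for the SVM and the Chinese feature words. The names `make_data`, `accuracy`, `fitness`, and `select_features` are hypothetical.

```python
import random

def make_data(rng, n_docs=60, n_features=20, n_informative=4):
    """Synthetic stand-in for term-weight vectors (assumption): only the
    first n_informative features depend on the class label."""
    X, y = [], []
    for i in range(n_docs):
        label = i % 2
        row = [rng.random() for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 2.0 * label  # informative features shift with the class
        X.append(row)
        y.append(label)
    return X, y

def accuracy(mask, X, y):
    """Training accuracy of a nearest-centroid classifier (a simple
    stand-in for an SVM) restricted to the features selected by mask."""
    idx = [j for j, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in idx]
    correct = 0
    for x, t in zip(X, y):
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(idx, cents[c]))
                for c in cents}
        correct += (min(dist, key=dist.get) == t)
    return correct / len(y)

def fitness(mask, X, y, penalty=0.01):
    """Assumed fitness: accuracy minus a small cost per selected feature."""
    return accuracy(mask, X, y) - penalty * sum(mask)

def select_features(X, y, rng, pop_size=30, generations=40, p_mut=0.05):
    """Evolve bit masks with elitism, one-point crossover, bit-flip mutation."""
    n = len(X[0])
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda m: fitness(m, X, y), reverse=True)
        elite = pop[: pop_size // 2]          # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)       # two distinct parents
            cut = rng.randrange(1, n)         # one-point crossover position
            child = a[:cut] + b[cut:]
            child = [bit ^ (rng.random() < p_mut) for bit in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda m: fitness(m, X, y))

if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(rng)
    best = select_features(X, y, rng)
    print("selected features:", [j for j, bit in enumerate(best) if bit])
    print("accuracy on selected subset:", accuracy(best, X, y))
```

The per-feature penalty is what pushes the search toward smaller subsets. With the penalty set to zero, the sketch would have no incentive to drop uninformative features once accuracy saturates.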
Recommended Citation
Z. Guo et al., "An Effective Dimension Reduction Approach to Chinese Document Classification using Genetic Algorithm," Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5552 LNCS, no. PART 2, pp. 480-489, Springer, Jan 2009.
The definitive version is available at https://doi.org/10.1007/978-3-642-01510-6_55
Department(s)
Computer Science
Keywords and Phrases
Chinese Document Classification; Dimension Reduction; Genetic Algorithm (GA); Support Vector Machine (SVM)
International Standard Serial Number (ISSN)
0302-9743
Document Type
Book - Chapter
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2009 Springer, All rights reserved.
Publication Date
01 Jan 2009