An Information-Theoretic-Cluster Visualization for Self-Organizing Maps
Abstract
Improved data visualization is a significant tool for enhancing cluster analysis. In this paper, an information-theoretic method for cluster visualization using self-organizing maps (SOMs) is presented. The information-theoretic visualization (IT-vis) has the same structure as the unified distance matrix, but instead of depicting Euclidean distances between adjacent neurons, it displays the similarity between the distributions associated with adjacent neurons. Each SOM neuron is associated with a subset of the data set. The cardinality of this subset controls the granularity of the IT-vis, and its first- and second-order statistics are computed and used to estimate the neuron's probability density function. These density estimates are used to calculate the similarity measure, which is based on Rényi's quadratic cross entropy and the cross information potential (CIP). The introduced visualizations combine the low computational cost and kernel estimation properties of the representative CIP with the data structure representation of a single-linkage-based grouping algorithm to generate an enhanced SOM-based visualization. The visual quality of the IT-vis is assessed by comparing it with other visualization methods on several real-world and synthetic benchmark data sets; this paper therefore also contains a significant literature survey. The experiments demonstrate the cluster-revealing capabilities of the IT-vis, in which cluster boundaries are sharply captured. Additionally, the information-theoretic visualizations are used to perform clustering of the SOM. The quality of the final partitions was evaluated using external validity indices; compared with the other methods, the IT-vis of large SOMs yielded the best results in this paper.
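The similarity computation described in the abstract can be illustrated with a minimal sketch. It assumes each neuron's data subset is summarized by a Gaussian built from its sample mean and covariance, for which the cross information potential has the closed form CIP = N(mu_a - mu_b; 0, Sigma_a + Sigma_b), and Rényi's quadratic cross entropy is its negative logarithm. The function names, the `reg` regularization term, and the Gaussian assumption itself are illustrative choices made here and are not taken from the paper, whose exact estimator may differ.

```python
import numpy as np


def renyi_quadratic_cross_entropy(x_a, x_b, reg=1e-6):
    """Renyi's quadratic cross entropy between two Gaussian estimates.

    x_a, x_b: arrays of shape (n_samples, n_features), the data subsets
    associated with two adjacent neurons (at least 2 samples each).
    reg: small diagonal term (an assumption of this sketch) that keeps
    the summed covariance invertible.
    """
    x_a = np.asarray(x_a, dtype=float)
    x_b = np.asarray(x_b, dtype=float)
    d = x_a.shape[1]

    # First-order statistics: sample means.
    diff = x_a.mean(axis=0) - x_b.mean(axis=0)

    # Second-order statistics: sample covariances (reshape handles d == 1).
    cov_a = np.cov(x_a, rowvar=False).reshape(d, d)
    cov_b = np.cov(x_b, rowvar=False).reshape(d, d)
    cov = cov_a + cov_b + reg * np.eye(d)

    # -log N(diff; 0, cov), evaluated in the log domain to avoid underflow.
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return 0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)


def cross_information_potential(x_a, x_b, reg=1e-6):
    """CIP under the same Gaussian assumption: exp(-cross entropy)."""
    return np.exp(-renyi_quadratic_cross_entropy(x_a, x_b, reg))
```

Under these assumptions, a larger cross entropy (smaller CIP) between the subsets of two adjacent neurons indicates less overlap between their estimated distributions, which is the kind of value a U-matrix-style grid would display at the boundary between them.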
Recommended Citation
L. E. Brito Da Silva and D. C. Wunsch, "An Information-Theoretic-Cluster Visualization for Self-Organizing Maps," IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 6, pp. 2595-2613, Institute of Electrical and Electronics Engineers (IEEE), Jun 2018.
The definitive version is available at https://doi.org/10.1109/TNNLS.2017.2699674
Department(s)
Electrical and Computer Engineering
Research Center/Lab(s)
Intelligent Systems Center
Second Research Center/Lab
Center for High Performance Computing Research
Keywords and Phrases
Benchmarking; Cluster analysis; Conformal mapping; Data visualization; Information theory; Neurons; Probability density function; Quality control; Visualization; Cluster visualization; Computational costs; External validities; Information potential; Second order statistics; Self organizing maps (SOMs); Synthetic benchmark; Visualization method; Self organizing maps; Clustering; Entropy; Review; Self-organizing feature maps; Survey
International Standard Serial Number (ISSN)
2162-237X; 2162-2388
Document Type
Article - Journal
Document Version
Citation
File Type
text
Language(s)
English
Rights
© 2018 Institute of Electrical and Electronics Engineers (IEEE), All rights reserved.
Publication Date
01 Jun 2018