Mathematics and Statistics Faculty Research & Creative Works

Statistical Comparative Analysis and Evaluation of Validation Indices for Clustering Optimization

Thy Nguyen
Jason Viehman
Dacosta Yeboah
Gayla R. Olbricht, Missouri University of Science and Technology
Tayo Obafemi-Ajayi

Abstract

Clustering is a relevant exploratory tool for a broad range of machine learning applications, as it aids the identification of meaningful subgroups. For a given clustering algorithm, multiple partitions can be obtained on the same data set by varying algorithmic parameters. Internal validation indices provide a means to objectively evaluate how well the groupings obtained from a clustering configuration partition the data when no prior labeled data are available. This work presents a rigorous statistical evaluation framework that analyzes the performance of internal validation indices based on their correlation with external indices. A synthetic data generator that captures a wide range of complexity is proposed. Evaluation is conducted on a varied set of synthetic data types and real data sets to investigate the performance of the indices.
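The idea of scoring an internal index by its correlation with an external index can be illustrated with a small, self-contained sketch. This is not the authors' framework or data generator: the abstract does not name specific indices, so the silhouette width (internal), the adjusted Rand index (external), Spearman rank correlation, the toy k-means routine, and the three-blob data set below are all assumptions chosen for illustration, and every function name is hypothetical.

```python
import math
import random
from collections import Counter

def adjusted_rand_index(a, b):
    """External index: chance-corrected agreement between two labelings."""
    pairs = lambda m: m * (m - 1) / 2
    sum_ij = sum(pairs(c) for c in Counter(zip(a, b)).values())
    sum_a = sum(pairs(c) for c in Counter(a).values())
    sum_b = sum(pairs(c) for c in Counter(b).values())
    expected = sum_a * sum_b / pairs(len(a))
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_ij - expected) / (max_index - expected)

def silhouette(points, labels):
    """Internal index: mean silhouette width with Euclidean distance."""
    clusters = {}
    for p, l in zip(points, labels):
        clusters.setdefault(l, []).append(p)
    if len(clusters) < 2:
        return 0.0  # not defined for a single cluster; 0 is a placeholder
    total = 0.0
    for p, l in zip(points, labels):
        own = clusters[l]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for k, other in clusters.items() if k != l)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

def kmeans(points, k, rng, iters=50):
    """Toy Lloyd's algorithm; stands in for any clustering configuration."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels

# Assumed toy data: three Gaussian blobs with known labels.
rng = random.Random(0)
points, truth = [], []
for label, (cx, cy) in enumerate([(0, 0), (6, 0), (3, 5)]):
    for _ in range(30):
        points.append((rng.gauss(cx, 0.7), rng.gauss(cy, 0.7)))
        truth.append(label)

# Vary one algorithmic parameter (k) to obtain several partitions,
# then score each partition with both kinds of index.
internal, external = [], []
for k in range(2, 8):
    labels = kmeans(points, k, rng)
    internal.append(silhouette(points, labels))
    external.append(adjusted_rand_index(truth, labels))

rho = spearman(internal, external)
print(f"Spearman correlation between silhouette and ARI: {rho:.3f}")
```

A correlation near 1 would suggest that, on this toy data, ranking partitions by the internal index tends to agree with ranking them against the known labels. A single data set and a single index pair say little on their own, which is why the abstract describes evaluation across varied synthetic and real data sets.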

Recommended Citation

T. Nguyen et al., "Statistical Comparative Analysis and Evaluation of Validation Indices for Clustering Optimization," 2020 IEEE Symposium Series on Computational Intelligence, SSCI 2020, pp. 3081-3090, Institute of Electrical and Electronics Engineers (IEEE), Dec 2020.

The definitive version is available at https://doi.org/10.1109/SSCI47803.2020.9308412

Meeting Name

2020 IEEE Symposium Series on Computational Intelligence, SSCI (2020: Dec. 1-4, Canberra, ACT, Australia)

Department(s)

Mathematics and Statistics

Research Center/Lab(s)

Center for High Performance Computing Research

Keywords and Phrases

clustering; statistical analysis; validation indices

International Standard Book Number (ISBN)

978-172812547-3

Document Type

Article - Conference proceedings

Document Version

Citation

File Type

text

Language(s)

English

Publication Date

04 Dec 2020
