Abstract
Understanding the performance and validity of clustering algorithms is both challenging and crucial, particularly when clustering must be done online. Until recently, most validation methods have relied on batch calculation and have required considerable human expertise in their interpretation. Improving real-time performance and interpretability of cluster validation, therefore, continues to be an important theme in unsupervised learning. Building upon previous work on incremental cluster validity indices (iCVIs), this paper introduces the Meta-iCVI as a tool for explainable and concise labeling of partition quality in online clustering. Leveraging a time-series classifier and data-fusion techniques, the Meta-iCVI combines the outputs of multiple iCVIs to produce a streaming label of either 'over', 'under', or 'correctly' partitioned. Experiments were conducted on generalized synthetic and real-world data sets to demonstrate the efficacy and application of this method. Results of 100% accuracy were achieved in labeling partition quality on real-world data sets including MNIST and FLIR ADAS, demonstrating that the Meta-iCVI is a powerful and efficient tool for classifying partition quality in a variety of conditions. Its introduction should empower new and more efficient streaming clustering techniques. Additionally, we believe this to be the first implementation of an ensemble iCVI metric and the first time iCVI validation performance has been evaluated on randomized sample presentation.
Recommended Citation
N. M. Melton et al., "Meta-ICVI: Ensemble Validity Metrics For Concise Labeling Of Correct, Under- Or Over-Partitioning In Streaming Clustering," IEEE Access, vol. 12, pp. 11114–11124, Institute of Electrical and Electronics Engineers, Jan 2024.
The definitive version is available at https://doi.org/10.1109/ACCESS.2023.3346058
Department(s)
Electrical and Computer Engineering
Second Department
Computer Science
Publication Status
Open Access
Keywords and Phrases
classification; clustering; data streams; explainable AI; iCVI; incremental cluster validity index; online; streaming; time-series; validations
International Standard Serial Number (ISSN)
2169-3536
Document Type
Article - Journal
Document Version
Final Version
File Type
text
Language(s)
English
Rights
© 2024 The Authors, All rights reserved.
Creative Commons Licensing
This work is licensed under a Creative Commons Attribution 4.0 License.
Publication Date
01 Jan 2024