HyperAIHyperAI

Command Palette

Search for a command to run...

3 months ago

Ensemble clustering based on evidence extracted from the co-association matrix

Abstract

The evidence accumulation model is an approach for collecting the information of base partitions in a clustering ensemble method, and can be viewed as a kernel transformation from the original data space to a co-association matrix. However, cluster structure information may be partially lost in this transformation; hence, some methods proposed in the literature try to find the lost information and return it to the ensemble process. In this paper, an interesting phenomenon is introduced: remove some evidences from the co-association matrix, which can result in more accurate clustering results. The intuitive explanation for this is that some evidences in the original co-association matrix could be noise, with negative effects on the final clustering. However, it is difficult to detect those evidences practically, let alone remove them from the matrix. To remedy this problem, we remove multiple level evidences having low occurrence frequencies, because negative evidences do not normally occur regularly in the base partitions. Subsequently, we use normalized cut to achieve multiple clustering results. To discriminate the optimal ensemble result, an internal validity index, which uses only the co-association matrix, is specially designed for the clustering ensemble. The experimental results on 16 datasets demonstrate that the proposed scheme outperforms some state-of-the-art clustering ensemble approaches.

Benchmarks

BenchmarkMethodologyMetrics
clustering-ensemble-on-ionosphereNegMM
Purity: 0.83
clustering-ensemble-on-pathbasedNegMM
Purity: 0.98

Build AI with AI

From idea to launch — accelerate your AI development with free AI co-coding, out-of-the-box environment and best price of GPUs.

AI Co-coding
Ready-to-use GPUs
Best Pricing
Get Started

Hyper Newsletters

Subscribe to our latest updates
We will deliver the latest updates of the week to your inbox at nine o'clock every Monday morning
Powered by MailChimp
Ensemble clustering based on evidence extracted from the co-association matrix | Papers | HyperAI