Guy Shiran, Daphna Weinshall

Abstract
Clustering unlabeled raw images is a daunting task that deep learning methods have recently approached with some success. Here we propose an unsupervised clustering framework that trains a deep neural network end to end and produces cluster assignments for images directly, without additional processing. Our method, Multi-Modal Deep Clustering (MMDC), trains a deep network to align its image embeddings with target points sampled from a Gaussian Mixture Model distribution. Cluster assignments are then determined by the mixture component with which each image embedding is associated. Simultaneously, the same network is trained on an additional self-supervised task, predicting image rotations. This pushes the network to learn more meaningful image representations, which in turn facilitate better clustering. Experimental results show that MMDC matches or exceeds state-of-the-art performance on six challenging benchmarks. On natural image datasets we improve on previous results by margins of up to 20 absolute accuracy points, reaching an accuracy of 82% on CIFAR-10, 45% on CIFAR-100 and 69% on STL-10.
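The three ingredients named in the abstract (GMM targets, assignment by mixture component, and the rotation task) can be illustrated with a small NumPy sketch. It is not the authors' implementation: the function names, the equal-weight isotropic mixture, the nearest-mean assignment rule, and the restriction to four 90-degree rotations of square images are all assumptions made for illustration.

```python
import numpy as np


def sample_gmm_targets(n, means, sigma, rng):
    """Draw n target points from an equal-weight, isotropic GMM (assumed form).

    Returns the targets and the index of the component each one came from.
    """
    k, d = means.shape
    components = rng.integers(0, k, size=n)
    targets = means[components] + sigma * rng.standard_normal((n, d))
    return targets, components


def assign_clusters(embeddings, means):
    """Assign each embedding to the component with the nearest mean.

    With equal weights and a shared isotropic covariance this is the same
    as picking the component with the highest posterior probability.
    """
    sq_dist = ((embeddings[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)


def rotation_batch(images):
    """Build the self-supervised rotation task for square images (n, h, h, c).

    Returns all images rotated by 0, 90, 180 and 270 degrees, stacked in
    that order, with the rotation index (0..3) as the label.
    """
    rotated = [np.rot90(images, k=r, axes=(1, 2)) for r in range(4)]
    labels = np.repeat(np.arange(4), len(images))
    return np.concatenate(rotated, axis=0), labels
```

In a training loop, a network would be optimized so that its embeddings move toward the sampled targets while a second head predicts the rotation label; the loss functions and the way embeddings are paired with targets are not specified in the abstract and are left out here.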
Benchmarks
| Benchmark | Method | Accuracy | NMI | Backbone |
|---|---|---|---|---|
| image-clustering-on-cifar-10 | MMDC | 0.820 | 0.703 | ResNet18 |
| image-clustering-on-cifar-100 | MMDC | 0.446 | 0.418 | not listed |
| image-clustering-on-imagenet-10 | MMDC | 0.811 | 0.719 | not listed |
| image-clustering-on-stl-10 | MMDC | 0.694 | 0.593 | ResNet18 |
| image-clustering-on-tiny-imagenet | MMDC | 0.119 | 0.274 | not listed |
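The page does not say how the accuracy figures were computed. Clustering accuracy is commonly obtained by finding the one-to-one mapping between predicted clusters and ground-truth classes that maximizes agreement, and the sketch below assumes that protocol. The function name `clustering_accuracy` is hypothetical; `linear_sum_assignment` is SciPy's solver for the assignment problem.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one cluster-to-class mapping.

    Assumes labels are non-negative integers; this is the common
    Hungarian-matching protocol, not necessarily the one used above.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    k = int(max(y_true.max(), y_pred.max())) + 1
    # counts[i, j] = number of samples in predicted cluster i with true class j
    counts = np.zeros((k, k), dtype=np.int64)
    np.add.at(counts, (y_pred, y_true), 1)
    # Choose the mapping that maximizes the number of matched samples.
    rows, cols = linear_sum_assignment(counts, maximize=True)
    return counts[rows, cols].sum() / y_true.size
```

Because the mapping is optimized, a prediction that recovers the true grouping under permuted cluster indices still scores 1.0.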