Your Diffusion Model is Secretly a Zero-Shot Classifier

Alexander C. Li; Mihir Prabhudesai; Shivam Duggal; Ellis Brown; Deepak Pathak

Abstract

The recent wave of large-scale text-to-image diffusion models has dramatically increased our text-based image generation abilities. These models can generate realistic images for a staggering variety of prompts and exhibit impressive compositional generalization abilities. Almost all use cases thus far have solely focused on sampling; however, diffusion models can also provide conditional density estimates, which are useful for tasks beyond image generation. In this paper, we show that the density estimates from large-scale text-to-image diffusion models like Stable Diffusion can be leveraged to perform zero-shot classification without any additional training. Our generative approach to classification, which we call Diffusion Classifier, attains strong results on a variety of benchmarks and outperforms alternative methods of extracting knowledge from diffusion models. Although a gap remains between generative and discriminative approaches on zero-shot recognition tasks, our diffusion-based approach has significantly stronger multimodal compositional reasoning ability than competing discriminative approaches. Finally, we use Diffusion Classifier to extract standard classifiers from class-conditional diffusion models trained on ImageNet. Our models achieve strong classification performance using only weak augmentations and exhibit qualitatively better "effective robustness" to distribution shift. Overall, our results are a step toward using generative over discriminative models for downstream tasks. Results and visualizations at https://diffusion-classifier.github.io/
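
The classification idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the decision rule is to pick the class whose conditional noise prediction has the lowest average squared error over randomly sampled timesteps and noise draws, with the same draws reused for every class. The names `make_gaussian_eps_model` and `diffusion_classify` are hypothetical, and the noise predictor is an analytic stand-in for toy Gaussian classes, not Stable Diffusion.

```python
import numpy as np


def make_gaussian_eps_model(class_means, sigma):
    """Hypothetical stand-in for a trained class-conditional denoiser.

    Assumes each class c has data x0 ~ N(class_means[c], sigma^2 I), for which
    the optimal noise prediction E[eps | x_t, c] has a closed form.
    """
    def eps_model(x_t, alpha_bar, c):
        # Variance of x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps
        var = alpha_bar * sigma**2 + (1.0 - alpha_bar)
        centered = x_t - np.sqrt(alpha_bar) * class_means[c]
        return np.sqrt(1.0 - alpha_bar) * centered / var
    return eps_model


def diffusion_classify(x0, eps_model, num_classes, alpha_bars, n_samples=256, seed=0):
    """Return (predicted class, mean noise-prediction error per class)."""
    rng = np.random.default_rng(seed)
    # Sample (timestep, noise) pairs once and share them across all classes.
    t_idx = rng.integers(0, len(alpha_bars), size=n_samples)
    noise = rng.standard_normal((n_samples,) + x0.shape)
    errors = np.zeros(num_classes)
    for t, eps in zip(t_idx, noise):
        a = alpha_bars[t]
        x_t = np.sqrt(a) * x0 + np.sqrt(1.0 - a) * eps  # forward noising step
        for c in range(num_classes):
            errors[c] += np.sum((eps - eps_model(x_t, a, c)) ** 2)
    errors /= n_samples
    # Lower error means the class condition explains the input better.
    return int(np.argmin(errors)), errors


# Toy usage: three well-separated 2-D classes and a point close to class 1.
class_means = np.array([[-3.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
eps_model = make_gaussian_eps_model(class_means, sigma=0.5)
alpha_bars = np.linspace(0.99, 0.01, 50)  # assumed noise schedule
pred, errors = diffusion_classify(np.array([2.8, 0.1]), eps_model, 3, alpha_bars)
print(pred, errors)
```

With a real text-to-image model, `eps_model` would be replaced by the model's noise prediction conditioned on a text prompt per class, and each evaluation would cost one network forward pass, so the number of sampled pairs becomes the main compute knob.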

Code Repositories

SamsungSAILMontreal/ForestDiffusion (PyTorch; mentioned in GitHub)
diffusion-classifier/diffusion-classifier (Official; PyTorch; mentioned in GitHub)
tajamul21/fate (TensorFlow; mentioned in GitHub)
LiYinqi/DIVE (PyTorch; mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
domain-generalization-on-imagenet-a | Diffusion Classifier | Top-1 accuracy %: 30.2
fine-grained-image-classification-on-fgvc | Diffusion Classifier (zero-shot) | Accuracy: 26.4
image-classification-on-cifar-10 | Diffusion Classifier (zero-shot) | Percentage correct: 88.5
image-classification-on-flowers-102 | Diffusion Classifier (zero-shot) | Per-Class Accuracy: 66.3
image-classification-on-imagenet | Diffusion Classifier | Top 1 Accuracy: 79.1%
image-classification-on-objectnet-imagenet | Diffusion Classifier | Top 1 Accuracy: 33.9
image-classification-on-objectnet-imagenet | Diffusion Classifier (zero-shot) | Top 1 Accuracy: 43.4
image-classification-on-oxford-iiit-pets-1 | Diffusion Classifier (zero-shot) | Per-Class Accuracy: 87.3
image-classification-on-stl-10 | Diffusion Classifier (zero-shot) | Percentage correct: 95.4
visual-reasoning-on-winoground | Diffusion Classifier (zero-shot) | Text Score: 34.00
zero-shot-transfer-image-classification-on-1 | Diffusion Classifier (zero-shot) | Accuracy (Private): 61.4
zero-shot-transfer-image-classification-on-17 | Diffusion Classifier (zero-shot) | Top 1 Accuracy: 77.7
