EVA-CLIP: Improved Training Techniques for CLIP at Scale

Quan Sun; Yuxin Fang; Ledell Wu; Xinlong Wang; Yue Cao

Abstract

Contrastive language-image pre-training, CLIP for short, has gained increasing attention for its potential in various scenarios. In this paper, we propose EVA-CLIP, a series of models that significantly improve the efficiency and effectiveness of CLIP training. Our approach incorporates new techniques for representation learning, optimization, and augmentation, enabling EVA-CLIP to achieve superior performance compared to previous CLIP models with the same number of parameters but significantly smaller training costs. Notably, our largest 5.0B-parameter EVA-02-CLIP-E/14+ with only 9 billion seen samples achieves 82.0 zero-shot top-1 accuracy on ImageNet-1K val. A smaller EVA-02-CLIP-L/14+ with only 430 million parameters and 6 billion seen samples achieves 80.4 zero-shot top-1 accuracy on ImageNet-1K val. To facilitate open access and open research, we release the complete suite of EVA-CLIP to the community at https://github.com/baaivision/EVA/tree/master/EVA-CLIP.
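
The zero-shot top-1 accuracy figures quoted above follow the usual CLIP evaluation protocol: each image embedding is compared against one text embedding per class, and the class with the highest cosine similarity is taken as the prediction. The following is a minimal sketch of that metric, not the EVA-CLIP evaluation code. The function names and the toy 2-dimensional embeddings are hypothetical, and real embeddings would come from the model's image and text encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes a non-zero vector)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_predict(image_emb, class_text_embs):
    """Return the index of the class whose text embedding has the
    highest cosine similarity with the image embedding."""
    img = l2_normalize(image_emb)
    sims = [
        sum(a * b for a, b in zip(img, l2_normalize(txt)))
        for txt in class_text_embs
    ]
    return max(range(len(sims)), key=sims.__getitem__)


def top1_accuracy(image_embs, labels, class_text_embs):
    """Percentage of images whose top-ranked class matches the label."""
    correct = sum(
        zero_shot_predict(emb, class_text_embs) == label
        for emb, label in zip(image_embs, labels)
    )
    return 100.0 * correct / len(labels)


# Toy data (hypothetical): three classes, four images, 2-d embeddings.
class_text_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
image_embs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.5], [0.1, 0.9]]
labels = [0, 1, 2, 0]  # the last image is deliberately mislabeled

print(top1_accuracy(image_embs, labels, class_text_embs))
```

In practice the class text embeddings are typically built from prompt templates such as "a photo of a {class name}", and no labeled examples from the target dataset are used for training, which is what makes the evaluation zero-shot.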

Code Repositories

baaivision/eva (official, PyTorch)
Yui010206/CREMA (PyTorch, mentioned in GitHub)
jaehong31/raccoon (PyTorch, mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
|---|---|---|
| image-classification-on-objectnet | EVA-02-CLIP-E/14+ | Top-1 Accuracy: 79.6 |
| zero-shot-action-recognition-on-ucf101 | EVA-CLIP-E/14+ | Top-1 Accuracy: 83.1 |
| zero-shot-transfer-image-classification-on-1 | EVA-CLIP-E/14+ | Accuracy (Private): 82 |
| zero-shot-transfer-image-classification-on-17 | EVA-CLIP-E/14+ | Top-1 Accuracy: 94.9 |
| zero-shot-transfer-image-classification-on-3 | EVA-CLIP-E/14+ | Accuracy (Private): 75.7 |
| zero-shot-transfer-image-classification-on-4 | EVA-CLIP-E/14+ | Accuracy: 94.5 |
| zero-shot-transfer-image-classification-on-5 | EVA-CLIP-E/14+ | Accuracy (Private): 82.1 |
| zero-shot-transfer-image-classification-on-6 | EVA-CLIP-E/14+ | Accuracy (Private): 79.6 |
| zero-shot-transfer-image-classification-on-8 | EVA-CLIP-E/14+ | Accuracy (Private): 71.6 |
