Image Classification

Image classification is a fundamental task in computer vision: assigning a label to an entire image so that the image as a whole is understood and categorized. It typically targets images dominated by a single object, and deep learning models now reach high accuracy on it. Applications include content recognition and scene understanding. At the instance level, classification becomes closely related to image retrieval, which searches a large database for images similar to a query.
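As a rough illustration of the two ideas in the description above, the sketch below assigns a label by taking the highest-scoring class, and performs instance-level retrieval by ranking stored feature vectors by cosine similarity to a query. It is a minimal sketch, not any listed model's method: the function names (`softmax`, `classify`, `retrieve`), the label set, the scores, and the feature vectors are all hypothetical, and a real system would obtain scores and features from a trained network.

```python
import math


def softmax(scores):
    """Turn raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify(scores, labels):
    """Return the most probable label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


def cosine(a, b):
    """Cosine similarity between two feature vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def retrieve(query, database, k=1):
    """Return the names of the k database entries most similar to the query.

    `database` is a list of (name, feature_vector) pairs.
    """
    ranked = sorted(database, key=lambda item: cosine(query, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


if __name__ == "__main__":
    # Hypothetical scores a model might output for one image.
    label, prob = classify([2.0, 0.5, -1.0], ["cat", "dog", "bird"])
    print(label, round(prob, 3))

    # Hypothetical feature vectors for a tiny image database.
    db = [("img_a", [0.9, 0.1]), ("img_b", [0.0, 1.0])]
    print(retrieve([1.0, 0.0], db, k=1))
```

Classification collapses the score vector to a single label, while retrieval keeps the feature vector and compares it against stored ones, which is why the two tasks meet once the labels are individual instances.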

DINOv2 (ViT-g/14, frozen model, linear eval)

EffNet-L2 (SAM)

µ2Net+ (ViT-L/16)

BiT-L (ResNet-152x4)

Branching/Merging CNN + Homogeneous Vector Capsules

Wide-ResNet-28-10

iNaturalist 2018

MAE (ViT-H, 448)

mini WebVision 1.0

ALIGN (50 hypers/task)

PreAct-ResNet18 + FMix

Model soups (ViT-G/14)

Kuzushiji-MNIST

Tiny ImageNet Classification

iNaturalist 2019

EMNIST-Balanced

WaveMixLite-128/7

ViT-Large/16 (384)

ViT-Large/16 (384)

ColonINST-v1 (Unseen)

ColonINST-v1 (Seen)

CurriculumNet (InceptionResNet-v2)

MAE (ViT-H, 448)

µ2Net+ (ViT-L/16)

VGG-5(Spinal FC)

Clothing1M (using clean data)

VIT-L/16 (Spinal FC, Background)

Heinsen Routing

µ2Net (ViT-L/16)

InternImage-H (CNN)

Tiered ImageNet 5-way (5-shot)

EGNN+Transduction

iWildCam2020-WILDS

Colored-MNIST (with spurious correlation)

Bamboo (ViTB/16)

Oxford-IIIT Pets

CeiT-S (384 finetune resolution)

Red MiniImageNet 20% label noise

Red MiniImageNet 40% label noise

Oxford-IIIT Pet Dataset

TWIST (ResNet-50)

Red MiniImageNet 80% label noise

Entropy-based Logic Explained Network

EfficientNet-B3

ObjectNet (Bounding Box)

CIFAR-10 (with noisy labels)

Places365-Standard

SWAG (ViT H/14)

ResNet-18 + Vision Eagle Attention

V-MoE-H/14 (Every-2)

Red MiniImageNet 60% label noise

Visual Wake Words

LRA-diffusion (CLIP ViT)

Malaria Dataset

kEffNet-B0 V2 16ch

Id Pattern Dataset

Imbalanced CUB-200-2011

Noisy MNIST (Contrast)

CIFAR-10, 40% Symmetric Noise

split CIFAR-100

Galaxy10 DECals

CIFAR-10, 60% Symmetric Noise

ObjectNet (ImageNet classes)

Diffusion Classifier (zero-shot)

SEER (RegNet10B)

CIFAR-10 (40 Labels, ImageNet-100 Unlabeled)

EfficientNet-L2-Ns

Fracture/Normal Shoulder Bone X-ray Images on MURA

Our Ensemble Learning-2

Intel Image Classification

SparseSwin with L2

Certificate Verification

CIFAR-100, 40% Symmetric Noise

Large Labelled Logo Dataset (L3D)

L3D_original_2level

ResNet-50 + UDA+AutoDropout

SEER (RegNet10B)

Noisy MNIST (Motion)

CIFAR-10 Image Classification

Noisy MNIST (AWGN)

CIFAR-100 (alpha=0, 20 clients per round)

WRN-28-2 + UDA+AutoDropout

RADAM (ConvNeXt-XL)

RGB Arabic Alphabet Sign Language (AASL) dataset

CIFAR-100, 60% Symmetric Noise

µ2Net (ViT-L/16)

NCT-CRC-HE-100K

MNIST-rot-12k (DA)

PDO-eConv (ours)

Model with negotiation paradigm

TransBoost-ResNet50

SEER (RegNet10B)

SqueezeNet + Simple Bypass

Fuzzy rank-based fusion of CNN models using Gompertz function

WaveMix-256/16 (level 2)

Training and validation dataset of capsule vision 2024 challenge.

BiomedCLIP+PubmedBERT

ImageNet-Sketch

µ2Net+ (ViT-L/16)

No Background RGB Arabic Alphabets Sign Language Dataset

AP-GeM (ResNet-101)

µ2Net+ (ViT-L/16)

ResNet-152 2x (RS training)

VizWiz-Classification

ImageNet-100 (Class-IL, 5T)

Stanford Online Products

WRN (N=28, k=10)

PASCAL VOC 2007

kMobileNet V3 Large 16ch

PDO-eConv (ours)

TransBoost-ResNet50

Deep regularization

Max Margin Contrastive

FMD (materials)

Flowers (Tensorflow)

CNN+ Wilson-Cowan model RNN

Split Fashion M-NIST

EnGraf-Net101 (G=4, H=1)

WRN (N=36, k=5)

touchtech/fashion-images-gender-age

ISIC 2018+Atlas Dermatology

New Plant Diseases Dataset