Yin Cui; Menglin Jia; Tsung-Yi Lin; Yang Song; Serge Belongie

Abstract
With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
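As a minimal sketch of the re-weighting scheme described above, the snippet below computes per-class weights from the effective number of samples $(1-\beta^{n})/(1-\beta)$ and normalizes them to sum to the number of classes, as the paper does. The helper name `class_balanced_weights`, the example class counts, and the choice $\beta = 0.999$ are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.999):
    """Per-class weights from the effective number of samples.

    Effective number: E_n = (1 - beta**n) / (1 - beta).
    The weight for each class is 1 / E_n, normalized so the
    weights sum to the number of classes.
    """
    samples_per_class = np.asarray(samples_per_class, dtype=np.float64)
    effective_num = (1.0 - np.power(beta, samples_per_class)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(samples_per_class) / weights.sum()

# Example (hypothetical counts): a long-tailed 3-class problem.
# Rare classes receive larger weights in the class-balanced loss.
print(class_balanced_weights([5000, 500, 50], beta=0.999))
```

These weights are then applied per class to a standard loss (e.g., softmax cross-entropy or focal loss) to obtain the class-balanced loss.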
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| image-classification-on-inaturalist-2018 | ResNet-152 | Top-1 Accuracy: 69.05% |
| image-classification-on-inaturalist-2018 | ResNet-101 | Top-1 Accuracy: 67.98% |
| image-classification-on-inaturalist-2018 | ResNet-50 | Top-1 Accuracy: 64.16% |
| long-tail-learning-on-cifar-10-lt-r-10 | Class-balanced Focal Loss | Error Rate: 12.90% |
| long-tail-learning-on-cifar-10-lt-r-10 | Class-balanced Re-weighting | Error Rate: 13.46% |
| long-tail-learning-on-cifar-100-lt-r-100 | Cross-Entropy (CE) | Error Rate: 61.68% |
| long-tail-learning-on-coco-mlt | CB Loss (ResNet-50) | Average mAP: 49.06% |
| long-tail-learning-on-egtea | CB Loss | Average Precision: 63.39%; Average Recall: 63.26% |
| long-tail-learning-on-voc-mlt | CB Focal (ResNet-50) | Average mAP: 75.24% |