Class-Balanced Loss Based on Effective Number of Samples

Yin Cui; Menglin Jia; Tsung-Yi Lin; Yang Song; Serge Belongie

Abstract

With the rapid increase of large-scale, real-world datasets, it becomes critical to address the problem of long-tailed data distribution (i.e., a few classes account for most of the data, while most classes are under-represented). Existing solutions typically adopt class re-balancing strategies such as re-sampling and re-weighting based on the number of observations for each class. In this work, we argue that as the number of samples increases, the additional benefit of a newly added data point will diminish. We introduce a novel theoretical framework to measure data overlap by associating with each sample a small neighboring region rather than a single point. The effective number of samples is defined as the volume of samples and can be calculated by a simple formula $(1-\beta^{n})/(1-\beta)$, where $n$ is the number of samples and $\beta \in [0,1)$ is a hyperparameter. We design a re-weighting scheme that uses the effective number of samples for each class to re-balance the loss, thereby yielding a class-balanced loss. Comprehensive experiments are conducted on artificially induced long-tailed CIFAR datasets and large-scale datasets including ImageNet and iNaturalist. Our results show that when trained with the proposed class-balanced loss, the network is able to achieve significant performance gains on long-tailed datasets.
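
To make the scheme concrete, here is a minimal NumPy sketch of the class-balanced weighting described above. The function names, the example class counts, and the choice of $\beta$ are illustrative; the rescaling of the weights so they sum to the number of classes follows the paper's convention.

```python
import numpy as np

def class_balanced_weights(samples_per_class, beta=0.9999):
    """Per-class weights from the effective number of samples.

    effective_num[c] = (1 - beta**n_c) / (1 - beta), so each class weight
    is proportional to its inverse, (1 - beta) / (1 - beta**n_c). Weights
    are rescaled to sum to the number of classes, as in the paper.
    """
    samples_per_class = np.asarray(samples_per_class, dtype=np.float64)
    effective_num = (1.0 - np.power(beta, samples_per_class)) / (1.0 - beta)
    weights = 1.0 / effective_num
    return weights * len(weights) / weights.sum()

def class_balanced_softmax_ce(logits, labels, weights):
    """Softmax cross-entropy with each sample scaled by its class weight."""
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    per_sample = -log_probs[np.arange(len(labels)), labels]
    return (weights[labels] * per_sample).mean()

# Example: a long-tailed 3-class problem; rare classes get larger weights.
w = class_balanced_weights([5000, 500, 5], beta=0.999)
print(w)

logits = np.array([[2.0, 0.5, 0.1], [0.2, 1.5, 0.3]])
labels = np.array([0, 2])
print(class_balanced_softmax_ce(logits, labels, w))
```

The same per-class weights can also multiply other per-sample losses, such as sigmoid cross-entropy or focal loss; the class-balanced focal loss in the benchmarks below is obtained in this way.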

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| image-classification-on-inaturalist-2018 | ResNet-152 | Top-1 Accuracy: 69.05% |
| image-classification-on-inaturalist-2018 | ResNet-101 | Top-1 Accuracy: 67.98% |
| image-classification-on-inaturalist-2018 | ResNet-50 | Top-1 Accuracy: 64.16% |
| long-tail-learning-on-cifar-10-lt-r-10 | Class-balanced Focal Loss | Error Rate: 12.90 |
| long-tail-learning-on-cifar-10-lt-r-10 | Class-balanced Re-weighting | Error Rate: 13.46 |
| long-tail-learning-on-cifar-100-lt-r-100 | Cross-Entropy (CE) | Error Rate: 61.68 |
| long-tail-learning-on-coco-mlt | CB Loss (ResNet-50) | Average mAP: 49.06 |
| long-tail-learning-on-egtea | CB Loss | Average Precision: 63.39; Average Recall: 63.26 |
| long-tail-learning-on-voc-mlt | CB Focal (ResNet-50) | Average mAP: 75.24 |
