The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Generalization

Abstract

We introduce four new real-world distribution shift datasets consisting of changes in image style, image blurriness, geographic location, camera operation, and more. With our new datasets, we take stock of previously proposed methods for improving out-of-distribution robustness and put them to the test. We find that using larger models and artificial data augmentations can improve robustness on real-world distribution shifts, contrary to claims in prior work. We find improvements in artificial robustness benchmarks can transfer to real-world distribution shifts, contrary to claims in prior work. Motivated by our observation that data augmentations can help with real-world distribution shifts, we also introduce a new data augmentation method which advances the state-of-the-art and outperforms models pretrained with 1000 times more labeled data. Overall we find that some methods consistently help with distribution shifts in texture and local image statistics, but these methods do not help with some other distribution shifts like geographic changes. Our results show that future research must study multiple distribution shifts simultaneously, as we demonstrate that no evaluated method consistently improves robustness.

Code Repositories

hendrycks/imagenet-r (official, PyTorch; mentioned in GitHub)

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| domain-generalization-on-imagenet-c | DeepAugment (ResNet-50) | mean Corruption Error (mCE): 60.4 |
| domain-generalization-on-imagenet-r | DeepAugment (ResNet-50) | Top-1 Error Rate: 57.8 |
| domain-generalization-on-imagenet-r | DeepAugment+AugMix (ResNet-50) | Top-1 Error Rate: 53.2 |
| domain-generalization-on-vizwiz | ResNet-50 (deepaugment) | Accuracy - All Images: 41.3; Accuracy - Clean Images: 46; Accuracy - Corrupted Images: 34.9 |
| domain-generalization-on-vizwiz | ResNet-50 (deepaugment+augmix) | Accuracy - All Images: 40.3; Accuracy - Clean Images: 44.5; Accuracy - Corrupted Images: 34.1 |
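
For readers unfamiliar with the mCE metric reported for ImageNet-C, the following is a minimal sketch of how it is commonly computed, assuming the standard ImageNet-C definition: for each corruption type, the model's error rates summed over severity levels are divided by a baseline model's summed error rates (AlexNet in the original benchmark), and the resulting ratios are averaged and expressed as a percentage. The function names, the dictionary layout, and the toy error rates are hypothetical and are not taken from this page or from the official repository.

```python
from statistics import mean


def corruption_error(model_errors, baseline_errors):
    """Corruption Error for one corruption type.

    Both arguments are lists of top-1 error rates (fractions in [0, 1]),
    one entry per severity level, in the same order.
    """
    if len(model_errors) != len(baseline_errors):
        raise ValueError("model and baseline need the same number of severities")
    return sum(model_errors) / sum(baseline_errors)


def mean_corruption_error(model, baseline):
    """mCE in percent: the average Corruption Error over all corruption types.

    `model` and `baseline` map a corruption name to its per-severity errors.
    """
    ratios = [corruption_error(model[name], baseline[name]) for name in model]
    return 100.0 * mean(ratios)


if __name__ == "__main__":
    # Toy numbers for illustration only (hypothetical, not measured results).
    toy_model = {"blur": [0.2, 0.4], "noise": [0.3, 0.3]}
    toy_baseline = {"blur": [0.4, 0.8], "noise": [0.6, 0.6]}
    print(mean_corruption_error(toy_model, toy_baseline))
```

Because each corruption is normalized by the baseline's error, a value below 100 means the model is more robust than the baseline on average, and lower is better.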
