Masked Autoencoders Are Scalable Vision Learners

Kaiming He, Xinlei Chen, Saining Xie, Yanghao Li, Piotr Dollár, Ross Girshick

Abstract

This paper shows that masked autoencoders (MAE) are scalable self-supervised learners for computer vision. Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels. It is based on two core designs. First, we develop an asymmetric encoder-decoder architecture, with an encoder that operates only on the visible subset of patches (without mask tokens), along with a lightweight decoder that reconstructs the original image from the latent representation and mask tokens. Second, we find that masking a high proportion of the input image, e.g., 75%, yields a nontrivial and meaningful self-supervisory task. Coupling these two designs enables us to train large models efficiently and effectively: we accelerate training (by 3x or more) and improve accuracy. Our scalable approach allows for learning high-capacity models that generalize well: e.g., a vanilla ViT-Huge model achieves the best accuracy (87.8%) among methods that use only ImageNet-1K data. Transfer performance in downstream tasks outperforms supervised pre-training and shows promising scaling behavior.
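The random masking described above can be sketched in a few lines. This is a minimal illustration, not the official implementation: it uses numpy instead of PyTorch, and the function name `random_masking` and the per-patch random-score trick (argsort of noise, then the inverse permutation to restore order) mirror the common way this step is written, under the assumption that patches are already flattened to a `(num_patches, dim)` array.

```python
import numpy as np

def random_masking(patches, mask_ratio=0.75, rng=None):
    """Randomly mask patches: keep a (1 - mask_ratio) subset for the encoder.

    patches: (num_patches, dim) array of flattened image patches.
    Returns (visible, ids_restore, mask) where mask is 1 for removed patches
    and ids_restore undoes the shuffle so the decoder can re-insert mask tokens.
    """
    rng = rng or np.random.default_rng(0)
    n = patches.shape[0]
    n_keep = int(n * (1 - mask_ratio))

    noise = rng.random(n)                  # one random score per patch
    ids_shuffle = np.argsort(noise)        # ascending: lowest scores are kept
    ids_restore = np.argsort(ids_shuffle)  # inverse permutation

    ids_keep = ids_shuffle[:n_keep]
    visible = patches[ids_keep]            # encoder sees only visible patches

    mask = np.ones(n)
    mask[:n_keep] = 0                      # 0 = kept, 1 = masked (shuffled order)
    mask = mask[ids_restore]               # back to original patch order
    return visible, ids_restore, mask

# Example with ViT-B/16-style sizes: a 224x224 image gives 14x14 = 196 patches.
patches = np.zeros((196, 768))
visible, ids_restore, mask = random_masking(patches, mask_ratio=0.75)
print(visible.shape)    # (49, 768): only 25% of patches reach the encoder
print(int(mask.sum()))  # 147 patches are masked and must be reconstructed
```

This is where the training speedup comes from: with a 75% mask ratio the (large) encoder processes only a quarter of the tokens, and the full-length sequence, with mask tokens re-inserted via `ids_restore`, is handled only by the lightweight decoder.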

Code Repositories

islamosmanubc/MedMAE (pytorch)
keytoyze/visionts (pytorch)
alicebizeul/pmae (pytorch)
xplip/pixel (pytorch)
qiaopTDUN/mae-repo (pytorch)
guilk/vlc (pytorch)
lightly-ai/lightly (pytorch)
Nullius-2020/MAE-Paddle (paddle)
facebookresearch/vip-mae (pytorch)
aHapBean/PCP-MAE (pytorch)
Westlake-AI/openmixup (pytorch)
Ugenteraan/Masked-AutoEncoder-PyTorch (pytorch)
FlyEgle/MAE-pytorch (pytorch)
leaplabthu/efficienttrain (pytorch)
zinengtang/tvlt (pytorch)
zhangq327/u-mae (pytorch)
SnailDev/github-hot-hub (pytorch)
pengzhiliang/MAE-pytorch (pytorch)
dispink/xpt (pytorch)
virajprabhu/pacmac (pytorch)
BUPT-PRIV/MAE-priv (pytorch)
yifanzhang-pro/m-mae (pytorch)
mx-mark/videotransformer-pytorch (pytorch)
facebookresearch/hiera (pytorch)
oneflow-inc/libai
facebookresearch/mae (official, pytorch)
innat/VideoMAE (tf)
2020132075/conmae (pytorch)
dravenww/curated-article (tf)
DarshanDeshpande/jax-models (jax)
bwconrad/masked-autoencoder (pytorch)
liujiyuan13/MAE-code (pytorch)
IcarusWizard/MAE (pytorch)
isaaccorley/hydro-foundation-model (pytorch)
kit-mrt/masked-fusion-360 (pytorch)
dominickrei/limited-data-vits (pytorch)
wangsr126/mae-lite (pytorch)
nasa-impact/hls-foundation-os (pytorch)
hkbu-vscomputing/2022_mm_dmae-mocap (pytorch)
yangsun22/tc-moa (pytorch)
lonnyzhang423/github-hot-hub (pytorch)
open-mmlab/mmselfsup (pytorch)
0jason000/mae_vit (mindspore)
facebookresearch/multimodal (pytorch)

Benchmarks

Benchmark | Methodology | Metrics
domain-generalization-on-imagenet-a | MAE (ViT-H, 448) | Top-1 accuracy %: 76.7
domain-generalization-on-imagenet-c | MAE (ViT-H) | Number of params: 632M; mean Corruption Error (mCE): 33.8
domain-generalization-on-imagenet-r | MAE (ViT-H, 448) | Top-1 Error Rate: 33.5
domain-generalization-on-imagenet-sketch | MAE (ViT-H, 448) | Top-1 accuracy: 50.9
image-classification-on-imagenet | MAE (ViT-L) | Top 1 Accuracy: 85.9%
image-classification-on-imagenet | MAE (ViT-H, 448) | Number of params: 656M; Top 1 Accuracy: 87.8%
image-classification-on-imagenet | MAE (ViT-L) | Top 1 Accuracy: 83.6%
image-classification-on-imagenet | MAE (ViT-H) | Top 1 Accuracy: 86.9%
image-classification-on-inaturalist | MAE (ViT-H, 448) | Top 1 Accuracy: 83.4
image-classification-on-inaturalist-2018 | MAE (ViT-H, 448) | Top-1 Accuracy: 86.8%
image-classification-on-inaturalist-2019 | MAE (ViT-H, 448) | Top-1 Accuracy: 88.3
image-classification-on-omnibenchmark | MAE | Average Top-1 Accuracy: 30.6
image-classification-on-places205 | MAE (ViT-H, 448) | Top 1 Accuracy: 66.8
image-classification-on-places365-standard | MAE (ViT-H, 448) | Top 1 Accuracy: 60.3
object-detection-on-coco-minival | MAE (ViT-L, Mask R-CNN) | box AP: 53.3
object-detection-on-coco-minival | MAE (ViT-B, Mask R-CNN) | box AP: 50.3
self-supervised-image-classification-on | MAE (ViT-B) | Number of Params: 80M; Top 1 Accuracy: 68.0%
self-supervised-image-classification-on | MAE (ViT-L) | Number of Params: 306M; Top 1 Accuracy: 75.8%
self-supervised-image-classification-on | MAE (ViT-H) | Number of Params: 700M; Top 1 Accuracy: 76.6%
self-supervised-image-classification-on-1 | MAE (ViT-H/14) | Top 1 Accuracy: 86.9%
self-supervised-image-classification-on-1 | MAE (ViT-H/14, 448) | Number of Params: 632M; Top 1 Accuracy: 87.8%
semantic-segmentation-on-ade20k | MAE (ViT-B, UperNet) | Validation mIoU: 48.1
semantic-segmentation-on-ade20k | MAE (ViT-L, UperNet) | Validation mIoU: 53.6
semantic-segmentation-on-imagenet-s | MAE (ViT-B/16, 224x224, SSL+FT) | mIoU (test): 60.2; mIoU (val): 61.0
semantic-segmentation-on-imagenet-s | MAE (ViT-B/16, 224x224, SSL) | mIoU (test): 37.0; mIoU (val): 38.3
semantic-segmentation-on-imagenet-s | MAE (ViT-B/16, 224x224, SSL, mmseg) | mIoU (test): 40.3; mIoU (val): 40.0
semantic-segmentation-on-imagenet-s | MAE (ViT-B/16, 224x224, SSL+FT, mmseg) | mIoU (test): 61.2; mIoU (val): 61.6
