TResNet: High Performance GPU-Dedicated Architecture

Tal Ridnik Hussam Lawen Asaf Noy Emanuel Ben Baruch Gilad Sharir Itamar Friedman

Abstract

Many deep learning models developed in recent years reach higher ImageNet accuracy than ResNet50, with a lower or comparable FLOPs count. While FLOPs are often seen as a proxy for network efficiency, when measuring actual GPU training and inference throughput, vanilla ResNet50 is usually significantly faster than its recent competitors, offering a better throughput-accuracy trade-off. In this work, we introduce a series of architecture modifications that aim to boost neural networks' accuracy while retaining their GPU training and inference efficiency. We first demonstrate and discuss the bottlenecks induced by FLOPs optimizations. We then suggest alternative designs that better utilize GPU structure and assets. Finally, we introduce a new family of GPU-dedicated models, called TResNet, which achieve better accuracy and efficiency than previous ConvNets. Using a TResNet model with similar GPU throughput to ResNet50, we reach 80.8% top-1 accuracy on ImageNet. Our TResNet models also transfer well and achieve state-of-the-art accuracy on competitive single-label classification datasets such as Stanford Cars (96.0%), CIFAR-10 (99.0%), CIFAR-100 (91.5%) and Oxford-Flowers (99.1%). They also perform well on multi-label classification and object detection tasks. Implementation is available at: https://github.com/mrT23/TResNet.
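The distinction the abstract draws between a FLOPs count and measured throughput can be illustrated with a small sketch. The helper names below (`conv2d_macs`, `measure_throughput`, `run_batch`) are hypothetical and are not taken from the TResNet repository; the first computes the standard multiply-accumulate count of a convolution layer, while the second times an arbitrary workload and reports images per second. This is a minimal illustration under those assumptions, not the authors' benchmarking procedure.

```python
import time
from typing import Callable


def conv2d_macs(h_out: int, w_out: int, c_in: int, c_out: int,
                kernel: int, groups: int = 1) -> int:
    """Theoretical multiply-accumulate count of one 2D convolution layer.

    This is the kind of number a FLOPs-based comparison relies on; it says
    nothing about memory access cost or how well the layer maps to a GPU.
    """
    return h_out * w_out * (c_in // groups) * c_out * kernel * kernel


def measure_throughput(run_batch: Callable[[], None], batch_size: int,
                       warmup: int = 3, iters: int = 10) -> float:
    """Measured throughput in images per second for a batch workload.

    `run_batch` is a hypothetical callable that processes one batch.
    For a real GPU model it would need to wait for the device to finish
    (for example via torch.cuda.synchronize() in PyTorch) before returning,
    otherwise only the kernel launch time would be measured.
    """
    for _ in range(warmup):          # warm-up runs are excluded from timing
        run_batch()
    start = time.perf_counter()
    for _ in range(iters):
        run_batch()
    elapsed = time.perf_counter() - start
    return batch_size * iters / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Stand-in CPU workload; replace with a real forward pass to benchmark.
    def fake_batch() -> None:
        sum(i * i for i in range(50_000))

    print("MACs of a 3x3 conv, 64->64 channels, 56x56 output:",
          conv2d_macs(56, 56, 64, 64, 3))
    print("images/sec:", round(measure_throughput(fake_batch, batch_size=64), 1))
```

Comparing two models with such a harness means running both on the same hardware, batch size, and precision; two networks with a similar MAC count can still differ widely in the measured images per second.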

Code Repositories

rwightman/pytorch-image-models (Official, PyTorch, mentioned in GitHub)
mrT23/TResNet (Official, PyTorch, mentioned in GitHub)
Alibaba-MIIL/TResNet (PyTorch, mentioned in GitHub)

Benchmarks

Benchmark | Methodology | Metrics
fine-grained-image-classification-on-oxford | TResNet-L | Accuracy: 99.1%
image-classification-on-cifar-10 | TResNet-XL | Percentage correct: 99
image-classification-on-cifar-100 | TResNet-L-V2 | Percentage correct: 92.6
image-classification-on-flowers-102 | TResNet-L | Accuracy: 99.1%
image-classification-on-imagenet | TResNet-XL | Number of params: 77M; Top 1 Accuracy: 84.3%
