ReCo: Retrieve and Co-segment for Zero-shot Transfer

Gyungin Shin; Weidi Xie; Samuel Albanie

Abstract

Semantic segmentation has a broad range of applications, but its real-world impact has been significantly limited by the prohibitive annotation costs necessary to enable deployment. Segmentation methods that forgo supervision can side-step these costs, but inconveniently require labelled examples from the target distribution in order to assign concept names to predictions. An alternative line of work in language-image pre-training has recently demonstrated the potential to produce models that can both assign names across large vocabularies of concepts and enable zero-shot transfer for classification, but these models do not demonstrate commensurate segmentation abilities. In this work, we strive to achieve a synthesis of these two approaches that combines their strengths. We leverage the retrieval abilities of one such language-image pre-trained model, CLIP, to dynamically curate training sets from unlabelled images for arbitrary collections of concept names, and leverage the robust correspondences offered by modern image representations to co-segment entities among the resulting collections. The synthetic segment collections are then employed to construct a segmentation model (without requiring pixel labels) whose knowledge of concepts is inherited from the scalable pre-training process of CLIP. We demonstrate that our approach, termed Retrieve and Co-segment (ReCo), performs favourably compared to unsupervised segmentation approaches while inheriting the convenience of nameable predictions and zero-shot transfer. We also demonstrate ReCo's ability to generate specialist segmenters for extremely rare objects.
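
The retrieval step described in the abstract (curating a per-concept image set from unlabelled images) can be illustrated with a minimal sketch. It assumes that image embeddings and a text embedding for the concept name have already been computed by a language-image model such as CLIP; the toy vectors, the image ids, and the helper names `cosine_similarity` and `retrieve_top_k` are hypothetical and are not taken from the ReCo code base. The co-segmentation stage is not shown.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(text_embedding, image_embeddings, k):
    """Return the ids of the k images whose embeddings best match the text embedding."""
    scored = [
        (cosine_similarity(text_embedding, embedding), image_id)
        for image_id, embedding in image_embeddings.items()
    ]
    # Highest similarity first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:k]]


# Toy stand-ins for precomputed embeddings (assumption: real ones would come
# from the image and text encoders of a pre-trained model).
image_embeddings = {
    "img_a": [0.9, 0.1, 0.0],
    "img_b": [0.1, 0.9, 0.0],
    "img_c": [0.8, 0.3, 0.1],
    "img_d": [0.0, 0.2, 0.9],
}
concept_embedding = [1.0, 0.0, 0.0]  # hypothetical embedding of a concept name

archive = retrieve_top_k(concept_embedding, image_embeddings, k=2)
print(archive)
```

In this sketch the retrieved ids form the per-concept collection that a later co-segmentation stage would operate on; the choice of `k` is left as a free parameter here.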

Code Repositories

NoelShin/reco (Official; pytorch; Mentioned in GitHub)
noelshin/namedmask (pytorch; Mentioned in GitHub)

Benchmarks

Benchmark                                    Methodology  Metrics
unsupervised-semantic-segmentation-with-1    ReCo+        mIoU: 32.6; pixel accuracy: 54.1
unsupervised-semantic-segmentation-with-1    ReCo         mIoU: 26.3; pixel accuracy: 46.1
unsupervised-semantic-segmentation-with-10   ReCo         mIoU: 15.7
unsupervised-semantic-segmentation-with-2    ReCo         mIoU: 29.8; pixel accuracy: 70.6
unsupervised-semantic-segmentation-with-2    ReCo+        mIoU: 31.9; pixel accuracy: 75.3
unsupervised-semantic-segmentation-with-3    ReCo+        mIoU: 24.2; pixel accuracy: 83.7
unsupervised-semantic-segmentation-with-3    ReCo         mIoU: 19.3; pixel accuracy: 74.6
unsupervised-semantic-segmentation-with-4    ReCo         Mean IoU (val): 11.2
unsupervised-semantic-segmentation-with-7    ReCo         mIoU: 57.7
unsupervised-semantic-segmentation-with-8    ReCo         mIoU: 22.3
unsupervised-semantic-segmentation-with-9    ReCo         mIoU: 14.8
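
For readers unfamiliar with the two metrics reported above, the following minimal sketch computes pixel accuracy and mean IoU from flattened label lists using their standard definitions. The evaluation protocol of each benchmark (for example, ignored labels or how absent classes are treated) is not stated on this page, so the choice below to skip classes that appear in neither the prediction nor the ground truth is an assumption, and the function names are illustrative.

```python
def confusion_matrix(pred, target, num_classes):
    """Count pixels: rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """Average of per-class intersection-over-union, TP / (TP + FP + FN)."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp
        fp = sum(matrix[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        if denom > 0:  # assumption: skip classes absent from both pred and target
            ious.append(tp / denom)
    return sum(ious) / len(ious)


# Toy example with three classes and six pixels.
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
matrix = confusion_matrix(pred, target, num_classes=3)
print(pixel_accuracy(matrix), mean_iou(matrix))
```

In the toy example, four of six pixels are correct, and the per-class IoUs are 1/3, 2/3 and 1/2. The listing above reports both metrics as percentages.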