Qifeng Chen, Zhuwen Li, Vladlen Koltun

Abstract
Interactive image segmentation is characterized by multimodality. When the user clicks on a door, do they intend to select the door or the whole house? We present an end-to-end learning approach to interactive image segmentation that tackles this ambiguity. Our architecture couples two convolutional networks. The first is trained to synthesize a diverse set of plausible segmentations that conform to the user's input. The second is trained to select among these. By selecting a single solution, our approach retains compatibility with existing interactive segmentation interfaces. By synthesizing multiple diverse solutions before selecting one, the architecture is given the representational power to explore the multimodal solution space. We show that the proposed approach outperforms existing methods for interactive image segmentation, including prior work that applied convolutional networks to this problem, while being much faster.
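The two-stage design described above — one network proposing diverse candidate segmentations, a second network choosing among them — can be illustrated with a minimal sketch. In the paper both stages are convolutional networks trained end to end; here a hypothetical disk-growing heuristic and a scoring callback stand in for the learned components, purely to show the pipeline structure:

```python
import numpy as np

def synthesize_candidates(image, click, num_candidates=6):
    """Stand-in for the first network: produce several diverse
    candidate masks consistent with the user's click.
    Here we simply grow disks of increasing radius around the click
    (a placeholder, NOT the paper's learned synthesis network)."""
    h, w = image.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    dist = np.sqrt((ys - click[0]) ** 2 + (xs - click[1]) ** 2)
    radii = np.linspace(5, max(h, w) / 2, num_candidates)
    return [(dist <= r).astype(np.uint8) for r in radii]

def select_candidate(candidates, score_fn):
    """Stand-in for the second network: score each candidate mask
    and return the single best one, so the interface still emits
    exactly one segmentation."""
    scores = [score_fn(m) for m in candidates]
    return candidates[int(np.argmax(scores))]
```

Because only one mask is ultimately returned, this structure stays compatible with standard interactive segmentation interfaces while still exploring the multimodal solution space internally.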
Benchmarks
| Benchmark | Methodology | Metrics (NoC@k: clicks to reach k% IoU) |
|---|---|---|
| interactive-segmentation-on-davis | Latent diversity | NoC@85: 5.05, NoC@90: 9.57 |
| interactive-segmentation-on-grabcut | Latent diversity | NoC@85: 3.20, NoC@90: 4.79 |
| interactive-segmentation-on-sbd | Latent diversity | NoC@85: 7.41, NoC@90: 10.78 |
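The NoC@k metric used in the table counts how many user clicks are needed before the predicted mask first reaches k% IoU with the ground truth, commonly capped at 20 clicks. A minimal sketch, assuming a per-click IoU trajectory is available:

```python
def noc_at_k(iou_per_click, k, max_clicks=20):
    """Number of clicks until IoU first reaches threshold k.
    iou_per_click[i] is the IoU after click i+1; returns
    max_clicks if the threshold is never reached."""
    for i, iou in enumerate(iou_per_click[:max_clicks], start=1):
        if iou >= k:
            return i
    return max_clicks
```

Lower is better: NoC@90 of 4.79 on GrabCut means the method needs fewer than five clicks on average to reach 90% IoU.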