Memory Aggregation Networks for Efficient Interactive Video Object Segmentation
Miao Jiaxu; Wei Yunchao; Yang Yi

Abstract
Interactive video object segmentation (iVOS) aims at efficiently harvesting high-quality segmentation masks of the target object in a video with user interactions. Most previous state-of-the-art methods tackle iVOS with two independent networks, one for handling user interaction and one for temporal propagation, which leads to inefficiencies during the inference stage. In this work, we propose a unified framework, named Memory Aggregation Networks (MA-Net), to address the challenging iVOS task in a more efficient way. Our MA-Net integrates the interaction and propagation operations into a single network, which significantly improves the efficiency of iVOS in the multi-round interaction scheme. More importantly, we propose a simple yet effective memory aggregation mechanism to record the informative knowledge from previous interaction rounds, greatly improving the robustness in discovering challenging objects of interest. We conduct extensive experiments on the validation set of the DAVIS Challenge 2018 benchmark. In particular, our MA-Net achieves a J@60 score of 76.1% without any bells and whistles, outperforming the state of the art by more than 2.7%.
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| interactive-video-object-segmentation-on | MA-Net | AUC-J: 0.749; J@60s: 0.761 |
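
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how they are commonly understood: J is the Jaccard index (intersection over union) between a predicted and a ground-truth mask, J@60s is the J-versus-time curve read off at 60 seconds, and AUC-J is the normalised area under that curve. The function names, the empty-mask convention, the normalisation by the covered time span, and the sample curve are all illustrative assumptions. This is not the official DAVIS evaluation code, and the numbers are not results from the paper.

```python
import numpy as np


def jaccard(pred, gt):
    """Region similarity J: intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def j_at_time(times, scores, t=60.0):
    """Linearly interpolate the J-versus-time curve at t seconds."""
    return float(np.interp(t, times, scores))


def auc_j(times, scores):
    """Trapezoidal area under the J-versus-time curve,
    normalised by the covered time span (an assumption of this sketch)."""
    times = np.asarray(times, dtype=float)
    scores = np.asarray(scores, dtype=float)
    area = np.sum(np.diff(times) * (scores[:-1] + scores[1:]) / 2.0)
    return float(area / (times[-1] - times[0]))


# Made-up curve: mean J after each interaction round versus elapsed seconds.
example_times = [0.0, 40.0, 80.0]
example_scores = [0.5, 0.7, 0.8]

j60 = j_at_time(example_times, example_scores)
auc = auc_j(example_times, example_scores)
```

Under this reading, a method that reaches a good mask sooner scores higher on both metrics, which is why inference efficiency across interaction rounds matters for the reported numbers.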