Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation
Cho Suhwan ; Lee Minhyeok ; Lee Jungho ; Cho MyeongAh ; Park Seungwook ; Kim Jaeyeob ; Jang Hyunsung ; Lee Sangyoun

Abstract
Unsupervised video object segmentation aims to detect the most salient object in a video without any external guidance regarding the object. Salient objects often exhibit distinctive movements compared to the background, and recent methods leverage this by combining motion cues from optical flow maps with appearance cues from RGB images. However, because optical flow maps are often closely correlated with segmentation masks, networks can become overly dependent on motion cues during training, leading to vulnerability when faced with confusing motion cues and resulting in unstable predictions. To address this challenge, we propose a novel motion-as-option network that treats motion cues as an optional component rather than a necessity. During training, we randomly input RGB images into the motion encoder instead of optical flow maps, which implicitly reduces the network's reliance on motion cues. This design ensures that the motion encoder is capable of processing both RGB images and optical flow maps, leading to two distinct predictions depending on the type of input provided. To make the most of this flexibility, we introduce an adaptive output selection algorithm that determines the optimal prediction during testing.
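The two ideas in the abstract, random input substitution during training and choosing between two predictions at test time, can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the function names, the `p_rgb` probability (the abstract gives no value), and especially the confidence heuristic in `select_output`, which is a stand-in and not the selection criterion used by the authors.

```python
import random


def pick_motion_input(rgb_frame, flow_map, p_rgb=0.5, rng=random):
    """Choose what the motion encoder receives for one training sample.

    p_rgb is a hypothetical hyperparameter: the probability of feeding the
    RGB frame to the motion encoder in place of the optical flow map.
    """
    if rng.random() < p_rgb:
        return rgb_frame, "rgb"
    return flow_map, "flow"


def mean_confidence(mask_probs):
    """Average distance of per-pixel foreground probabilities from 0.5,
    scaled to [0, 1]. mask_probs is a list of rows of floats in [0, 1]."""
    values = [p for row in mask_probs for p in row]
    return sum(abs(p - 0.5) for p in values) * 2 / len(values)


def select_output(pred_with_flow, pred_with_rgb):
    """Pick one of the two test-time predictions.

    Stand-in criterion (an assumption, not the paper's algorithm): keep the
    prediction whose mask probabilities are, on average, further from 0.5.
    """
    if mean_confidence(pred_with_flow) >= mean_confidence(pred_with_rgb):
        return pred_with_flow, "flow"
    return pred_with_rgb, "rgb"
```

In a training loop, `pick_motion_input` would be called once per sample before the forward pass, so the same motion encoder sees both input types. At test time the network would be run twice, once with the flow map and once with the RGB frame as the motion-encoder input, and `select_output` would decide which mask to keep.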
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| unsupervised-video-object-segmentation-on-11 | TMO++ (MiT-b1) | J: 83.2 |
| unsupervised-video-object-segmentation-on-11 | TMO++ (RN-101) | J: 81.2 |
| unsupervised-video-object-segmentation-on-12 | TMO++ (RN-101) | J: 73.1 |
| unsupervised-video-object-segmentation-on-12 | TMO++ (MiT-b1, MS) | J: 73.5 |
| unsupervised-video-object-segmentation-on-12 | TMO++ (MiT-b1) | J: 73.0 |