Zeng Fangao; Dong Bin; Zhang Yuang; Wang Tiancai; Zhang Xiangyu; Wei Yichen

Abstract
Temporal modeling of objects is a key challenge in multiple object tracking (MOT). Existing methods track by associating detections through motion-based and appearance-based similarity heuristics. The post-processing nature of association prevents end-to-end exploitation of temporal variations in the video sequence. In this paper, we propose MOTR, which extends DETR and introduces track queries to model the tracked instances across the entire video. Each track query is transferred and updated frame by frame to perform iterative prediction over time. We propose tracklet-aware label assignment to train track queries and newborn object queries. We further propose a temporal aggregation network and a collective average loss to enhance temporal relation modeling. Experimental results on DanceTrack show that MOTR significantly outperforms the state-of-the-art method, ByteTrack, by 6.5% on the HOTA metric. On MOT17, MOTR outperforms our concurrent works, TrackFormer and TransTrack, on association performance. MOTR can serve as a stronger baseline for future research on temporal modeling and Transformer-based trackers. Code is available at https://github.com/megvii-research/MOTR.
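The frame-by-frame hand-off of track queries described in the abstract can be illustrated with a small toy loop. This is only a sketch under stated assumptions, not the MOTR implementation: `Query`, `track_video`, `toy_decode`, and the `keep_thresh` parameter are hypothetical names, the scalar `state` stands in for a learned embedding, and a single confidence threshold replaces whatever track entrance and exit rules the actual code uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Query:
    track_id: int  # -1 marks a detect query that has no identity yet
    state: float   # scalar stand-in for a learned query embedding


def track_video(frames, decode, num_detect_queries=2, keep_thresh=0.5):
    """Carry surviving queries from one frame to the next (toy sketch)."""
    track_queries: List[Query] = []
    next_id = 0
    results = []
    for frame in frames:
        # Fresh detect queries are appended to the carried-over track queries.
        detect_queries = [Query(-1, 0.0) for _ in range(num_detect_queries)]
        queries = track_queries + detect_queries
        outputs = decode(frame, queries)  # one (state, score) pair per query
        survivors = []
        for query, (state, score) in zip(queries, outputs):
            if score < keep_thresh:
                continue  # ended track or unused detect query
            if query.track_id < 0:
                # A confident detect query becomes a newborn track.
                survivors.append(Query(next_id, state))
                next_id += 1
            else:
                # An existing track keeps its identity with an updated state.
                survivors.append(Query(query.track_id, state))
        track_queries = survivors
        results.append([q.track_id for q in survivors])
    return results


def toy_decode(frame, queries):
    """Hypothetical decoder: a frame is a list of visible object labels."""
    claimed = {q.state for q in queries if q.track_id >= 0}
    unclaimed = [obj for obj in frame if obj not in claimed]
    outputs = []
    for q in queries:
        if q.track_id >= 0:
            outputs.append((q.state, 1.0 if q.state in frame else 0.0))
        elif unclaimed:
            outputs.append((unclaimed.pop(0), 1.0))
        else:
            outputs.append((0.0, 0.0))
    return outputs


frames = [[1.0], [1.0, 2.0], [2.0]]
ids_per_frame = track_video(frames, toy_decode)
print(ids_per_frame)  # [[0], [0, 1], [1]]
```

In this toy run, the object labelled 1.0 keeps identity 0 until it leaves the scene, and the object labelled 2.0 is born in the second frame as identity 1. No separate association step runs after decoding, which is the property the abstract emphasizes.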
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| multi-object-tracking-on-dancetrack | MOTR | AssA: 40.2; DetA: 73.5; HOTA: 54.2; IDF1: 51.5; MOTA: 79.7 |
| multi-object-tracking-on-mot16 | MOTR | IDF1: 67.0; MOTA: 66.8 |
| multi-object-tracking-on-mot17 | MOTR | IDF1: 67.0; MOTA: 67.4; e2e-MOT: Yes |
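As a rough sanity check on the DanceTrack row, HOTA at a single localization threshold is defined as the geometric mean of DetA and AssA. Reported scores are averaged over many thresholds, so applying the formula to the averaged DetA and AssA is only an approximation. The helper name `hota_geometric_mean` below is hypothetical.

```python
import math


def hota_geometric_mean(det_a: float, ass_a: float) -> float:
    """Geometric mean of detection and association accuracy (both in percent)."""
    return math.sqrt(det_a * ass_a)


# DetA and AssA taken from the DanceTrack row of the table.
approx_hota = hota_geometric_mean(73.5, 40.2)
print(f"{approx_hota:.1f}")
```

The result is about 54.4, close to the reported HOTA of 54.2. The small gap is expected because the reported value averages per-threshold scores rather than combining the averaged DetA and AssA.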