Pierre-François De Plaen, Nicola Marinello, Marc Proesmans, Tinne Tuytelaars, Luc Van Gool

Abstract
The DEtection TRansformer (DETR) opened new possibilities for object detection by modeling it as a translation task: converting image features into object-level representations. Previous works typically add expensive modules to DETR to perform Multi-Object Tracking (MOT), resulting in more complicated architectures. We instead show how DETR can be turned into a MOT model by employing an instance-level contrastive loss, a revised sampling strategy and a lightweight assignment method. Our training scheme learns object appearances while preserving detection capabilities and with little overhead. Its performance surpasses the previous state-of-the-art by +2.6 mMOTA on the challenging BDD100K dataset and is comparable to existing transformer-based methods on the MOT17 dataset.
Benchmarks
| Benchmark | Method | Metrics |
|---|---|---|
| multi-object-tracking-on-mot17 | ContrasTR | HOTA: 58.9, IDF1: 71.8, MOTA: 73.7 |
| multiple-object-tracking-on-bdd100k-test-1 | ContrasTR | mHOTA: 46.1, mIDF1: 56.5, mMOTA: 42.8 |
| multiple-object-tracking-on-bdd100k-val | ContrasTR | AssocA: -, TETA: -, mIDF1: 52.9, mMOTA: 41.7 |