5 months ago

Mahalanobis Distance-based Multi-view Optimal Transport for Multi-view Crowd Localization

Zhang Qi ; Zhang Kaiyi ; Chan Antoni B. ; Huang Hui

Abstract

Multi-view crowd localization predicts the ground locations of all people in the scene. Typical methods usually estimate the crowd density maps on the ground plane first, and then obtain the crowd locations. However, the performance of existing methods is limited by the ambiguity of the density maps in crowded areas, where local peaks can be smoothed away. To mitigate the weakness of density map supervision, optimal transport-based point supervision methods have been proposed in the single-image crowd localization tasks, but have not been explored for multi-view crowd localization yet. Thus, in this paper, we propose a novel Mahalanobis distance-based multi-view optimal transport (M-MVOT) loss specifically designed for multi-view crowd localization. First, we replace the Euclidean-based transport cost with the Mahalanobis distance, which defines elliptical iso-contours in the cost function whose long-axis and short-axis directions are guided by the view ray direction. Second, the object-to-camera distance in each view is used to adjust the optimal transport cost of each location further, where the wrong predictions far away from the camera are more heavily penalized. Finally, we propose a strategy to consider all the input camera views in the model loss (M-MVOT) by computing the optimal transport cost for each ground-truth point based on its closest camera. Experiments demonstrate the advantage of the proposed method over density map-based or common Euclidean distance-based optimal transport loss on several multi-view crowd localization datasets. Project page: https://vcc.tech/research/2024/MVOT.
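The three ingredients named in the abstract (a ray-aligned Mahalanobis cost, a distance-dependent penalty, and closest-camera selection) can be illustrated with a rough sketch. The abstract gives no formulas, so everything specific here is an assumption rather than the authors' implementation: the function name `view_aware_cost`, the choice to put the long axis of the ellipse along the view ray, the axis scales `sigma_long` and `sigma_short`, and the linear distance weight `1 + alpha * d` are all hypothetical. The resulting matrix is only the transport cost; an optimal transport solver would still be needed on top of it.

```python
import numpy as np

def view_aware_cost(pred, gt, cams, sigma_long=2.0, sigma_short=1.0, alpha=0.1):
    """Hypothetical view-aware transport cost on the ground plane.

    pred: (N, 2) predicted locations
    gt:   (M, 2) ground-truth locations
    cams: (C, 2) camera positions projected onto the ground plane
    Returns an (N, M) cost matrix.
    """
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    cams = np.asarray(cams, dtype=float)

    # Pick the closest camera for every ground-truth point.
    dists = np.linalg.norm(gt[:, None, :] - cams[None, :, :], axis=-1)  # (M, C)
    nearest = dists.argmin(axis=1)                                      # (M,)
    cam_dist = dists[np.arange(len(gt)), nearest]                       # (M,)

    # Unit view-ray direction (camera -> point) and its in-plane normal.
    ray = (gt - cams[nearest]) / np.maximum(cam_dist[:, None], 1e-8)    # (M, 2)
    normal = np.stack([-ray[:, 1], ray[:, 0]], axis=1)                  # (M, 2)

    # Decompose each prediction error into along-ray and across-ray parts.
    diff = pred[:, None, :] - gt[None, :, :]                            # (N, M, 2)
    along = np.einsum("nmk,mk->nm", diff, ray)
    across = np.einsum("nmk,mk->nm", diff, normal)

    # Mahalanobis distance with an axis-aligned covariance in the ray frame.
    # ASSUMPTION: the long axis (larger sigma) lies along the view ray.
    maha = np.sqrt((along / sigma_long) ** 2 + (across / sigma_short) ** 2)

    # ASSUMPTION: linear weight so errors far from the camera cost more.
    return maha * (1.0 + alpha * cam_dist)[None, :]


# One camera at the origin, one person 10 units away along the x-axis.
cams = [[0.0, 0.0]]
gt = [[10.0, 0.0]]
pred = [[12.0, 0.0],   # 2 units off along the view ray
        [10.0, 2.0]]   # 2 units off across the view ray
cost = view_aware_cost(pred, gt, cams)
print(cost)
```

Under these assumed parameters, the along-ray error is cheaper than an across-ray error of the same Euclidean length, which is what makes the iso-contours of the cost elliptical rather than circular. With `sigma_long == sigma_short` and `alpha == 0`, the cost reduces to a scaled Euclidean distance.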

Benchmarks

| Benchmark | Methodology | Metrics |
| --- | --- | --- |
| multiview-detection-on-cvcs | M-MVOT | MODA (0.5m): 43.5; MODA (1m): / |
| multiview-detection-on-multiviewx | M-MVOT | MODA: 96.7; MODP: 86.1; Recall: 97.9 |
| multiview-detection-on-wildtrack | M-MVOT | MODA: 92.1 |
