3D Random Occlusion and Multi-Layer Projection for Deep Multi-Camera Pedestrian Localization
Qiu Rui; Xu Ming; Yan Yuyao; Smith Jeremy S.; Yang Xi

Abstract
Although deep-learning based methods for monocular pedestrian detection have made great progress, they are still vulnerable to heavy occlusions. Using multi-view information fusion is a potential solution but has limited applications, due to the lack of annotated training samples in existing multi-view datasets, which increases the risk of overfitting. To address this problem, a data augmentation method is proposed to randomly generate 3D cylinder occlusions on the ground plane, which are of the average size of pedestrians and projected to multiple views, to relieve the impact of overfitting in the training. Moreover, the feature map of each view is projected to multiple parallel planes at different heights by using homographies, which allows the CNNs to fully utilize the features across the height of each pedestrian to infer the locations of pedestrians on the ground plane. The proposed 3DROM method has a greatly improved performance in comparison with the state-of-the-art deep-learning based methods for multi-view pedestrian detection.
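The two geometric ideas in the abstract can be sketched with plain NumPy. This is a minimal illustration, not the authors' implementation: it assumes each camera is described by a known 3x4 projection matrix `P` with the world Z axis pointing up from the ground plane, and the function names (`plane_homography`, `project_points`, `random_cylinder`, `occlude`), the cylinder radius of 0.3 m and height of 1.7 m, and the bounding-box fill are all hypothetical choices made for the example.

```python
import numpy as np


def plane_homography(P, z):
    """Homography mapping (X, Y, 1) on the plane Z = z to image pixels.

    Substituting Z = z into P @ [X, Y, Z, 1] gives
    H_z = [p1, p2, z * p3 + p4], where p_i are the columns of P.
    """
    P = np.asarray(P, dtype=float)
    return np.column_stack([P[:, 0], P[:, 1], z * P[:, 2] + P[:, 3]])


def project_points(P, pts_3d):
    """Project an (N, 3) array of world points to (N, 2) pixel coordinates.

    Assumes every point lies in front of the camera.
    """
    pts_h = np.hstack([pts_3d, np.ones((len(pts_3d), 1))])
    uvw = pts_h @ np.asarray(P, dtype=float).T
    return uvw[:, :2] / uvw[:, 2:3]


def random_cylinder(rng, x_range, y_range, radius=0.3, height=1.7, n=16):
    """Sample a cylinder standing on the ground plane.

    Returns (2 * n, 3) points on its bottom and top rims. The default
    radius and height are assumed pedestrian-sized values in metres.
    """
    cx = rng.uniform(*x_range)
    cy = rng.uniform(*y_range)
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    ring = np.stack([cx + radius * np.cos(theta),
                     cy + radius * np.sin(theta)], axis=1)
    bottom = np.hstack([ring, np.zeros((n, 1))])
    top = np.hstack([ring, np.full((n, 1), height)])
    return np.vstack([bottom, top])


def occlude(image, P, cylinder_pts, fill=0):
    """Return a copy of `image` with the projected cylinder blanked out.

    Simplification: the bounding box of the projected rim points is
    filled, rather than the exact silhouette of the cylinder.
    """
    uv = project_points(P, cylinder_pts)
    h, w = image.shape[:2]
    u0, v0 = np.floor(uv.min(axis=0)).astype(int)
    u1, v1 = np.ceil(uv.max(axis=0)).astype(int)
    u0, u1 = np.clip([u0, u1], 0, w)
    v0, v1 = np.clip([v0, v1], 0, h)
    out = image.copy()
    out[v0:v1, u0:u1] = fill
    return out
```

Calling `occlude` with the same cylinder and each camera's own `P` yields occlusions that are consistent across views, and evaluating `plane_homography` for several values of `z` gives one homography per parallel plane. To warp a feature map rather than individual points, the homography would also need to account for the feature-map stride and the ground-plane grid resolution, which are omitted here.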
Benchmarks
| Benchmark | Methodology | Metrics |
|---|---|---|
| multiview-detection-on-citystreet | 3DROM | F1_score (2m): 79.2; MODA (2m): 60.0; MODP (2m): 70.1; Precision (2m): 82.5; Recall (2m): 76.2 |
| multiview-detection-on-cvcs | 3DROM | F1_score (1m): 55.1; MODA (1m): 33.9; MODP (1m): 73.9; Precision (1m): 79.5; Recall (1m): 42.2 |
| multiview-detection-on-multiviewx | 3DROM | MODA: 90.0; MODP: 83.7 |
| multiview-detection-on-wildtrack | 3DROM | MODA: 93.5; MODP: 75.9; Recall: 96.2 |
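For readers unfamiliar with the metric names in the table, the count-based ones can be sketched as follows. This assumes the standard CLEAR definition of MODA with unit costs, and that detections have already been matched to ground truth within a distance threshold (presumably what the "1m" and "2m" suffixes denote). The function name `detection_metrics` and the counts in the usage note are hypothetical, and the exact evaluation protocol behind the numbers above may differ. MODP is omitted because it depends on per-match localization error rather than on counts.

```python
def detection_metrics(tp, fp, fn):
    """Count-based detection metrics from matched detections.

    tp: detections matched to a ground-truth pedestrian
    fp: unmatched detections (false positives)
    fn: unmatched ground-truth pedestrians (misses)
    """
    gt = tp + fn
    if gt == 0 or tp + fp == 0:
        raise ValueError("need at least one ground truth and one detection")
    precision = tp / (tp + fp)
    recall = tp / gt
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    # MODA penalizes misses and false positives relative to the
    # number of ground-truth objects, so it can be negative.
    moda = 1 - (fp + fn) / gt
    return {"precision": precision, "recall": recall, "f1": f1, "moda": moda}
```

For example, `detection_metrics(tp=80, fp=10, fn=20)` describes 100 ground-truth pedestrians with 20 misses and 10 false positives, giving a recall of 0.8 and a MODA of 0.7.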