EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition

Abstract
Visual Place Recognition (VPR) is the task of predicting the location of a query image by matching it against a database of geo-tagged images. Recent VPR studies have highlighted the significant advantage of employing pre-trained foundation models such as DINOv2. However, these models are often deemed inadequate for VPR without further fine-tuning on VPR-specific data. In this paper, we present an effective approach to harness the potential of a foundation model for VPR. We show that features extracted from self-attention layers can act as a powerful re-ranker for VPR, even in a zero-shot setting. Our method not only outperforms previous zero-shot approaches but also achieves results competitive with several supervised methods. We then show that a single-stage approach utilizing internal ViT layers for pooling can produce global features that achieve state-of-the-art performance, with impressive feature compactness down to 128 dimensions. Moreover, integrating our local foundation features for re-ranking further widens this performance gap. Our method also demonstrates exceptional robustness and generalization, setting a new state of the art while handling challenging conditions such as occlusion, day-night transitions, and seasonal variations.
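The retrieve-then-re-rank pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the matching rule, so this example assumes cosine similarity for the global retrieval stage and a count of mutual nearest neighbors between local descriptors as the re-ranking score. The function names (`retrieve`, `mutual_nn_count`, `rerank`) are hypothetical, and short toy vectors stand in for the ViT features.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_global, db_globals, top_k):
    """Stage 1: rank database images by global-descriptor similarity, keep the top_k indices."""
    scores = [cosine(query_global, g) for g in db_globals]
    order = sorted(range(len(db_globals)), key=lambda i: scores[i], reverse=True)
    return order[:top_k]


def mutual_nn_count(local_a, local_b):
    """Count pairs (i, j) where a_i's nearest neighbor in B is b_j and vice versa.

    Assumption: this count is used as the re-ranking score; the abstract
    only says that local features act as a re-ranker.
    """
    if not local_a or not local_b:
        return 0
    best_ab = [max(range(len(local_b)), key=lambda j: cosine(a, local_b[j])) for a in local_a]
    best_ba = [max(range(len(local_a)), key=lambda i: cosine(local_a[i], b)) for b in local_b]
    return sum(1 for i, j in enumerate(best_ab) if best_ba[j] == i)


def rerank(query_locals, db_locals, candidates):
    """Stage 2: reorder the candidates by local-feature match count (stable on ties)."""
    counts = {c: mutual_nn_count(query_locals, db_locals[c]) for c in candidates}
    return sorted(candidates, key=lambda c: counts[c], reverse=True)


# Toy data: three database images with 2-D global and local descriptors.
db_globals = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
db_locals = [
    [[1.0, 0.2]],              # image 0: one local feature
    [[1.0, 0.0], [0.0, 1.0]],  # image 1: two local features
    [[0.0, 1.0]],              # image 2
]
query_global = [1.0, 0.0]
query_locals = [[1.0, 0.0], [0.0, 1.0]]

candidates = retrieve(query_global, db_globals, top_k=2)
final_ranking = rerank(query_locals, db_locals, candidates)
```

In this toy case the global stage prefers image 0, while the local stage promotes image 1 because it has more mutual matches with the query. This is the kind of correction a re-ranker is meant to provide.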
Benchmarks
| Benchmark | Method | Recall@1 (%) | Recall@5 (%) | Recall@10 (%) |
|---|---|---|---|---|
| visual-place-recognition-on-amstertime | EffoVPR | 65.5 | - | - |
| visual-place-recognition-on-eynsham | EffoVPR | 91.0 | - | - |
| visual-place-recognition-on-mapillary-test | EffoVPR | 79.0 | 89.0 | 91.6 |
| visual-place-recognition-on-mapillary-val | EffoVPR | 92.8 | 97.2 | 97.4 |
| visual-place-recognition-on-nordland | EffoVPR | 95.0 | 98.6 | - |
| visual-place-recognition-on-pittsburgh-30k | EffoVPR | 93.9 | 97.4 | - |
| visual-place-recognition-on-san-francisco | EffoVPR | 93.0 | - | - |
| visual-place-recognition-on-sf-xl-test-v1 | EffoVPR | 95.5 | - | 98.1 |
| visual-place-recognition-on-sf-xl-test-v2 | EffoVPR | 94.5 | 98.2 | 97.8 |
| visual-place-recognition-on-st-lucia | EffoVPR | 100.0 | 100.0 | - |
| visual-place-recognition-on-tokyo247 | EffoVPR | 98.7 | 98.7 | 98.7 |
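For readers unfamiliar with the metric, Recall@K is the percentage of queries for which at least one correct database image appears among the top K retrieved results. The sketch below shows how it is commonly computed. The function name `recall_at_k` and the toy inputs are hypothetical. The set of correct images per query is taken as given, since each benchmark defines it by its own protocol (typically a distance threshold around the query location).

```python
def recall_at_k(ranked_ids, positives, ks=(1, 5, 10)):
    """Compute Recall@K as a percentage for each K in ks.

    ranked_ids[q] : database ids retrieved for query q, best match first.
    positives[q]  : set of database ids counted as correct for query q
                    (assumed to be supplied by the benchmark protocol).
    """
    if len(ranked_ids) != len(positives):
        raise ValueError("ranked_ids and positives must have the same length")
    if not ranked_ids:
        raise ValueError("at least one query is required")

    results = {}
    for k in ks:
        # A query is a hit if any of its top-k results is a correct match.
        hits = sum(
            1
            for ranked, correct in zip(ranked_ids, positives)
            if any(db_id in correct for db_id in ranked[:k])
        )
        results[k] = 100.0 * hits / len(ranked_ids)
    return results


# Toy example: three queries with three retrieved results each.
ranked = [[3, 1, 2], [0, 4, 5], [7, 8, 9]]
correct = [{3}, {5}, {6}]
scores = recall_at_k(ranked, correct, ks=(1, 3))
```

Because the top-K list always contains the top-(K-1) list, Recall@K computed this way can never decrease as K grows.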