Add model card and metadata

#2 · opened by nielsr (HF Staff)
Files changed (1)
  1. README.md +27 -3
README.md CHANGED
@@ -1,3 +1,27 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: image-feature-extraction
+ ---
+
+ # UniPR-3D: Towards Universal Visual Place Recognition with Visual Geometry Grounded Transformer
+
+ UniPR-3D is a universal visual place recognition (VPR) framework that effectively integrates information from multiple views. It supports both frame-to-frame and sequence-to-sequence matching by leveraging 3D and 2D tokens with tailored aggregation strategies.
+
+ - **Paper:** [UniPR-3D: Towards Universal Visual Place Recognition with Visual Geometry Grounded Transformer](https://huggingface.co/papers/2512.21078)
+ - **Repository:** [https://github.com/dtc111111/UniPR-3D](https://github.com/dtc111111/UniPR-3D)
+
+ ## Description
+
+ UniPR-3D builds on a Visual Geometry Grounded Transformer (VGGT) backbone capable of encoding multi-view 3D representations. To construct its descriptor, the model jointly leverages 3D tokens and intermediate 2D tokens, using dedicated aggregation modules to capture fine-grained texture cues while reasoning across viewpoints. To further enhance generalization, it incorporates both single- and multi-frame aggregation schemes along with a variable-length sequence retrieval strategy. It achieves state-of-the-art performance on several benchmarks, including MSLS, Pittsburgh, NordLand, and SPED.
+
+ ## Citation
+
+ If you find our paper and code useful, please cite us:
+
+ ```bibtex
+ @inproceedings{deng2026_unipr3d,
+   title     = {UniPR-3D: Towards Universal Visual Place Recognition with 3D Visual Geometry Grounded Transformer},
+   author    = {Tianchen Deng and Xun Chen and Ziming Li and Hongming Shen and Danwei Wang and Javier Civera and Hesheng Wang},
+   booktitle = {Arxiv},
+   year      = {2026},
+ }
+ ```
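
As an illustrative aside (not part of the proposed model card), the frame-to-frame and sequence-to-sequence matching described above ultimately reduces to nearest-neighbor retrieval over global descriptors. The sketch below assumes descriptors have already been extracted by the model; the array shapes, the random placeholder data, and the `retrieve` helper are hypothetical, not part of the UniPR-3D API.

```python
import numpy as np

# Placeholder descriptors standing in for UniPR-3D outputs:
# one global vector per database image (or sequence) and per query.
rng = np.random.default_rng(0)
db = rng.standard_normal((100, 256)).astype(np.float32)    # database descriptors
query = rng.standard_normal((1, 256)).astype(np.float32)   # query descriptor

def retrieve(query, db, k=5):
    """Return indices of the top-k database entries by cosine similarity."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    d = db / np.linalg.norm(db, axis=1, keepdims=True)
    sims = q @ d.T                       # (n_queries, n_db) cosine similarities
    return np.argsort(-sims, axis=1)[:, :k]

top5 = retrieve(query, db)
print(top5.shape)  # (1, 5)
```

A retrieved index is then declared a correct place match if it falls within some distance threshold of the query's ground-truth location, which is how benchmarks such as MSLS and Pittsburgh score recall.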