Improve model card: Add pipeline tag, links, and usage

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +66 -3
README.md CHANGED
@@ -1,3 +1,66 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ license: apache-2.0
3
+ pipeline_tag: image-to-3d
4
+ ---
5
+
6
+ # WinT3R: Window-Based Streaming Reconstruction With Camera Token Pool
7
+
8
+ This repository contains the official implementation of **WinT3R**, a feed-forward model that infers precise camera poses and high-quality point maps from image streams in an online manner.
9
+
10
+ [📚 Paper](https://arxiv.org/abs/2509.05296) - [🌐 Project Page](https://lizizun.github.io/WinT3R.github.io/) - [💻 Code](https://github.com/LiZizun/WinT3R)
11
+
12
+ <p align="center">
13
+ <img src="https://github.com/LiZizun/WinT3R/raw/main/assets/teaser.jpg" width="100%">
14
+ </p>
15
+
16
+ ## Overview
17
+ WinT3R addresses the trade-off between reconstruction quality and real-time performance by introducing:
18
+ 1. **An online window mechanism:** Ensures sufficient interaction of image tokens within the same window and across adjacent windows.
19
+ 2. **A camera token pool:** Functions as a lightweight "global memory" to improve camera pose prediction from a global perspective.
20
+
21
+ These designs enable WinT3R to achieve state-of-the-art performance in online 3D reconstruction and camera pose estimation, with the fastest reconstruction speed to date.
22
+
23
+ ## Installation
24
+
25
+ ```bash
26
+ conda create -n WinT3R python=3.10
27
+ conda activate WinT3R
28
+ pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118  # replace cu118 with the CUDA version installed on your system
29
+ pip install -r requirements.txt  # run from the root of the cloned WinT3R repository
30
+ ```
31
+
32
+ ## Checkpoints
33
+ Download the checkpoint from [Hugging Face](https://huggingface.co/lizizun/WinT3R/resolve/main/pytorch_model.bin) and save it as `checkpoints/pytorch_model.bin`.
34
+
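The download step above could be scripted roughly as follows. This is a minimal sketch, assuming `curl` is installed and the command is run from the repository root; `fetch_checkpoint` is a hypothetical helper name used for illustration and is not part of the WinT3R repository.

```bash
# Hypothetical helper (not part of WinT3R): fetch the checkpoint once.
# The URL and default target path are the ones given in the Checkpoints section.
fetch_checkpoint() {
    local url="https://huggingface.co/lizizun/WinT3R/resolve/main/pytorch_model.bin"
    local dest="${1:-checkpoints/pytorch_model.bin}"

    # Create the target directory if it does not exist yet
    mkdir -p "$(dirname "$dest")"

    # Skip the download when the file is already in place
    if [ -f "$dest" ]; then
        echo "checkpoint already present: $dest"
        return 0
    fi

    # -L follows redirects; --fail returns a non-zero status on HTTP errors
    curl -L --fail -o "$dest" "$url"
}
```

Calling `fetch_checkpoint` with no arguments writes to `checkpoints/pytorch_model.bin`; passing a path as the first argument overrides the destination.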
35
+ ## Sample Usage (Run Inference from Command Line)
36
+
37
+ ```bash
38
+ # Run with default example images
39
+ python recon.py
40
+
41
+ # Run on your own data
42
+ python recon.py --data_path <path/to/your/images_dir>
43
+ ```
44
+
45
+ ## Citation
46
+ If you find our work useful, please consider citing our paper:
47
+ ```bibtex
48
+ @misc{li2025wint3rwindowbasedstreamingreconstruction,
49
+ title={WinT3R: Window-Based Streaming Reconstruction with Camera Token Pool},
50
+ author={Zizun Li and Jianjun Zhou and Yifan Wang and Haoyu Guo and Wenzheng Chang and Yang Zhou and Haoyi Zhu and Junyi Chen and Chunhua Shen and Tong He},
51
+ year={2025},
52
+ eprint={2509.05296},
53
+ archivePrefix={arXiv},
54
+ primaryClass={cs.CV},
55
+ url={https://arxiv.org/abs/2509.05296},
56
+ }
57
+ ```
58
+
59
+ ## Acknowledgement
60
+ WinT3R builds on the following outstanding open-source projects. We are grateful to these projects and their communities, whose work has advanced the field and made ours possible.
61
+
62
+ - [DUSt3R](https://dust3r.europe.naverlabs.com/)
63
+ - [MASt3R](https://github.com/naver/mast3r)
64
+ - [CUT3R](https://github.com/CUT3R/CUT3R)
65
+ - [VGGT](https://github.com/facebookresearch/vggt)
66
+ - [Pi3](https://yyfz.github.io/pi3/)