lizizun/WinT3R

Improve model card: Add pipeline tag, links, and usage

by nielsr HF Staff - opened Sep 8, 2025

base: refs/heads/main ← from: refs/pr/1


README.md +66 -3


@@ -1,3 +1,66 @@
----
-license: apache-2.0
----

+---
+license: apache-2.0
+pipeline_tag: image-to-3d
+---
+# WinT3R: Window-Based Streaming Reconstruction With Camera Token Pool
+This repository contains the official implementation of **WinT3R**, a feed-forward model that infers precise camera poses and high-quality point maps from image streams in an online manner.
+[📚 Paper](https://arxiv.org/abs/2509.05296) - [🌐 Project Page](https://lizizun.github.io/WinT3R.github.io/) - [💻 Code](https://github.com/LiZizun/WinT3R)
+<p align="center">
+  <img src="https://github.com/LiZizun/WinT3R/raw/main/assets/teaser.jpg" width="100%">
+</p>
+## Overview
+WinT3R addresses the trade-off between reconstruction quality and real-time performance by introducing:
+1.  **An online window mechanism:** Ensures sufficient interaction of image tokens within the same window and across adjacent windows.
+2.  **A camera token pool:** Functions as a lightweight "global memory" to improve camera pose prediction from a global perspective.
+The paper reports that these designs enable WinT3R to achieve state-of-the-art online 3D reconstruction quality, camera pose estimation, and reconstruction speed.
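To make the two ideas above concrete, here is a minimal, illustrative sketch in plain Python. It is not the WinT3R implementation: the window size, the stride, the choice of overlapping windows, and the "one camera token per frame" bookkeeping are all assumptions made for the example, and frames are represented by simple hashable IDs rather than image tensors.

```python
from typing import Dict, Hashable, Iterable, Iterator, List


def sliding_windows(
    stream: Iterable[Hashable], size: int = 4, stride: int = 2
) -> Iterator[List[Hashable]]:
    """Group an incoming frame stream into (possibly overlapping) windows.

    `size` and `stride` are hypothetical values chosen for illustration.
    """
    if not 0 < stride <= size:
        raise ValueError("need 0 < stride <= size")
    buffer: List[Hashable] = []
    emitted = False
    for frame in stream:
        buffer.append(frame)
        if len(buffer) == size:
            yield list(buffer)
            emitted = True
            # Keep the tail so adjacent windows share frames.
            buffer = buffer[stride:]
    # Frames carried over from the last window were already emitted once.
    carried = size - stride if emitted else 0
    if len(buffer) > carried:
        yield list(buffer)  # final, shorter window with the unseen frames


def pool_sizes(
    stream: Iterable[Hashable], size: int = 4, stride: int = 2
) -> List[int]:
    """Track how a global pool grows as windows are processed.

    The dict stands in for a camera token pool: one placeholder entry per
    frame seen so far (an assumption), visible to every later window.
    """
    pool: Dict[Hashable, str] = {}
    sizes: List[int] = []
    for window in sliding_windows(stream, size, stride):
        for frame in window:
            pool.setdefault(frame, f"camera-token[{frame}]")
        sizes.append(len(pool))
    return sizes
```

With six frames, a window of four, and a stride of two, the stream is split into two windows that share two frames, while the pool accumulates one entry per distinct frame.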
+## Installation
+```bash
+conda create -n WinT3R python=3.10
+conda activate WinT3R
+pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118  # choose the index URL that matches the CUDA version on your system
+pip install -r requirements.txt
+```
+## Checkpoints
+Download the checkpoint from [Hugging Face](https://huggingface.co/lizizun/WinT3R/resolve/main/pytorch_model.bin) and save it as `checkpoints/pytorch_model.bin`.
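If you prefer to script the download, the following sketch uses only the Python standard library. The helper names `checkpoint_path` and `fetch_checkpoint` are hypothetical and not part of the WinT3R code base; the URL and target path are the ones given above, and the repository root is assumed to be the current directory.

```python
from pathlib import Path
from urllib.request import urlretrieve

CHECKPOINT_URL = (
    "https://huggingface.co/lizizun/WinT3R/resolve/main/pytorch_model.bin"
)


def checkpoint_path(repo_root: str = ".") -> Path:
    """Return the location where the checkpoint is expected to live."""
    return Path(repo_root) / "checkpoints" / "pytorch_model.bin"


def fetch_checkpoint(repo_root: str = ".", url: str = CHECKPOINT_URL) -> Path:
    """Download the checkpoint unless a file already exists at the target."""
    dest = checkpoint_path(repo_root)
    if dest.exists():
        return dest  # skip the download if the file is already there
    dest.parent.mkdir(parents=True, exist_ok=True)
    urlretrieve(url, str(dest))
    return dest
```

Calling `fetch_checkpoint()` from the repository root should leave the file at `checkpoints/pytorch_model.bin`.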
+## Sample Usage (Run Inference from Command Line)
+```bash
+# Run with default example images
+python recon.py
+# Run on your own data
+python recon.py --data_path <path/to/your/images_dir>
+```
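To reconstruct several image folders in a row, one option is a small wrapper around the command above. This is a sketch under the assumption that it is run from the repository root; `recon_command` and `reconstruct_all` are hypothetical helper names, and only the `--data_path` flag shown above is used.

```python
import subprocess
import sys
from typing import Iterable, List, Optional


def recon_command(data_path: Optional[str] = None) -> List[str]:
    """Build the recon.py invocation, optionally pointing at an image folder."""
    cmd = [sys.executable, "recon.py"]
    if data_path is not None:
        cmd += ["--data_path", data_path]
    return cmd


def reconstruct_all(image_dirs: Iterable[str]) -> None:
    """Run recon.py once per folder, stopping at the first failure."""
    for image_dir in image_dirs:
        subprocess.run(recon_command(image_dir), check=True)
```

For example, `reconstruct_all(["scenes/a", "scenes/b"])` would run the script once per folder, where the folder names are placeholders.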
+## Citation
+If you find our work useful, please consider citing our paper:
+```bibtex
+@misc{li2025wint3rwindowbasedstreamingreconstruction,
+      title={WinT3R: Window-Based Streaming Reconstruction with Camera Token Pool},
+      author={Zizun Li and Jianjun Zhou and Yifan Wang and Haoyu Guo and Wenzheng Chang and Yang Zhou and Haoyi Zhu and Junyi Chen and Chunhua Shen and Tong He},
+      year={2025},
+      eprint={2509.05296},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2509.05296},
+}
+```
+## Acknowledgement
+WinT3R builds on the following outstanding open-source projects. We are grateful to these projects and their communities, whose work has advanced the field and made ours possible.
+- [DUSt3R](https://dust3r.europe.naverlabs.com/)
+- [MASt3R](https://github.com/naver/mast3r)
+- [CUT3R](https://github.com/CUT3R/CUT3R)
+- [VGGT](https://github.com/facebookresearch/vggt)
+- [Pi3](https://yyfz.github.io/pi3/)