
@@ -5,7 +5,39 @@ tags:
 pipeline_tag: image-to-3d
 ---
-This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
-- Code: https://github.com/wzzheng/StreamVGGT
-- Paper: https://arxiv.org/abs/2507.11539
-- Docs: https://wzzheng.net/StreamVGGT/

+<div align="center">
+<h1>Streaming 4D Visual Geometry Transformer</h1>
+</div>
+### [Paper](https://arxiv.org/abs/2507.11539)  | [Project Page](https://wzzheng.net/StreamVGGT)
+>Streaming 4D Visual Geometry Transformer
+>Dong Zhuo<sup>\*</sup>, [Wenzhao Zheng](https://wzzheng.net/)<sup>\*†</sup>, Jiahe Guo, Yuqi Wu, [Jie Zhou](https://scholar.google.com/citations?user=6a79aPwAAAAJ&hl=en&authuser=1), [Jiwen Lu](http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/)
+><sup>\*</sup> Equal contribution. <sup>†</sup> Project leader.
+**StreamVGGT** is a causal transformer architecture for **real-time streaming 4D visual geometry perception**. It is compatible with LLM-targeted attention mechanisms (e.g., [FlashAttention](https://github.com/Dao-AILab/flash-attention)) and delivers both fast inference and high-quality 4D reconstruction.
+## Overview
+Given a sequence of images, offline models must reprocess the entire sequence and reconstruct the entire scene each time a new image arrives. StreamVGGT instead employs temporal causal attention and leverages cached memory tokens to support efficient, incremental, on-the-fly reconstruction, enabling interactive, real-time online applications.
+## Quick start
+Please refer to our [GitHub repo](https://github.com/wzzheng/StreamVGGT).
+## Citation
+If you find this project helpful, please consider citing the following paper:
+```bibtex
+@article{streamVGGT,
+      title={Streaming 4D Visual Geometry Transformer},
+      author={Dong Zhuo and Wenzhao Zheng and Jiahe Guo and Yuqi Wu and Jie Zhou and Jiwen Lu},
+      journal={arXiv preprint arXiv:2507.11539},
+      year={2025}
+}
+```
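
The Overview's core idea, temporal causal attention combined with cached memory tokens, can be illustrated with a toy example. The sketch below is not the StreamVGGT implementation: it is a single-head, single-layer attention in pure Python, and every name in it (`StreamingCausalAttention`, `step`, `offline_causal`) is hypothetical and chosen only for illustration. It assumes each frame contributes a small set of tokens and that tokens of frame `t` may attend to tokens of frames `0..t`. Under that assumption, appending each frame's keys and values to a cache gives the same output as recomputing attention over the whole prefix, without reprocessing past frames.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x d) list-of-rows matrix by a (d x m) one."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attend(queries, keys, values):
    """Scaled dot-product attention of every query over all given keys/values."""
    scale = 1.0 / math.sqrt(len(queries[0]))
    out = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys]
        peak = max(scores)  # subtract the max for numerical stability
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[d] for w, v in zip(weights, values)) / total
                    for d in range(len(values[0]))])
    return out


class StreamingCausalAttention:
    """Hypothetical toy layer: frame-level causal attention with a key/value cache."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)

        def init():
            return [[rng.gauss(0, 1) / math.sqrt(dim) for _ in range(dim)]
                    for _ in range(dim)]

        self.wq, self.wk, self.wv = init(), init(), init()
        self.k_cache, self.v_cache = [], []  # cached tokens from all frames seen so far

    def step(self, frame_tokens):
        """Process one new frame; past frames are read from the cache, not recomputed."""
        q = matmul(frame_tokens, self.wq)
        self.k_cache += matmul(frame_tokens, self.wk)
        self.v_cache += matmul(frame_tokens, self.wv)
        return attend(q, self.k_cache, self.v_cache)


def offline_causal(layer, frames):
    """Reference: for each frame, recompute keys/values over the entire prefix."""
    outputs = []
    for t in range(len(frames)):
        prefix = [tok for frame in frames[: t + 1] for tok in frame]
        q = matmul(frames[t], layer.wq)
        outputs.append(attend(q, matmul(prefix, layer.wk), matmul(prefix, layer.wv)))
    return outputs


# Four frames, three tokens per frame, eight channels per token (arbitrary sizes).
rng = random.Random(1)
frames = [[[rng.gauss(0, 1) for _ in range(8)] for _ in range(3)] for _ in range(4)]

layer = StreamingCausalAttention(dim=8)
streamed = [layer.step(frame) for frame in frames]
reference = offline_causal(layer, frames)

max_diff = max(abs(a - b)
               for s, r in zip(streamed, reference)
               for s_tok, r_tok in zip(s, r)
               for a, b in zip(s_tok, r_tok))
print(f"max |streamed - offline| = {max_diff:.2e}")
```

In this toy setting the per-frame cost of `step` grows with the cache size only through the attention itself, while `offline_causal` also re-projects every past token at each step. For the actual model, inference code, and checkpoints, see the GitHub repo linked under Quick start.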