---
license: mit
pipeline_tag: image-to-3d
---
# Quantized Visual Geometry Grounded Transformer
[![arXiv](https://img.shields.io/badge/QuantVGGT-2509.21302-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2509.21302)
[![GitHub](https://img.shields.io/badge/GitHub-Code-blue?style=flat-square&logo=github)](https://github.com/wlfeng0509/QuantVGGT)

This repository contains the weights and calibration data for **QuantVGGT**, presented in the paper [Quantized Visual Geometry Grounded Transformer](https://arxiv.org/abs/2509.21302).

QuantVGGT is described in the paper as the first quantization framework designed specifically for Visual Geometry Grounded Transformers (VGGTs). It addresses challenges specific to compressing billion-scale 3D reconstruction models, such as heavy-tailed activation distributions and multi-view calibration instability.
## Installation
To get started, clone the official repository and install the dependencies:
```bash
git clone https://github.com/wlfeng0509/QuantVGGT.git
cd QuantVGGT
pip install -r requirements.txt
pip install -r requirements_demo.txt
```
## Quick Start
You can use the provided scripts for inference and calibration. For example, to generate filtered Co3D calibration data:
```bash
python Quant_VGGT/vggt/evaluation/make_calibation.py \
--model_path VGGT-1B/model_tracker_fixed_e20.pt \
--co3d_dir co3d_datasets/ \
--co3d_anno_dir co3d_v2_annotations/ \
--seed 0 \
--cache_path all_calib_data.pt \
--save_path calib_data.pt \
--class_mode all \
--kmeans_n 6 \
--kmeans_m 7
```
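
Before launching the calibration script, it can help to confirm that the checkpoint and dataset paths exist. The following is a minimal sketch and not part of the QuantVGGT repository: `check_inputs` is a hypothetical helper, and it only tests that the three input paths passed to the command above are present on disk.

```bash
#!/usr/bin/env bash
# Hypothetical preflight helper (not part of the QuantVGGT repository).
# Usage: check_inputs <model_checkpoint> <co3d_dir> <co3d_anno_dir>
# Returns 0 if every input exists, 1 otherwise; missing paths go to stderr.
check_inputs() {
    status=0
    # The checkpoint must be a regular file.
    [ -f "$1" ] || { echo "missing checkpoint file: $1" >&2; status=1; }
    # The dataset and annotation locations must be directories.
    [ -d "$2" ] || { echo "missing Co3D dataset directory: $2" >&2; status=1; }
    [ -d "$3" ] || { echo "missing Co3D annotation directory: $3" >&2; status=1; }
    return "$status"
}
```

One way to use it is to chain it in front of the script, for example `check_inputs VGGT-1B/model_tracker_fixed_e20.pt co3d_datasets/ co3d_v2_annotations/ && python Quant_VGGT/vggt/evaluation/make_calibation.py ...`, so the Python process starts only when all three inputs are found.
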
To quantize, calibrate, and evaluate on Co3D:
```bash
python Quant_VGGT/vggt/evaluation/run_co3d.py \
--model_path Quant_VGGT/VGGT-1B/model_tracker_fixed_e20.pt \
--co3d_dir co3d_datasets/ \
--co3d_anno_dir co3d_v2_annotations/ \
--dtype quarot_w4a4 \
--seed 0 \
--lac \
--lwc \
--cache_path calib_data.pt \
--class_mode all \
--exp_name a44_uqant \
--resume_qs
```
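
When the evaluation is repeated with a different seed or experiment name, keeping the shared paths in variables avoids retyping the long command. The sketch below is an illustration under stated assumptions, not a script from the repository: `run_eval`, `DRY_RUN`, and the variable names are hypothetical, and every flag is copied unchanged from the command above. The calibration file name matches the `--save_path` used in the calibration step; that both flags refer to the same file is inferred from the matching names rather than confirmed here.

```bash
#!/usr/bin/env bash
# Sketch only: run_eval, DRY_RUN, and the variables below are hypothetical
# names, not part of the QuantVGGT repository.
MODEL_PATH="Quant_VGGT/VGGT-1B/model_tracker_fixed_e20.pt"
CO3D_DIR="co3d_datasets/"
CO3D_ANNO_DIR="co3d_v2_annotations/"
CALIB_DATA="calib_data.pt"   # assumed to be the file produced by the calibration step
DRY_RUN="${DRY_RUN:-1}"      # 1 = print the command, anything else = execute it

# Usage: run_eval <seed> <exp_name>
run_eval() {
    eval_seed="$1"
    eval_name="$2"
    # Rebuild the positional parameters as the full command line.
    set -- python Quant_VGGT/vggt/evaluation/run_co3d.py \
        --model_path "$MODEL_PATH" \
        --co3d_dir "$CO3D_DIR" \
        --co3d_anno_dir "$CO3D_ANNO_DIR" \
        --dtype quarot_w4a4 \
        --seed "$eval_seed" \
        --lac \
        --lwc \
        --cache_path "$CALIB_DATA" \
        --class_mode all \
        --exp_name "$eval_name" \
        --resume_qs
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"    # show what would be run
    else
        "$@"         # run the command with arguments kept intact
    fi
}

# Print the command for the settings used above.
run_eval 0 a44_uqant
```

With the default `DRY_RUN=1` the function only prints the assembled command, which makes it easy to check the arguments before setting `DRY_RUN=0` to execute it.
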
## Citation
If you find QuantVGGT useful for your work, please cite the following paper:
```bibtex
@article{feng2025quantized,
title={Quantized Visual Geometry Grounded Transformer},
author={Feng, Weilun and Qin, Haotong and Wu, Mingqiang and Yang, Chuanguang and Li, Yuqi and Li, Xiangqi and An, Zhulin and Huang, Libo and Zhang, Yulun and Magno, Michele and others},
journal={arXiv preprint arXiv:2509.21302},
year={2025}
}
```