	---
	license: mit
	pipeline_tag: image-to-3d
	---

	# Quantized Visual Geometry Grounded Transformer

	[![arXiv](https://img.shields.io/badge/QuantVGGT-2509.21302-b31b1b.svg?logo=arXiv)](https://arxiv.org/abs/2509.21302)
	[![GitHub](https://img.shields.io/badge/GitHub-Code-blue?style=flat-square&logo=github)](https://github.com/wlfeng0509/QuantVGGT)

	This repository contains the weights and calibration data for QuantVGGT, presented in the paper [Quantized Visual Geometry Grounded Transformer](https://arxiv.org/abs/2509.21302).

	QuantVGGT is the first quantization framework specifically designed for Visual Geometry Grounded Transformers (VGGTs). It addresses unique challenges in compressing billion-scale 3D reconstruction models, such as heavy-tailed activation distributions and multi-view calibration instability.
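
To make the heavy-tailed activation problem concrete, here is a minimal sketch in plain Python. It assumes a generic symmetric, per-tensor 4-bit quantizer, which is not QuantVGGT's actual scheme; the function names and the toy values are hypothetical and chosen only for illustration. It shows how one extreme value stretches the quantization scale so far that every ordinary value collapses to zero.

```python
def quantize_int4(values):
    """Symmetric uniform 4-bit quantization with one per-tensor scale.

    Generic illustration only; this is NOT the QuantVGGT quantizer.
    Returns the dequantized values and the scale that was used.
    """
    qmax = 7  # signed 4-bit integers span [-8, 7]
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(v / scale))) for v in values]
    return [c * scale for c in codes], scale


def mean_abs_error(original, restored):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


# Toy activations (hypothetical): a well-behaved bulk, then the same bulk
# plus a single extreme value standing in for a heavy tail.
bulk = [0.1 * i for i in range(-7, 8)]
heavy_tailed = bulk + [70.0]

bulk_restored, bulk_scale = quantize_int4(bulk)
tail_restored, tail_scale = quantize_int4(heavy_tailed)

# Error is measured on the bulk values only, so the two cases are comparable.
err_without_outlier = mean_abs_error(bulk, bulk_restored)
err_with_outlier = mean_abs_error(bulk, tail_restored[:len(bulk)])

print(f"scale without outlier: {bulk_scale:.3f}, bulk error: {err_without_outlier:.4f}")
print(f"scale with outlier:    {tail_scale:.3f}, bulk error: {err_with_outlier:.4f}")
```

With only 16 levels available, the scale is set by the largest magnitude, so a single outlier leaves almost no resolution for the remaining values. This is the general reason low-bit quantization methods treat outliers specially; how QuantVGGT does so is described in the paper.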

	## Installation

	To get started, clone the official repository and install the dependencies:

	```bash
	git clone https://github.com/wlfeng0509/QuantVGGT.git
	cd QuantVGGT
	pip install -r requirements.txt
	pip install -r requirements_demo.txt
	```

	## Quick Start

	You can use the provided scripts for inference and calibration. For example, to generate filtered Co3D calibration data:

	```bash
	python Quant_VGGT/vggt/evaluation/make_calibation.py \
	--model_path VGGT-1B/model_tracker_fixed_e20.pt \
	--co3d_dir co3d_datasets/ \
	--co3d_anno_dir co3d_v2_annotations/ \
	--seed 0 \
	--cache_path all_calib_data.pt \
	--save_path calib_data.pt \
	--class_mode all \
	--kmeans_n 6 \
	--kmeans_m 7
	```

	To quantize, calibrate, and evaluate on Co3D:

	```bash
	python Quant_VGGT/vggt/evaluation/run_co3d.py \
	--model_path Quant_VGGT/VGGT-1B/model_tracker_fixed_e20.pt \
	--co3d_dir co3d_datasets/ \
	--co3d_anno_dir co3d_v2_annotations/ \
	--dtype quarot_w4a4 \
	--seed 0 \
	--lac \
	--lwc \
	--cache_path calib_data.pt \
	--class_mode all \
	--exp_name a44_uqant \
	--resume_qs
	```
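
If you launch the evaluation from Python, for example to sweep several settings, a small wrapper can assemble the same command. The sketch below is an assumption-laden convenience, not part of the official repository: `build_run_co3d_command` is a hypothetical helper, and it uses only the flags and values that appear in the command above.

```python
import shlex


def build_run_co3d_command(model_path, co3d_dir, co3d_anno_dir, cache_path,
                           dtype="quarot_w4a4", seed=0, exp_name="a44_uqant"):
    """Assemble the run_co3d.py invocation as an argument list.

    Hypothetical helper: it only mirrors the flags shown in this model card
    and does not validate them against the script itself.
    """
    return [
        "python", "Quant_VGGT/vggt/evaluation/run_co3d.py",
        "--model_path", model_path,
        "--co3d_dir", co3d_dir,
        "--co3d_anno_dir", co3d_anno_dir,
        "--dtype", dtype,
        "--seed", str(seed),
        "--lac",
        "--lwc",
        "--cache_path", cache_path,
        "--class_mode", "all",
        "--exp_name", exp_name,
        "--resume_qs",
    ]


command = build_run_co3d_command(
    model_path="Quant_VGGT/VGGT-1B/model_tracker_fixed_e20.pt",
    co3d_dir="co3d_datasets/",
    co3d_anno_dir="co3d_v2_annotations/",
    cache_path="calib_data.pt",
)

# Print the shell-quoted command so it can be inspected before it is run.
print(shlex.join(command))
```

To execute it, pass the list to `subprocess.run(command, check=True)` from the directory that the relative paths are meant to resolve against.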

	## Citation

	If you find QuantVGGT useful for your work, please cite the following paper:

	```bibtex
	@article{feng2025quantized,
	title={Quantized Visual Geometry Grounded Transformer},
	author={Feng, Weilun and Qin, Haotong and Wu, Mingqiang and Yang, Chuanguang and Li, Yuqi and Li, Xiangqi and An, Zhulin and Huang, Libo and Zhang, Yulun and Magno, Michele and others},
	journal={arXiv preprint arXiv:2509.21302},
	year={2025}
	}
	```