---
pipeline_tag: text-to-3d
---
# AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer
AssetFormer is an autoregressive Transformer-based model designed to generate modular 3D assets from textual descriptions. By adapting module sequencing and decoding techniques inspired by language models, the framework improves the quality of generated 3D assets, which are composed of primitives that adhere to constrained design parameters.
- **Paper:** [AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer](https://huggingface.co/papers/2602.12100)
- **Repository:** [https://github.com/Advocate99/AssetFormer](https://github.com/Advocate99/AssetFormer)
## Installation
To get started, clone the official repository and install the dependencies:
```bash
git clone https://github.com/Advocate99/AssetFormer.git
cd AssetFormer
conda create -n assetformer python=3.12
conda activate assetformer
pip install -r requirements.txt
```
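Optionally, verify that the environment resolved correctly. A minimal check, assuming PyTorch is among the pinned dependencies in `requirements.txt` (typical for this kind of model, though not stated in this card):

```python
# Sanity check: confirm PyTorch imports and report whether a GPU is visible.
# Assumes PyTorch is installed via requirements.txt (an assumption, not
# something this card documents).
import torch

print(f"PyTorch {torch.__version__}, CUDA available: {torch.cuda.is_available()}")
```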
## Preparation
1. Download the `flan-t5-xl` model and place it in the `./pretrained_models/t5-ckpt/` folder:
```bash
huggingface-cli download google/flan-t5-xl --local-dir ./pretrained_models/t5-ckpt/flan-t5-xl
```
2. Download the `inference_model.pt` from this Hugging Face repository and place it in the `./pretrained_models/` directory.
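To confirm both preparation steps succeeded, here is a minimal sketch that loads the T5 checkpoint and checks that the inference weights are in place. It assumes AssetFormer uses the T5 encoder for text conditioning (the usual role of `flan-t5-xl` in text-conditioned generators); the classes AssetFormer actually instantiates may differ:

```python
# Hedged sketch: verify the flan-t5-xl download loads with Hugging Face
# transformers and that the inference checkpoint exists. Loading the
# ~3B-parameter encoder requires a sizable amount of RAM.
from pathlib import Path

from transformers import AutoTokenizer, T5EncoderModel

t5_path = "./pretrained_models/t5-ckpt/flan-t5-xl"
tokenizer = AutoTokenizer.from_pretrained(t5_path)
encoder = T5EncoderModel.from_pretrained(t5_path)

tokens = tokenizer("a modular wooden bookshelf", return_tensors="pt")
embeddings = encoder(**tokens).last_hidden_state  # shape: (1, seq_len, d_model)
print(embeddings.shape)

# Step 2: the inference checkpoint should now be at this path.
assert Path("./pretrained_models/inference_model.pt").exists()
```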
## Inference
Run the following command to sample 3D assets as JSON files:
```bash
python sample.py --gpt-ckpt ./pretrained_models/inference_model.pt
```
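This card does not document the output schema, so the sketch below makes no assumptions about it: it loads whatever JSON files the sampler wrote and prints each file's top-level structure. The `samples/` directory is a guess; check `sample.py` for the real output location:

```python
# Hedged sketch: inspect the sampled assets. The output directory below is
# hypothetical; no field names are assumed, only standard JSON structure.
import json
from pathlib import Path

for path in sorted(Path("samples").glob("*.json")):  # hypothetical output dir
    with open(path) as f:
        asset = json.load(f)
    summary = list(asset) if isinstance(asset, dict) else f"list of {len(asset)} items"
    print(f"{path.name}: {summary}")
```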
After sampling, you can use the Blender scripts provided in the official repository (`./blender_script/`) to render the 3D assets using the modular FBX files.
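Blender can be driven headlessly from the command line; below is a sketch of how such a render might be launched. The `--background --python` flags are standard Blender CLI, but the script name and the arguments after `--` are illustrative, so consult `./blender_script/` for the actual entry point and interface:

```python
# Hedged sketch: run a Blender render script without the GUI. The script
# path and the arguments after "--" are hypothetical, not the repository's
# documented interface.
import subprocess

subprocess.run(
    [
        "blender",
        "--background",                           # no GUI
        "--python", "blender_script/render.py",   # hypothetical script name
        "--",                                     # args after this go to the script
        "--json", "samples/asset_0.json",         # hypothetical arguments
    ],
    check=True,
)
```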
## Citation
If you find this work useful, please cite:
```bibtex
@article{zhu2026assetformer,
title={AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer},
author={Zhu, Lingting and Qian, Shengju and Fan, Haidi and Dong, Jiayu and Jin, Zhenchao and Zhou, Siwei and Dong, Gen and Wang, Xin and Yu, Lequan},
journal={arXiv preprint arXiv:2602.12100},
year={2026}
}
```
## Acknowledgement
The codebase builds upon [LlamaGen](https://github.com/FoundationVision/LlamaGen).