Add model card for AssetFormer #1
opened by nielsr (HF Staff)
README.md ADDED
@@ -0,0 +1,57 @@
---
pipeline_tag: text-to-3d
---

# AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer

AssetFormer is an autoregressive Transformer-based model that generates modular 3D assets from textual descriptions. By adapting module sequencing and decoding techniques inspired by language models, the framework improves the quality of generated 3D assets, which are composed of primitives that adhere to constrained design parameters.

- **Paper:** [AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer](https://huggingface.co/papers/2602.12100)
- **Repository:** [https://github.com/Advocate99/AssetFormer](https://github.com/Advocate99/AssetFormer)

## Installation

To get started, clone the official repository and install the dependencies:

```bash
git clone https://github.com/Advocate99/AssetFormer.git
cd AssetFormer
conda create -n assetformer python=3.12
conda activate assetformer
pip install -r requirements.txt
```

## Preparation

1. Download the `flan-t5-xl` model and place it in the `./pretrained_models/t5-ckpt/` folder:
   ```bash
   huggingface-cli download google/flan-t5-xl --local-dir ./pretrained_models/t5-ckpt/flan-t5-xl
   ```

2. Download `inference_model.pt` from this Hugging Face repository and place it in the `./pretrained_models/` directory.
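
Before running inference, it can help to confirm that both downloads ended up where the steps above put them. The following is a minimal sketch, not part of the official repository: the helper name `missing_artifacts` and the constant `EXPECTED_PATHS` are hypothetical, and the two paths are simply the ones named in the Preparation steps.

```python
from pathlib import Path

# Paths taken from the Preparation steps, relative to the repository root.
EXPECTED_PATHS = (
    "pretrained_models/t5-ckpt/flan-t5-xl",  # directory written by huggingface-cli
    "pretrained_models/inference_model.pt",  # checkpoint from this Hugging Face repository
)


def missing_artifacts(repo_root="."):
    """Return the expected paths that do not exist under repo_root (hypothetical helper)."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


if __name__ == "__main__":
    missing = missing_artifacts()
    if missing:
        for path in missing:
            print(f"missing: {path}")
    else:
        print("all pretrained files are in place")
```

Run it from the root of the cloned `AssetFormer` directory. It only checks that the paths exist; it does not validate the contents of the checkpoint.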

## Inference

Run the following command to sample 3D assets as JSON files:

```bash
python sample.py --gpt-ckpt ./pretrained_models/inference_model.pt
```

After sampling, you can use the Blender scripts provided in the official repository (`./blender_script/`) to render the 3D assets with the modular FBX files.
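
Since the samples are written as JSON files, a quick inspection before rendering can catch empty or truncated outputs. The sketch below is an illustration only: the helper name `summarize_samples` and the `samples` directory are assumptions, and nothing is assumed about the JSON schema beyond each file being valid JSON.

```python
import json
from pathlib import Path


def summarize_samples(sample_dir):
    """Map each JSON file name in sample_dir to (top-level type, element count)."""
    summary = {}
    for path in sorted(Path(sample_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # len() is only meaningful for containers; scalars get a count of None.
        count = len(data) if isinstance(data, (list, dict)) else None
        summary[path.name] = (type(data).__name__, count)
    return summary


if __name__ == "__main__":
    # "samples" is a placeholder; point it at wherever sample.py wrote its output.
    for name, (kind, count) in summarize_samples("samples").items():
        print(f"{name}: {kind} with {count} entries")
```

A file that fails to parse raises `json.JSONDecodeError`, which is usually the desired signal that a sample is incomplete.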

## Citation

If you find this work useful, please cite:

```bibtex
@article{zhu2026assetformer,
  title={AssetFormer: Modular 3D Assets Generation with Autoregressive Transformer},
  author={Zhu, Lingting and Qian, Shengju and Fan, Haidi and Dong, Jiayu and Jin, Zhenchao and Zhou, Siwei and Dong, Gen and Wang, Xin and Yu, Lequan},
  journal={arXiv preprint arXiv:2602.12100},
  year={2026}
}
```

## Acknowledgement

The codebase is developed based on [LlamaGen](https://github.com/FoundationVision/LlamaGen).