+---
+license: mit
+library_name: transformers
+pipeline_tag: image-to-3d
+tags:
+- lego
+- 3d-generation
+- autoregressive
+- transformer
+- llama
+- dinov2
+- clip
+- siggraph-asia-2025
+---
+# LegoACE: Autoregressive Construction Engine for Expressive LEGO® Assemblies
+Official model weights for **LegoACE**, presented at **SIGGRAPH Asia 2025**.
+LegoACE is an autoregressive transformer that generates LEGO® assemblies as
+sequences of placed bricks. This repository hosts two pretrained variants:
+| Subfolder | Conditioning | Encoder | Training steps |
+|-----------|--------------|---------|----------------|
+| `mv/`     | Multi-view images (4 views) | [DINOv2-base](https://huggingface.co/facebook/dinov2-base) | 520K |
+| `text/`   | Text descriptions           | [CLIP ViT-B/32](https://huggingface.co/openai/clip-vit-base-patch32) | 210K |
+- 📄 Paper: [LegoACE @ SIGGRAPH Asia 2025](https://doi.org/10.1145/3757377.3763881)
+- 💻 Code: [VAST-AI-Research/LegoACE](https://github.com/VAST-AI-Research/LegoACE)
+- 📊 Architecture: 32-layer Llama-style transformer, hidden size 768, vocabulary of ~16K tokens
+---
+## Quick start
+> Full inference pipeline (LDR tokenizer, multi-view rendering, LDR → GLB
+> conversion) lives in the [GitHub repository](https://github.com/VAST-AI-Research/LegoACE).
+> The snippets below show only how to load the weights.
+```bash
+git clone https://github.com/VAST-AI-Research/LegoACE.git
+cd LegoACE
+pip install -e .
+```
+### Multi-view image conditioned (recommended)
+```python
+from model.llama_image_condition import ImageConditionModel
+model = ImageConditionModel.from_pretrained("VAST-AI/LegoACE", subfolder="mv").to("cuda")
+```
+End-to-end inference uses the `dataset/MVNpzDataset.py` loader. The GitHub
+README documents the full workflow, including Blender-based GLB export:
+```bash
+python inference/inference_multi_view.py \
+    --ckpt_dir VAST-AI/LegoACE \
+    --dataset_name <your_dataset> \
+    --dataset_class dataset.MVNpzDataset.MVNpzDataset \
+    --save_dir ./outputs/inference \
+    --save_name mv-demo \
+    --infer_number 100 --batch_size 4 --repeat 4 --dataset_split val
+```
+### Text conditioned
+```python
+from model.llama_text_condition import TextConditionModel
+model = TextConditionModel.from_pretrained("VAST-AI/LegoACE", subfolder="text").to("cuda")
+```
+```bash
+python inference/inference_text_condition.py \
+    --ckpt_dir VAST-AI/LegoACE \
+    --dataset_name <your_dataset> \
+    --save_dir ./outputs/inference --save_name text-demo \
+    --prompts "A red sports car" "A modern brick bed" "A bridge over a river"
+```
+---
+## Outputs
+Each generation step emits a quintuple `(x, y, z, rotation_id, brick_type_id)`.
+The full pipeline converts those token sequences into:
+1. **LDR** — plain-text LDraw model file listing each placed brick
+2. **GLB** — 3D mesh via Blender + [ImportLDraw](https://github.com/TobyLobster/ImportLDraw)
+3. **Normal maps** — pyrender renderings of the assembled model
+LegoACE supports an LDR vocabulary covering 28 common brick types and 20
+discrete rotation classes; see [`utils/brick_ids.py`](https://github.com/VAST-AI-Research/LegoACE/blob/main/utils/brick_ids.py).
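To make the token-to-LDR step concrete, the sketch below turns quintuples into LDraw type-1 lines (`1 <colour> x y z a b c d e f g h i <file>`). It is an illustration, not the repository's converter: `BrickToken`, `ROTATIONS`, `BRICK_FILES`, and both functions are hypothetical names, the two-entry tables stand in for the real 20-rotation and 28-brick vocabularies, and coordinates are assumed to already be in LDraw units.

```python
from typing import Iterable, NamedTuple


class BrickToken(NamedTuple):
    """One generated placement (hypothetical container for the quintuple)."""
    x: int
    y: int
    z: int
    rotation_id: int
    brick_type_id: int


# Assumed lookup tables; the real IDs and their meanings are defined by the
# LegoACE tokenizer, not by this sketch.
ROTATIONS = {
    0: (1, 0, 0, 0, 1, 0, 0, 0, 1),    # identity, row-major 3x3
    1: (0, 0, 1, 0, 1, 0, -1, 0, 0),   # quarter turn about the Y axis
}
BRICK_FILES = {
    0: "3001.dat",  # LDraw part: Brick 2 x 4
    1: "3003.dat",  # LDraw part: Brick 2 x 2
}


def to_ldraw_line(tok: BrickToken, colour: int = 16) -> str:
    """Format one placement as an LDraw type-1 (sub-file reference) line."""
    matrix = ROTATIONS[tok.rotation_id]
    part = BRICK_FILES[tok.brick_type_id]
    fields = [1, colour, tok.x, tok.y, tok.z, *matrix, part]
    return " ".join(str(f) for f in fields)


def to_ldraw(tokens: Iterable[BrickToken], colour: int = 16) -> str:
    """Join all placements into the body of an .ldr file."""
    return "\n".join(to_ldraw_line(t, colour) for t in tokens) + "\n"


if __name__ == "__main__":
    demo = [BrickToken(0, 0, 0, 0, 0), BrickToken(0, -24, 0, 1, 1)]
    print(to_ldraw(demo), end="")
```

Colour code 16 is LDraw's "current colour" placeholder. The actual ID-to-part and ID-to-rotation mappings should be taken from the tokenizer files and `utils/brick_ids.py` rather than from the placeholder tables above.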
+---
+## Intended uses & limitations
+**Intended uses**
+- Research on autoregressive 3D / LEGO® generative models.
+- Generating LEGO assemblies for academic and creative exploration.
+**Limitations**
+- Outputs are restricted to the 28-brick vocabulary used in training.
+- Output quality depends on prompt phrasing (text variant) or input image quality (multi-view variant).
+- The model was trained primarily on small- to medium-scale assemblies and
+  may produce structurally unstable or non-buildable arrangements.
+- Generation requires the LDR tokenizer files (`*_dat_dict.json`,
+  `*_rot_dict.json`) that ship with the dataset, not with these weights.
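Because the tokenizer files are distributed separately from the weights, a quick preflight check before running inference can save a failed run. The sketch below searches a dataset directory for the two filename patterns named above and loads them as JSON. The helper names and the `"dat"`/`"rot"` keys are hypothetical, and nothing is assumed about the files' internal structure beyond their being valid JSON.

```python
import json
from pathlib import Path

# Filename patterns taken from the limitation above; the dict keys are
# labels chosen for this sketch only.
TOKENIZER_PATTERNS = {"dat": "*_dat_dict.json", "rot": "*_rot_dict.json"}


def find_tokenizer_files(dataset_dir):
    """Return the first match for each tokenizer pattern under dataset_dir."""
    root = Path(dataset_dir)
    found = {}
    for key, pattern in TOKENIZER_PATTERNS.items():
        matches = sorted(root.rglob(pattern))  # recursive search
        if not matches:
            raise FileNotFoundError(f"no file matching {pattern} under {root}")
        found[key] = matches[0]
    return found


def load_tokenizer_dicts(dataset_dir):
    """Parse both tokenizer files as JSON and return them keyed by label."""
    paths = find_tokenizer_files(dataset_dir)
    return {key: json.loads(path.read_text(encoding="utf-8"))
            for key, path in paths.items()}
```

If either file is missing, `find_tokenizer_files` raises `FileNotFoundError` naming the pattern that had no match.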
+---
+## Citation
+```bibtex
+@inproceedings{xu2025legoace,
+  author    = {Hao Xu and Yuqing Zhang and Yiqian Wu and Xinyang Zheng and
+               Yutao Liu and Xiangjun Tang and Yunhan Yang and Ding Liang and
+               Yingtian Liu and Yuanchen Guo and Yanpei Cao and Xiaogang Jin},
+  title     = {LegoACE: Autoregressive Construction Engine for Expressive LEGO{\textregistered}
+               Assemblies},
+  booktitle = {Proceedings of the {SIGGRAPH} Asia 2025 Conference Papers},
+  publisher = {{ACM}},
+  year      = {2025},
+  pages     = {40:1--40:11},
+  doi       = {10.1145/3757377.3763881},
+  url       = {https://doi.org/10.1145/3757377.3763881}
+}
+```
+---
+## License
+Released under the [MIT License](https://github.com/VAST-AI-Research/LegoACE/blob/main/LICENSE).
+LEGO® is a trademark of the LEGO Group, which does not sponsor, authorize, or
+endorse this project.