| --- |
| license: other |
| license_name: sculpt4d |
| license_link: https://github.com/TencentARC/Sculpt4D/blob/main/LICENSE.txt |
| library_name: pytorch |
| pipeline_tag: image-to-3d |
| tags: |
| - 4d-generation |
| - 3d-generation |
| - diffusion-transformer |
| - mesh-generation |
| - sculpt4d |
| --- |
| |
| # Sculpt4D |
|
|
| Pretrained model for **Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers**. |
|
|
| Given an image sequence of an animated object, Sculpt4D generates a temporally coherent sequence of 3D meshes. It integrates efficient temporal modeling into a pretrained 3D Diffusion Transformer ([Hunyuan3D-2.1](https://github.com/Tencent/Hunyuan3D-2.1)) via a **Block Sparse Attention** mechanism. |
|
|
| - Project page: https://visual-ai.github.io/sculpt4d/
| - arXiv: https://arxiv.org/abs/2604.21592
| - Code: https://github.com/TencentARC/Sculpt4D
|
|
| ## Checkpoint |
|
|
| This repository hosts the **bf16** checkpoint (~8 GB), split into two shards plus an index file in the `blockmask_bf16/` subfolder:
|
|
| ``` |
| blockmask_bf16/ |
| ├── pytorch_model-00001-of-00002.bin
| ├── pytorch_model-00002-of-00002.bin
| └── pytorch_model.bin.index.json
| ``` |
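
Because the checkpoint is sharded, a partial download can leave the folder incomplete. The sketch below is a minimal check, not part of the Sculpt4D code base. It assumes `pytorch_model.bin.index.json` follows the common sharded-checkpoint convention of a `weight_map` entry that maps each parameter name to its shard file; `missing_shards` is a hypothetical helper name.

```python
import json
from pathlib import Path


def missing_shards(ckpt_dir):
    """Return shard files named in the index that are absent from ckpt_dir.

    Assumes the index JSON has a "weight_map" dict (parameter name -> shard
    file name), as in the usual sharded PyTorch checkpoint layout.
    """
    ckpt_dir = Path(ckpt_dir)
    index = json.loads((ckpt_dir / "pytorch_model.bin.index.json").read_text())
    # Many parameters share a shard, so deduplicate the file names.
    shard_names = sorted(set(index["weight_map"].values()))
    return [name for name in shard_names if not (ckpt_dir / name).is_file()]
```

Calling `missing_shards("checkpoints/sculpt4d/blockmask_bf16")` after downloading should return an empty list if both `.bin` shards are present.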
|
|
| ## Usage |
|
|
| Download the checkpoint: |
|
|
| ```bash |
| huggingface-cli download TencentARC/Sculpt4D --include "blockmask_bf16/*" --local-dir checkpoints/sculpt4d |
| ``` |
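
The same download can also be scripted from Python. This is a sketch that assumes the `huggingface_hub` package is installed; `download_checkpoint` is a hypothetical wrapper name, while `snapshot_download` and its `allow_patterns` and `local_dir` arguments come from `huggingface_hub`.

```python
def download_checkpoint(local_dir="checkpoints/sculpt4d"):
    """Fetch only the blockmask_bf16/ files from the Hub into local_dir."""
    # Imported here so the helper can be defined even where
    # huggingface_hub is not installed yet.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="TencentARC/Sculpt4D",
        allow_patterns=["blockmask_bf16/*"],  # skip everything else in the repo
        local_dir=local_dir,
    )
```

With the default argument, the files should land under `checkpoints/sculpt4d/blockmask_bf16`, the path the CLI command above targets.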
|
|
| Run inference (see the code repository for full setup): |
|
|
| ```bash |
| python inference_4d.py \
|     --config configs/4d_config_8.yaml \
|     --ckpt_path checkpoints/sculpt4d/blockmask_bf16 \
|     --input_dir demos/door \
|     --output_dir ./inference_output/door
| ``` |
|
|
| ## Citation |
|
|
| ```bibtex |
| @inproceedings{sculpt4d2026, |
| title={Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers}, |
| author={Yin, Minghao and Hu, Wenbo and Xu, Jiale and Shan, Ying and Han, Kai}, |
| booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, |
| year={2026} |
| } |
| ``` |
|
|