---
license: other
license_name: sculpt4d
license_link: https://github.com/TencentARC/Sculpt4D/blob/main/LICENSE.txt
library_name: pytorch
pipeline_tag: image-to-3d
tags:
- 4d-generation
- 3d-generation
- diffusion-transformer
- mesh-generation
- sculpt4d
---

# Sculpt4D

Pretrained model for **Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers**.

Given an image sequence of an animated object, Sculpt4D generates a temporally coherent sequence of 3D meshes. It integrates efficient temporal modeling into a pretrained 3D Diffusion Transformer ([Hunyuan3D-2.1](https://github.com/Tencent/Hunyuan3D-2.1)) via a **Block Sparse Attention** mechanism.

- 🌐 Project page: https://visual-ai.github.io/sculpt4d/
- 📄 arXiv: https://arxiv.org/abs/2604.21592
- 💻 Code: https://github.com/TencentARC/Sculpt4D

## Checkpoint

This repository hosts the **bf16** checkpoint (~8 GB) in the `blockmask_bf16/` subfolder:

```
blockmask_bf16/
├── pytorch_model-00001-of-00002.bin
├── pytorch_model-00002-of-00002.bin
└── pytorch_model.bin.index.json
```
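Because the checkpoint is split into two shards plus an index file, it can be useful to confirm that every shard named in the index is present before running inference. The following is a minimal sketch, not part of the Sculpt4D code: the helper names `list_shards` and `missing_shards` are hypothetical, and it assumes `pytorch_model.bin.index.json` follows the common sharded-checkpoint layout with a `weight_map` dictionary that maps parameter names to shard filenames.

```python
import json
from pathlib import Path


def list_shards(ckpt_dir):
    """Return the sorted, de-duplicated shard filenames named in the index."""
    index_path = Path(ckpt_dir) / "pytorch_model.bin.index.json"
    with open(index_path, "r", encoding="utf-8") as f:
        index = json.load(f)
    # Assumption: the index has a "weight_map" dict of
    # parameter name -> shard filename.
    return sorted(set(index["weight_map"].values()))


def missing_shards(ckpt_dir):
    """Return shard filenames listed in the index but absent on disk."""
    ckpt_dir = Path(ckpt_dir)
    return [name for name in list_shards(ckpt_dir)
            if not (ckpt_dir / name).is_file()]
```

For example, `missing_shards("checkpoints/sculpt4d/blockmask_bf16")` should return an empty list once the download has completed, provided the index has the assumed layout.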

## Usage

Download the checkpoint:

```bash
huggingface-cli download TencentARC/Sculpt4D --include "blockmask_bf16/*" --local-dir checkpoints/sculpt4d
```
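The same download can also be scripted from Python. This is a sketch under the assumption that the `huggingface_hub` package is installed; `download_kwargs` and `download_checkpoint` are hypothetical helper names, while `snapshot_download` and its `repo_id`, `allow_patterns`, and `local_dir` parameters come from `huggingface_hub`.

```python
def download_kwargs(local_dir="checkpoints/sculpt4d"):
    """Build the arguments mirroring the CLI command above."""
    return {
        "repo_id": "TencentARC/Sculpt4D",
        # Fetch only the bf16 checkpoint subfolder.
        "allow_patterns": ["blockmask_bf16/*"],
        "local_dir": local_dir,
    }


def download_checkpoint(local_dir="checkpoints/sculpt4d"):
    """Download the checkpoint and return the local folder path."""
    # Imported here so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```

Calling `download_checkpoint()` places the files under `checkpoints/sculpt4d/blockmask_bf16`, matching the `--ckpt_path` used for inference.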

Run inference (see the code repository for full setup):

```bash
python inference_4d.py \
    --config configs/4d_config_8.yaml \
    --ckpt_path checkpoints/sculpt4d/blockmask_bf16 \
    --input_dir demos/door \
    --output_dir ./inference_output/door
```

## Citation

```bibtex
@inproceedings{sculpt4d2026,
  title={Sculpt4D: Generating 4D Shapes via Sparse-Attention Diffusion Transformers},
  author={Yin, Minghao and Hu, Wenbo and Xu, Jiale and Shan, Ying and Han, Kai},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2026}
}
```