Update README.md
README.md
CHANGED
|
@@ -1,18 +1,18 @@
|
|
| 1 |
---
|
| 2 |
license: mit
|
| 3 |
-
license_name: pixal3d-license
|
| 4 |
license_link: LICENSE
|
| 5 |
extra_gated_eu_disallowed: true
|
| 6 |
pipeline_tag: image-to-3d
|
| 7 |
---
|
| 8 |
|
| 9 |
<div align="center">
|
| 10 |
|
| 11 |
# Pixal3D: Pixel-Aligned 3D Generation from Images
|
| 12 |
|
| 13 |
<h3>SIGGRAPH 2026</h3>
|
| 14 |
|
| 15 |
-
[Dong-Yang Li](https://ldyang694.github.io/)¹ · [Wang Zhao](https://thuzhaowang.github.io/)²* · [Yuxin Chen](https://orcid.org/0000-0002-7854-1072)² · [Wenbo Hu](https://wbhu.github.io/)² · [Meng-Hao Guo](https://menghaoguo.github.io/)¹ · [Fang-Lue Zhang](https://fanglue.github.io/)³ · [Ying Shan](https://www.linkedin.com/in/YingShanProfile)² · [Shi-Min Hu](https://cg.cs.tsinghua.edu.cn/shimin.htm)¹
|
| 16 |
|
| 17 |
¹Tsinghua University (BNRist) ²Tencent ARC Lab ³Victoria University of Wellington
|
| 18 |
|
|
@@ -23,16 +23,19 @@ pipeline_tag: image-to-3d
|
|
| 23 |
<div align="center">
|
| 24 |
<a href="https://ldyang694.github.io/projects/pixal3d/"><img src=https://img.shields.io/badge/Project%20Page-333399.svg?logo=googlehome height=22px></a>
|
| 25 |
<a href="https://huggingface.co/spaces/TencentARC/Pixal3D"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Demo-276cb4.svg height=22px></a>
|
| 26 |
-
<a href="https://
|
| 27 |
<a href="https://arxiv.org/abs/2605.10922"><img src=https://img.shields.io/badge/Arxiv-b5212f.svg?logo=arxiv height=22px></a>
|
| 28 |
</div>
|
| 29 |
|
| 30 |
**Pixal3D** generates high-fidelity 3D assets from a single image. Unlike previous methods that loosely inject image features via attention, Pixal3D explicitly lifts pixel features into 3D through back-projection, establishing direct pixel-to-3D correspondences. This enables near-reconstruction-level fidelity with detailed geometry and PBR textures.
|
| 31 |
|
| 32 |
---
|
| 33 |
|
| 34 |
## ✨ News
|
| 35 |
|
| 36 |
- **May 2026**: Release the improved version based on [Trellis.2](https://github.com/microsoft/TRELLIS.2) backbone. 💪
|
| 37 |
- **May 2026**: Release inference code and online demo. 🤗
|
| 38 |
- **Apr 2026**: Our paper is accepted to SIGGRAPH 2026!
|
|
@@ -66,12 +69,22 @@ Please first follow the installation guide of [TRELLIS.2](https://github.com/mic
|
|
| 66 |
pip install -r requirements.txt
|
| 67 |
```
|
| 68 |
|
| 69 |
-
#### Step 3: Install
|
| 70 |
|
| 71 |
```bash
|
| 72 |
pip install https://github.com/LDYang694/Storages/releases/download/20260430/utils3d-0.0.2-py3-none-any.whl
|
| 73 |
```
|
| 74 |
|
| 75 |
### Usage
|
| 76 |
|
| 77 |
#### Inference
|
|
@@ -79,9 +92,30 @@ pip install https://github.com/LDYang694/Storages/releases/download/20260430/uti
|
|
| 79 |
Generate a GLB mesh from a single image:
|
| 80 |
|
| 81 |
```bash
|
| 82 |
-
python inference.py --image assets/
|
| 83 |
```
|
| 84 |
|
| 85 |
### Web Demo
|
| 86 |
|
| 87 |
We provide a Gradio web demo for Pixal3D, which allows you to generate 3D meshes from images interactively.
|
|
@@ -90,9 +124,164 @@ We provide a Gradio web demo for Pixal3D, which allows you to generate 3D meshes
|
|
| 90 |
python app.py
|
| 91 |
```
|
| 92 |
|
| 93 |
## Acknowledgements
|
| 94 |
|
| 95 |
-
This project is heavily built upon [Trellis.2](https://github.com/microsoft/TRELLIS.2) and [Direct3D-S2](https://github.com/DreamTechAI/Direct3D-S2). We
|
| 96 |
|
| 97 |
## Citation
|
| 98 |
|
|
@@ -100,9 +289,13 @@ If you find this work useful, please consider citing:
|
|
| 100 |
|
| 101 |
```bibtex
|
| 102 |
@article{li2026pixal3d,
|
| 103 |
-
title
|
| 104 |
-
author
|
| 105 |
-
journal
|
| 106 |
-
year
|
| 107 |
}
|
| 108 |
-
```
|
| 1 |
---
|
| 2 |
license: mit
|
| 3 |
license_link: LICENSE
|
| 4 |
extra_gated_eu_disallowed: true
|
| 5 |
pipeline_tag: image-to-3d
|
| 6 |
---
|
| 7 |
|
| 8 |
+
|
| 9 |
<div align="center">
|
| 10 |
|
| 11 |
# Pixal3D: Pixel-Aligned 3D Generation from Images
|
| 12 |
|
| 13 |
<h3>SIGGRAPH 2026</h3>
|
| 14 |
|
| 15 |
+
<small>[Dong-Yang Li](https://ldyang694.github.io/)¹ · [Wang Zhao](https://thuzhaowang.github.io/)²* · [Yuxin Chen](https://orcid.org/0000-0002-7854-1072)² · [Wenbo Hu](https://wbhu.github.io/)² · [Meng-Hao Guo](https://menghaoguo.github.io/)¹ · [Fang-Lue Zhang](https://fanglue.github.io/)³ · [Ying Shan](https://www.linkedin.com/in/YingShanProfile)² · [Shi-Min Hu](https://cg.cs.tsinghua.edu.cn/shimin.htm)¹</small>
|
| 16 |
|
| 17 |
¹Tsinghua University (BNRist) ²Tencent ARC Lab ³Victoria University of Wellington
|
| 18 |
|
|
|
|
| 23 |
<div align="center">
|
| 24 |
<a href="https://ldyang694.github.io/projects/pixal3d/"><img src=https://img.shields.io/badge/Project%20Page-333399.svg?logo=googlehome height=22px></a>
|
| 25 |
<a href="https://huggingface.co/spaces/TencentARC/Pixal3D"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Demo-276cb4.svg height=22px></a>
|
| 26 |
+
<a href="https://huggingface.co/TencentARC/Pixal3D"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
|
| 27 |
<a href="https://arxiv.org/abs/2605.10922"><img src=https://img.shields.io/badge/Arxiv-b5212f.svg?logo=arxiv height=22px></a>
|
| 28 |
+
<a href="LICENSE"><img src=https://img.shields.io/badge/License-MIT-yellow.svg height=22px></a>
|
| 29 |
</div>
|
| 30 |
|
| 31 |
+
|
| 32 |
**Pixal3D** generates high-fidelity 3D assets from a single image. Unlike previous methods that loosely inject image features via attention, Pixal3D explicitly lifts pixel features into 3D through back-projection, establishing direct pixel-to-3D correspondences. This enables near-reconstruction-level fidelity with detailed geometry and PBR textures.
|
| 33 |
|
| 34 |
---
|
| 35 |
|
| 36 |
## ✨ News
|
| 37 |
|
| 38 |
+
- **May 2026**: Release training code and data preparation toolkit. 🔧
|
| 39 |
- **May 2026**: Release the improved version based on [Trellis.2](https://github.com/microsoft/TRELLIS.2) backbone. 💪
|
| 40 |
- **May 2026**: Release inference code and online demo. 🤗
|
| 41 |
- **Apr 2026**: Our paper is accepted to SIGGRAPH 2026!
|
|
|
|
| 69 |
pip install -r requirements.txt
|
| 70 |
```
|
| 71 |
|
| 72 |
+
#### Step 3: Install natten
|
| 73 |
+
|
| 74 |
+
```bash
|
| 75 |
+
NATTEN_CUDA_ARCH="xx" NATTEN_N_WORKERS=xx pip install natten==0.21.0 --no-build-isolation
|
| 76 |
+
```
|
| 77 |
+
|
| 78 |
+
Please replace `xx` with the CUDA architecture and the number of build workers suitable for your machine.
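One way to find a value for `NATTEN_CUDA_ARCH` is to ask PyTorch for the GPU's compute capability. The sketch below is illustrative only: the helper names are hypothetical, the dotted `major.minor` format (for example `8.9`) is an assumption about what NATTEN expects, and using half of the CPU cores as build workers is just a conservative starting point.

```python
import os


def format_cuda_arch(major: int, minor: int) -> str:
    """Format a compute capability as a dotted string, e.g. (8, 9) -> "8.9"."""
    return f"{major}.{minor}"


def natten_install_command(major: int, minor: int, n_workers: int) -> str:
    """Fill in the two `xx` placeholders of the install command shown above."""
    return (
        f'NATTEN_CUDA_ARCH="{format_cuda_arch(major, minor)}" '
        f"NATTEN_N_WORKERS={n_workers} "
        "pip install natten==0.21.0 --no-build-isolation"
    )


def detect_capability():
    """Return (major, minor) for GPU 0, or None if PyTorch or CUDA is unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


capability = detect_capability()
if capability is not None:
    # Assumption: half of the available cores is a safe number of build workers.
    workers = max(1, (os.cpu_count() or 2) // 2)
    print(natten_install_command(*capability, workers))
```

If no GPU is visible on the build machine, nothing is printed and the values have to be chosen by hand.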
|
| 79 |
+
|
| 80 |
+
#### Step 4: Install utils3d
|
| 81 |
|
| 82 |
```bash
|
| 83 |
pip install https://github.com/LDYang694/Storages/releases/download/20260430/utils3d-0.0.2-py3-none-any.whl
|
| 84 |
```
|
| 85 |
|
| 86 |
+
> **Note**: `requirements-hfdemo.txt` is for the Hugging Face Spaces demo (H-series GPU architecture) and may not be compatible with other architectures.
|
| 87 |
+
|
| 88 |
### Usage
|
| 89 |
|
| 90 |
#### Inference
|
|
|
|
| 92 |
Generate a GLB mesh from a single image:
|
| 93 |
|
| 94 |
```bash
|
| 95 |
+
python inference.py --image assets/images/0_img.png --output ./output.glb
|
| 96 |
+
```
|
| 97 |
+
|
| 98 |
+
**Low-VRAM mode** (reduces peak VRAM by loading models on-demand):
|
| 99 |
+
|
| 100 |
+
```bash
|
| 101 |
+
python inference.py --image assets/images/0_img.png --output ./output.glb --low_vram
|
| 102 |
+
```
|
| 103 |
+
|
| 104 |
+
By default, the pipeline resolution is **1536** (standard mode) or **1024** (low-VRAM mode). You can override this with `--resolution`:
|
| 105 |
+
|
| 106 |
+
```bash
|
| 107 |
+
# Force 1536 even in low-VRAM mode
|
| 108 |
+
python inference.py --image assets/images/0_img.png --output ./output.glb --low_vram --resolution 1536
|
| 109 |
+
|
| 110 |
+
# Force 1024 in standard mode
|
| 111 |
+
python inference.py --image assets/images/0_img.png --output ./output.glb --resolution 1024
|
| 112 |
```
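The default-and-override behaviour described above can be summarised as a small function. This is a sketch of the documented rule, not the code in `inference.py`; the function name and signature are hypothetical.

```python
from typing import Optional


def resolve_resolution(low_vram: bool, override: Optional[int] = None) -> int:
    """Pick the pipeline resolution following the rule stated in this README.

    An explicit `--resolution` value always wins; otherwise the default is
    1024 in low-VRAM mode and 1536 in standard mode.
    """
    if override is not None:
        return override
    return 1024 if low_vram else 1536


# The four command lines shown above map to these calls:
print(resolve_resolution(low_vram=False))                # standard mode
print(resolve_resolution(low_vram=True))                 # low-VRAM mode
print(resolve_resolution(low_vram=True, override=1536))  # forced 1536
print(resolve_resolution(low_vram=False, override=1024)) # forced 1024
```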
|
| 113 |
|
| 114 |
+
> **Tip**: If you don't have `flash_attn` installed, you can use PyTorch's built-in SDPA backend instead:
|
| 115 |
+
> ```bash
|
| 116 |
+
> ATTN_BACKEND=sdpa python inference.py --image assets/images/0_img.png --output ./output.glb --low_vram
|
| 117 |
+
> ```
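When launching inference from a script, the same fallback can be applied automatically. The sketch below only sets `ATTN_BACKEND=sdpa` when the `flash_attn` package cannot be found and the user has not chosen a backend already; the helper names are hypothetical, and the launch line at the end is left commented out.

```python
import importlib.util
import os


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def attention_env(base_env, flash_attn_available: bool) -> dict:
    """Copy the environment, adding ATTN_BACKEND=sdpa if flash_attn is missing."""
    env = dict(base_env)
    if not flash_attn_available:
        # Respect an ATTN_BACKEND value the user has already exported.
        env.setdefault("ATTN_BACKEND", "sdpa")
    return env


env = attention_env(os.environ, has_module("flash_attn"))
print("ATTN_BACKEND =", env.get("ATTN_BACKEND", "<unset, project default>"))

# To launch with this environment (paths taken from the examples above):
# import subprocess
# subprocess.run(
#     ["python", "inference.py", "--image", "assets/images/0_img.png",
#      "--output", "./output.glb", "--low_vram"],
#     env=env, check=True,
# )
```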
|
| 118 |
+
|
| 119 |
### Web Demo
|
| 120 |
|
| 121 |
We provide a Gradio web demo for Pixal3D, which allows you to generate 3D meshes from images interactively.
|
|
|
|
| 124 |
python app.py
|
| 125 |
```
|
| 126 |
|
| 127 |
+
Low-VRAM mode is also available for the web demo. The frontend default resolution will automatically switch to 1024 in low-VRAM mode (1536 otherwise), but can be changed manually in the UI.
|
| 128 |
+
|
| 129 |
+
```bash
|
| 130 |
+
python app.py --low_vram
|
| 131 |
+
# or via environment variable:
|
| 132 |
+
LOW_VRAM=1 python app.py
|
| 133 |
+
```
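A flag that can also be enabled through an environment variable is commonly wired up as below. This is a minimal sketch, not the code in `app.py`: the function name is hypothetical, and treating only the value `1` as "on" is an assumption based on the example above.

```python
import argparse
import os


def parse_low_vram(argv, environ) -> bool:
    """Return True if --low_vram is passed or LOW_VRAM=1 is set."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--low_vram", action="store_true")
    # parse_known_args ignores any other options the real app may define.
    args, _unknown = parser.parse_known_args(argv)
    return args.low_vram or environ.get("LOW_VRAM", "") == "1"


low_vram = parse_low_vram([], os.environ)
# Frontend default stated above: 1024 in low-VRAM mode, 1536 otherwise.
print("default UI resolution:", 1024 if low_vram else 1536)
```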
|
| 134 |
+
## 🔧 Training
|
| 135 |
+
|
| 136 |
+
We provide the full training codebase for reproducing Pixal3D from scratch.
|
| 137 |
+
|
| 138 |
+
### Data Preparation
|
| 139 |
+
|
| 140 |
+
Prepare view-aligned O-Voxel data and rendered condition images by following the data toolkit instructions:
|
| 141 |
+
|
| 142 |
+
> **[data_toolkit/README.md](data_toolkit/README.md)**
|
| 143 |
+
|
| 144 |
+
### Overview
|
| 145 |
+
|
| 146 |
+
Pixal3D is trained as a three-stage cascade, with each stage progressively increasing resolution:
|
| 147 |
+
|
| 148 |
+
| Stage | Model | Resolutions | Config Prefix |
|
| 149 |
+
|-------|-------|-------------|---------------|
|
| 150 |
+
| 1 | Sparse Structure | 32 → 64 | `ss_flow_img_dit_*_proj_finetune` |
|
| 151 |
+
| 2 | Shape | 256 → 512 → 1024 | `slat_flow_img2shape_*_proj_finetune` |
|
| 152 |
+
| 3 | Texture | 256 → 512 → 1024 | `slat_flow_imgshape2tex_*_proj_finetune` |
|
| 153 |
+
|
| 154 |
+
All stages use **pixel-aligned projection conditioning** and **view-aligned latents** (2 views by default). Within each stage, start from the lowest resolution and progressively fine-tune to higher resolutions by setting `finetune_ckpt` in the config.
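The order of training runs implied by the table can be enumerated as follows. This is only a planning aid under the schedule stated above; it does not read or write any config, and the names are hypothetical.

```python
# Stage name and its resolution ladder, copied from the table above.
CASCADE = [
    ("Sparse Structure", [32, 64]),
    ("Shape", [256, 512, 1024]),
    ("Texture", [256, 512, 1024]),
]


def training_runs(cascade=CASCADE):
    """Return (stage, resolution, previous_resolution) tuples in training order.

    previous_resolution is None for the first run of a stage; otherwise it is
    the resolution whose checkpoint `finetune_ckpt` should point to.
    """
    runs = []
    for stage, resolutions in cascade:
        previous = None
        for resolution in resolutions:
            runs.append((stage, resolution, previous))
            previous = resolution
    return runs


for stage, resolution, previous in training_runs():
    if previous is None:
        print(f"{stage} @ {resolution}: first resolution of this stage")
    else:
        print(f"{stage} @ {resolution}: set finetune_ckpt to the {previous} checkpoint")
```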
|
| 155 |
+
|
| 156 |
+
### Quick Start
|
| 157 |
+
|
| 158 |
+
```sh
|
| 159 |
+
python train.py \
|
| 160 |
+
--config <CONFIG_JSON> \
|
| 161 |
+
--output_dir <OUTPUT_DIR> \
|
| 162 |
+
--data_dir '<DATA_DIR_JSON>'
|
| 163 |
+
```
|
| 164 |
+
|
| 165 |
+
`--data_dir` is a JSON string describing the dataset layout. Different stages require different keys:
|
| 166 |
+
|
| 167 |
+
| Stage | Required keys |
|
| 168 |
+
|-------|---------------|
|
| 169 |
+
| Sparse Structure | `base`, `ss_latent`, `render_cond` |
|
| 170 |
+
| Shape | `base`, `shape_latent`, `render_cond` |
|
| 171 |
+
| Texture | `base`, `shape_latent`, `pbr_latent`, `render_cond` |
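Writing the `--data_dir` JSON by hand is error-prone, so it can help to generate it. The sketch below assumes the nested layout used in this README's training examples (dataset name mapped to a dictionary of paths) and checks the required keys from the table above; the helper name and stage identifiers are hypothetical.

```python
import json
import shlex

# Required keys per stage, copied from the table above.
REQUIRED_KEYS = {
    "sparse_structure": ("base", "ss_latent", "render_cond"),
    "shape": ("base", "shape_latent", "render_cond"),
    "texture": ("base", "shape_latent", "pbr_latent", "render_cond"),
}


def build_data_dir(stage: str, dataset_name: str, paths: dict) -> str:
    """Return the JSON string for --data_dir, failing early on missing keys."""
    required = REQUIRED_KEYS[stage]
    missing = [key for key in required if key not in paths]
    if missing:
        raise ValueError(f"stage {stage!r} is missing keys: {missing}")
    return json.dumps({dataset_name: {key: paths[key] for key in required}})


base = "datasets/ObjaverseXL_sketchfab"
data_dir = build_data_dir(
    "sparse_structure",
    "ObjaverseXL_sketchfab",
    {
        "base": base,
        "ss_latent": f"{base}/ss_latents/ss_enc_conv3d_16l8_fp16_64_view",
        "render_cond": f"{base}/renders_cond",
    },
)
# shlex.quote wraps the JSON in single quotes so it survives the shell intact.
print("--data_dir", shlex.quote(data_dir))
```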
|
| 172 |
+
|
| 173 |
+
### Example: Training All Three Stages
|
| 174 |
+
|
| 175 |
+
Below we show the full training sequence using ObjaverseXL as an example. Each higher-resolution step requires updating `finetune_ckpt` in its config JSON to point to the previous checkpoint.
|
| 176 |
+
|
| 177 |
+
<details>
|
| 178 |
+
<summary><b>Stage 1: Sparse Structure (32 → 64)</b></summary>
|
| 179 |
+
|
| 180 |
+
```sh
|
| 181 |
+
# Resolution 32
|
| 182 |
+
python train.py \
|
| 183 |
+
--config configs/gen/ss_flow_img_dit_1_3B_32_bf16_proj_finetune.json \
|
| 184 |
+
--output_dir results/ss_32 \
|
| 185 |
+
--data_dir '{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab", "ss_latent": "datasets/ObjaverseXL_sketchfab/ss_latents/ss_enc_conv3d_16l8_fp16_64_view", "render_cond": "datasets/ObjaverseXL_sketchfab/renders_cond"}}'
|
| 186 |
+
|
| 187 |
+
# Resolution 64 (set finetune_ckpt → results/ss_32 checkpoint)
|
| 188 |
+
python train.py \
|
| 189 |
+
--config configs/gen/ss_flow_img_dit_1_3B_32_bf16_proj_finetune_ft64.json \
|
| 190 |
+
--output_dir results/ss_ft64 \
|
| 191 |
+
--data_dir '{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab", "ss_latent": "datasets/ObjaverseXL_sketchfab/ss_latents/ss_enc_conv3d_16l8_fp16_64_view", "render_cond": "datasets/ObjaverseXL_sketchfab/renders_cond"}}'
|
| 192 |
+
```
|
| 193 |
+
</details>
|
| 194 |
+
|
| 195 |
+
<details>
|
| 196 |
+
<summary><b>Stage 2: Shape (256 → 512 → 1024)</b></summary>
|
| 197 |
+
|
| 198 |
+
```sh
|
| 199 |
+
# Resolution 256
|
| 200 |
+
python train.py \
|
| 201 |
+
--config configs/gen/slat_flow_img2shape_dit_1_3B_256_bf16_proj_finetune.json \
|
| 202 |
+
--output_dir results/shape_256 \
|
| 203 |
+
--data_dir '{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab", "shape_latent": "datasets/ObjaverseXL_sketchfab/shape_latents/shape_enc_next_dc_f16c32_fp16_256_view", "render_cond": "datasets/ObjaverseXL_sketchfab/renders_cond"}}'
|
| 204 |
+
|
| 205 |
+
# Resolution 512
|
| 206 |
+
python train.py \
|
| 207 |
+
--config configs/gen/slat_flow_img2shape_dit_1_3B_256_bf16_proj_finetune_ft512.json \
|
| 208 |
+
--output_dir results/shape_ft512 \
|
| 209 |
+
--data_dir '{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab", "shape_latent": "datasets/ObjaverseXL_sketchfab/shape_latents/shape_enc_next_dc_f16c32_fp16_512_view", "render_cond": "datasets/ObjaverseXL_sketchfab/renders_cond"}}'
|
| 210 |
+
|
| 211 |
+
# Resolution 1024
|
| 212 |
+
python train.py \
|
| 213 |
+
--config configs/gen/slat_flow_img2shape_dit_1_3B_512_bf16_proj_finetune_ft1024.json \
|
| 214 |
+
--output_dir results/shape_ft1024 \
|
| 215 |
+
--data_dir '{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab", "shape_latent": "datasets/ObjaverseXL_sketchfab/shape_latents/shape_enc_next_dc_f16c32_fp16_1024_view", "render_cond": "datasets/ObjaverseXL_sketchfab/renders_cond"}}'
|
| 216 |
+
```
|
| 217 |
+
</details>
|
| 218 |
+
|
| 219 |
+
<details>
|
| 220 |
+
<summary><b>Stage 3: Texture (256 → 512 → 1024)</b></summary>
|
| 221 |
+
|
| 222 |
+
```sh
|
| 223 |
+
# Resolution 256
|
| 224 |
+
python train.py \
|
| 225 |
+
--config configs/gen/slat_flow_imgshape2tex_dit_1_3B_256_bf16_proj_finetune.json \
|
| 226 |
+
--output_dir results/tex_256 \
|
| 227 |
+
--data_dir '{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab", "shape_latent": "datasets/ObjaverseXL_sketchfab/shape_latents/shape_enc_next_dc_f16c32_fp16_256_view", "pbr_latent": "datasets/ObjaverseXL_sketchfab/pbr_latents/tex_enc_next_dc_f16c32_fp16_256_view", "render_cond": "datasets/ObjaverseXL_sketchfab/renders_cond"}}'
|
| 228 |
+
|
| 229 |
+
# Resolution 512
|
| 230 |
+
python train.py \
|
| 231 |
+
--config configs/gen/slat_flow_imgshape2tex_dit_1_3B_512_bf16_proj_finetune.json \
|
| 232 |
+
--output_dir results/tex_512 \
|
| 233 |
+
--data_dir '{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab", "shape_latent": "datasets/ObjaverseXL_sketchfab/shape_latents/shape_enc_next_dc_f16c32_fp16_512_view", "pbr_latent": "datasets/ObjaverseXL_sketchfab/pbr_latents/tex_enc_next_dc_f16c32_fp16_512_view", "render_cond": "datasets/ObjaverseXL_sketchfab/renders_cond"}}'
|
| 234 |
+
|
| 235 |
+
# Resolution 1024
|
| 236 |
+
python train.py \
|
| 237 |
+
--config configs/gen/slat_flow_imgshape2tex_dit_1_3B_512_bf16_proj_finetune_ft1024.json \
|
| 238 |
+
--output_dir results/tex_ft1024 \
|
| 239 |
+
--data_dir '{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab", "shape_latent": "datasets/ObjaverseXL_sketchfab/shape_latents/shape_enc_next_dc_f16c32_fp16_1024_view", "pbr_latent": "datasets/ObjaverseXL_sketchfab/pbr_latents/tex_enc_next_dc_f16c32_fp16_1024_view", "render_cond": "datasets/ObjaverseXL_sketchfab/renders_cond"}}'
|
| 240 |
+
```
|
| 241 |
+
</details>
|
| 242 |
+
|
| 243 |
+
### Additional Options
|
| 244 |
+
|
| 245 |
+
<details>
|
| 246 |
+
<summary><b>All command-line arguments</b></summary>
|
| 247 |
+
|
| 248 |
+
| Argument | Description | Default |
|
| 249 |
+
|----------|-------------|---------|
|
| 250 |
+
| `--config` | Config JSON path | *required* |
|
| 251 |
+
| `--output_dir` | Output directory | *required* |
|
| 252 |
+
| `--data_dir` | Dataset JSON string | `./data/` |
|
| 253 |
+
| `--load_dir` | Checkpoint load directory | `output_dir` |
|
| 254 |
+
| `--ckpt` | Resume from step | `latest` |
|
| 255 |
+
| `--auto_retry` | Retries on failure | `3` |
|
| 256 |
+
| `--tryrun` | Dry run | `false` |
|
| 257 |
+
| `--profile` | Profiling | `false` |
|
| 258 |
+
| `--num_nodes` | Number of nodes | `1` |
|
| 259 |
+
| `--node_rank` | Current node rank | `0` |
|
| 260 |
+
| `--num_gpus` | GPUs per node | all |
|
| 261 |
+
| `--master_addr` | Master address | `localhost` |
|
| 262 |
+
| `--master_port` | Master port | `12666` |
|
| 263 |
+
| `--use_wandb` | Enable W&B logging | `false` |
|
| 264 |
+
| `--wandb_project` | W&B project | `trellis2-training` |
|
| 265 |
+
| `--wandb_name` | W&B run name | basename of `output_dir` |
|
| 266 |
+
| `--wandb_id` | W&B run ID (resume) | *none* |
|
| 267 |
+
|
| 268 |
+
</details>
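For multi-node training, the distributed arguments in the table can be combined into one command per node. The sketch below assumes the usual convention that every node runs the same command with its own `--node_rank`; that convention, the helper name, and the example address are assumptions, not something this README confirms.

```python
import shlex


def node_commands(config, output_dir, data_dir, num_nodes, master_addr, master_port=12666):
    """Return one shell command per node, differing only in --node_rank."""
    commands = []
    for node_rank in range(num_nodes):
        argv = [
            "python", "train.py",
            "--config", config,
            "--output_dir", output_dir,
            "--data_dir", data_dir,
            "--num_nodes", str(num_nodes),
            "--node_rank", str(node_rank),
            "--master_addr", master_addr,
            "--master_port", str(master_port),
        ]
        # shlex.join quotes the JSON argument so it can be pasted into a shell.
        commands.append(shlex.join(argv))
    return commands


for command in node_commands(
    config="configs/gen/ss_flow_img_dit_1_3B_32_bf16_proj_finetune.json",
    output_dir="results/ss_32",
    data_dir='{"ObjaverseXL_sketchfab": {"base": "datasets/ObjaverseXL_sketchfab"}}',  # shortened for illustration
    num_nodes=2,
    master_addr="10.0.0.1",  # hypothetical address of the rank-0 node
):
    print(command)
```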
|
| 269 |
+
|
| 270 |
+
## Community Projects
|
| 271 |
+
|
| 272 |
+
We thank the community for building extensions and deployment guides for Pixal3D!
|
| 273 |
+
|
| 274 |
+
- [Pixal3D-ComfyUI](https://github.com/Saganaki22/Pixal3D-ComfyUI): ComfyUI integration with deployment guides for Windows, WSL, and more.
|
| 275 |
+
|
| 276 |
## Acknowledgements
|
| 277 |
|
| 278 |
+
This project is built heavily on [Trellis.2](https://github.com/microsoft/TRELLIS.2) and [Direct3D-S2](https://github.com/DreamTechAI/Direct3D-S2). We sincerely thank the authors for their outstanding work on scalable 3D generation, which serves as the foundation of our codebase and model architecture.
|
| 279 |
+
|
| 280 |
+
We also thank the following repos for their great contributions:
|
| 281 |
+
|
| 282 |
+
- [Direct3D-S2](https://github.com/DreamTechAI/Direct3D-S2)
|
| 283 |
+
- [Trellis](https://github.com/microsoft/TRELLIS)
|
| 284 |
+
- [Trellis.2](https://github.com/microsoft/TRELLIS.2)
|
| 285 |
|
| 286 |
## Citation
|
| 287 |
|
|
|
|
| 289 |
|
| 290 |
```bibtex
|
| 291 |
@article{li2026pixal3d,
|
| 292 |
+
title={Pixal3D: Pixel-Aligned 3D Generation from Images},
|
| 293 |
+
author={Li, Dong-Yang and Zhao, Wang and Chen, Yuxin and Hu, Wenbo and Guo, Meng-Hao and Zhang, Fang-Lue and Shan, Ying and Hu, Shi-Min},
|
| 294 |
+
journal={arXiv preprint arXiv:2605.10922},
|
| 295 |
+
year={2026}
|
| 296 |
}
|
| 297 |
+
```
|
| 298 |
+
|
| 299 |
+
## License
|
| 300 |
+
|
| 301 |
+
This project is released under the [MIT License](LICENSE). The third-party components included in this project remain licensed under their respective original terms; see [NOTICE](NOTICE) for the full list of dependencies and their licenses.
|