---
license: apache-2.0
---

# Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer (NeurIPS 2024)

## ✨ News

- Feb 11, 2025: 🚨 We are working on a Gradio demo and will release it soon!
- Feb 11, 2025: 🔥 Enjoy our improved version of Direct3D, with high-quality geometry and texture, at [https://www.neural4d.com](https://www.neural4d.com/).
- Feb 11, 2025: 🚀 Released the inference code of Direct3D; the pretrained models are available at 🤗 [Hugging Face](https://huggingface.co/DreamTechAI/Direct3D/tree/main).

## 📋 Abstract

We introduce **Direct3D**, a native 3D generative model scalable to in-the-wild input images, without requiring a multiview diffusion model or SDS optimization. Our approach comprises two primary components: a Direct 3D Variational Auto-Encoder **(D3D-VAE)** and a Direct 3D Diffusion Transformer **(D3D-DiT)**. D3D-VAE efficiently encodes high-resolution 3D shapes into a compact and continuous latent triplane space. Notably, our method directly supervises the decoded geometry using a semi-continuous surface sampling strategy, diverging from previous methods relying on rendered images as supervision signals. D3D-DiT models the distribution of encoded 3D latents and is specifically designed to fuse positional information from the three feature maps of the triplane latent, enabling a native 3D generative model scalable to large-scale 3D datasets. Additionally, we introduce an innovative image-to-3D generation pipeline incorporating semantic and pixel-level image conditions, allowing the model to produce 3D shapes consistent with the provided conditional image input. Extensive experiments demonstrate the superiority of our large-scale pre-trained Direct3D over previous image-to-3D approaches, achieving significantly better generation quality and generalization ability, thus establishing a new state-of-the-art for 3D content creation.
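
The triplane latent above factorizes a 3D volume into three axis-aligned feature planes, so a decoder can query any 3D point by sampling each plane and combining the features. The following is a minimal NumPy sketch of that idea (nearest-neighbor lookup for brevity); it illustrates the general triplane technique, not the actual D3D-VAE implementation, and `query_triplane` and its signature are hypothetical:

```python
import numpy as np

def query_triplane(planes, points, extent=1.0):
    """Query per-point features from a triplane by nearest-neighbor lookup.

    planes: dict with "xy", "xz", "yz" arrays of shape (R, R, C)
    points: (N, 3) coordinates in [-extent, extent]^3
    Returns (N, 3*C) concatenated features. Real decoders typically use
    bilinear sampling and run an MLP on top of the combined features.
    """
    R = planes["xy"].shape[0]
    # Map coordinates from [-extent, extent] to integer grid indices in [0, R-1].
    idx = ((points / extent + 1.0) * 0.5 * (R - 1)).round().astype(int)
    idx = np.clip(idx, 0, R - 1)
    x, y, z = idx[:, 0], idx[:, 1], idx[:, 2]
    # Each plane sees the projection of the 3D point onto its two axes.
    feats = [planes["xy"][x, y], planes["xz"][x, z], planes["yz"][y, z]]
    return np.concatenate(feats, axis=-1)

planes = {k: np.random.rand(64, 64, 8).astype(np.float32) for k in ("xy", "xz", "yz")}
pts = np.random.uniform(-1, 1, size=(5, 3)).astype(np.float32)
print(query_triplane(planes, pts).shape)  # (5, 24)
```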

## 🚀 Getting Started

### Installation

```sh
git clone https://github.com/DreamTechAI/Direct3D.git
cd Direct3D
pip install -r requirements.txt
pip install -e .
```
| 32 |
+
|
| 33 |
+
### Usage
|
| 34 |
+
|
| 35 |
+
```python
|
| 36 |
+
from direct3d.pipeline import Direct3dPipeline
|
| 37 |
+
pipeline = Direct3dPipeline.from_pretrained("DreamTechAI/Direct3D")
|
| 38 |
+
pipeline.to("cuda")
|
| 39 |
+
mesh = pipeline(
|
| 40 |
+
"assets/devil.png",
|
| 41 |
+
remove_background=False, # set to True if the background of the image needs to be removed
|
| 42 |
+
mc_threshold=-1.0,
|
| 43 |
+
guidance_scale=4.0,
|
| 44 |
+
num_inference_steps=50,
|
| 45 |
+
)["meshes"][0]
|
| 46 |
+
mesh.export("output.obj")
|
| 47 |
+
```
|
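
The `guidance_scale` argument above follows the standard classifier-free guidance recipe: at each denoising step, the model's unconditional prediction is extrapolated toward the image-conditional one. Here is a toy sketch of that combination step, shown as an illustration of the general technique rather than Direct3D's internal code (`apply_cfg` is a hypothetical name):

```python
import numpy as np

def apply_cfg(uncond, cond, guidance_scale):
    """Classifier-free guidance: move from the unconditional prediction
    toward (and past, for scale > 1) the conditional prediction."""
    return uncond + guidance_scale * (cond - uncond)

# With scale 1.0 the result is exactly the conditional prediction;
# larger scales push further in the conditional direction.
uncond = np.zeros(4)
cond = np.ones(4)
print(apply_cfg(uncond, cond, 4.0))  # [4. 4. 4. 4.]
```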

## 🤗 Acknowledgements

Thanks to the following repos for their great work, which helped us a lot in the development of Direct3D:

- [3DShape2VecSet](https://github.com/1zb/3DShape2VecSet/tree/master)
- [Michelangelo](https://github.com/NeuralCarver/Michelangelo)
- [Objaverse](https://objaverse.allenai.org/)
- [diffusers](https://github.com/huggingface/diffusers)

## 📖 Citation

If you find our work useful, please consider citing our paper:

```bibtex
@article{direct3d,
  title={Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer},
  author={Wu, Shuang and Lin, Youtian and Zhang, Feihu and Zeng, Yifei and Xu, Jingxi and Torr, Philip and Cao, Xun and Yao, Yao},
  journal={arXiv preprint arXiv:2405.14832},
  year={2024}
}
```

---