---
license: apache-2.0
---

# Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer (NeurIPS 2024)

## ✨ News

- Feb 11, 2025: 🔨 We are working on the Gradio demo and will release it soon!
- Feb 11, 2025: 🎁 Enjoy our improved version of Direct3D, with high-quality geometry and texture, at [https://www.neural4d.com](https://www.neural4d.com/).
- Feb 11, 2025: 🚀 Released the inference code of Direct3D; the pretrained models are available at 🤗 [Hugging Face](https://huggingface.co/DreamTechAI/Direct3D/tree/main).

## 📝 Abstract

We introduce **Direct3D**, a native 3D generative model scalable to in-the-wild input images, without requiring a multiview diffusion model or SDS optimization. Our approach comprises two primary components: a Direct 3D Variational Auto-Encoder **(D3D-VAE)** and a Direct 3D Diffusion Transformer **(D3D-DiT)**. D3D-VAE efficiently encodes high-resolution 3D shapes into a compact and continuous latent triplane space. Notably, our method directly supervises the decoded geometry using a semi-continuous surface sampling strategy, diverging from previous methods relying on rendered images as supervision signals. D3D-DiT models the distribution of encoded 3D latents and is specifically designed to fuse positional information from the three feature maps of the triplane latent, enabling a native 3D generative model scalable to large-scale 3D datasets. Additionally, we introduce an innovative image-to-3D generation pipeline incorporating semantic and pixel-level image conditions, allowing the model to produce 3D shapes consistent with the provided conditional image input. Extensive experiments demonstrate the superiority of our large-scale pre-trained Direct3D over previous image-to-3D approaches, achieving significantly better generation quality and generalization ability, thus establishing a new state-of-the-art for 3D content creation.
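The triplane latent mentioned above can be illustrated with a toy sketch: a 3D point is featurized by sampling three axis-aligned 2D feature maps. All shapes, names, and the nearest-neighbor sampling below are illustrative simplifications, not the paper's actual D3D-VAE code (real models typically use bilinear interpolation and learned planes):

```python
import numpy as np

# Toy triplane: three axis-aligned feature planes (XY, XZ, YZ).
# Resolution and channel count are made-up values for illustration.
R, C = 32, 8
rng = np.random.default_rng(0)
planes = {k: rng.standard_normal((R, R, C)) for k in ("xy", "xz", "yz")}

def query_triplane(planes, p):
    """Return the feature for a point p in [0, 1]^3 by sampling each of the
    three planes at its projected 2D coordinates and summing the results
    (nearest-neighbor sampling for simplicity)."""
    x, y, z = np.clip((np.asarray(p) * R).astype(int), 0, R - 1)
    return planes["xy"][x, y] + planes["xz"][x, z] + planes["yz"][y, z]

feat = query_triplane(planes, (0.25, 0.5, 0.75))
print(feat.shape)  # (8,): one feature vector per queried 3D point
```

Fusing the three projections is what gives the triplane its compactness: O(3R²C) storage instead of O(R³C) for a dense 3D feature grid.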

## 🚀 Getting Started

### Installation

```sh
git clone https://github.com/DreamTechAI/Direct3D.git
cd Direct3D
pip install -r requirements.txt
pip install -e .
```

### Usage

```python
from direct3d.pipeline import Direct3dPipeline

pipeline = Direct3dPipeline.from_pretrained("DreamTechAI/Direct3D")
pipeline.to("cuda")
mesh = pipeline(
    "assets/devil.png",
    remove_background=False,  # set to True if the image background needs to be removed
    mc_threshold=-1.0,
    guidance_scale=4.0,
    num_inference_steps=50,
)["meshes"][0]
mesh.export("output.obj")
```
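The pipeline exports a Wavefront OBJ file. As a quick stdlib-only sanity check on the exported mesh, you can count its vertex and face records (`obj_stats` is a hypothetical helper written for this README, not part of Direct3D; the demo uses a minimal single-triangle OBJ as a stand-in for `output.obj`):

```python
import os
import tempfile

def obj_stats(path):
    """Count vertex ('v') and face ('f') records in a Wavefront OBJ file."""
    verts = faces = 0
    with open(path) as f:
        for line in f:
            if line.startswith("v "):
                verts += 1
            elif line.startswith("f "):
                faces += 1
    return verts, faces

# Demo on a minimal single-triangle OBJ (stand-in for "output.obj"):
with tempfile.NamedTemporaryFile("w", suffix=".obj", delete=False) as tmp:
    tmp.write("v 0 0 0\nv 1 0 0\nv 0 1 0\nf 1 2 3\n")
result = obj_stats(tmp.name)
os.unlink(tmp.name)
print(result)  # (3, 1)
```

A mesh with zero vertices or faces usually means generation or export failed silently, so this is a cheap check before loading the file into a viewer.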

## 🤗 Acknowledgements

Thanks to the following repos for their great work, which helped us greatly in the development of Direct3D:

- [3DShape2VecSet](https://github.com/1zb/3DShape2VecSet/tree/master)
- [Michelangelo](https://github.com/NeuralCarver/Michelangelo)
- [Objaverse](https://objaverse.allenai.org/)
- [diffusers](https://github.com/huggingface/diffusers)

## 📖 Citation

If you find our work useful, please consider citing our paper:

```bibtex
@article{direct3d,
  title={Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer},
  author={Wu, Shuang and Lin, Youtian and Zhang, Feihu and Zeng, Yifei and Xu, Jingxi and Torr, Philip and Cao, Xun and Yao, Yao},
  journal={arXiv preprint arXiv:2405.14832},
  year={2024}
}
```

---