WasabiOctopus committed on
Commit 76bfabb · verified · 1 Parent(s): eea05f1

Update README.md

Files changed (1)
  1. README.md +68 -83
README.md CHANGED
@@ -1,10 +1,10 @@
1
  ---
2
-
3
  license: mit
4
  pipeline_tag: image-to-3d
5
  library_name: diffusers
6
- tags: ["image-to-3d", "text-to-3d", "3d-generation", "3d-gaussian-splatting", "gaussian-splatting", "multi-view-diffusion", "diffusers", "safetensors", "research"]
7
-
 
8
  ---
9
 
10
  <div align="center">
@@ -21,68 +21,67 @@ tags: ["image-to-3d", "text-to-3d", "3d-generation", "3d-gaussian-splatting", "g
21
  <img src="https://img.shields.io/badge/License-MIT-green">
22
  </p>
23
 
24
- **A clean Diffusers-ready LGM release for fast 3D content creation from text or a single image.**
25
 
26
  </div>
27
 
28
- ---
29
-
30
  ## ✨ Highlights
31
 
32
- * 🚀 Fast 3D asset generation powered by the LGM pipeline.
33
- * 🧊 3D Gaussian Splatting representation for efficient 3D content creation.
34
- * 🖼️ Supports text-to-3D and image-to-3D workflows.
35
- * 🧩 Diffusers-compatible model structure.
36
- * 🔬 Useful for 3D generation research, creative prototyping, and rapid experimentation.
37
-
38
- ---
39
 
40
  ## 🖼️ Gallery
41
 
42
- Generated examples will be added soon.
43
-
44
- | Prompt / Input | Generated 3D Asset |
45
- | ----------------------------------------------------- | ------------------ |
46
- | `a cute robot, smooth toy material, studio lighting` | Coming soon |
47
- | `a fantasy treasure chest with golden details` | Coming soon |
48
- | `a stylized sci-fi helmet, clean hard-surface design` | Coming soon |
49
 
50
- ---
 
 
 
 
51
 
52
  ## 🧠 What is LGM?
53
 
54
- **LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework for high-resolution 3D content creation.
55
 
56
- Instead of directly generating a mesh from scratch, the pipeline first produces multi-view visual information and then reconstructs a 3D Gaussian representation. This makes it suitable for fast feed-forward 3D asset generation from either a text prompt or a single input image.
57
 
58
  This repository provides a convenient Hugging Face / Diffusers-style release of the full LGM pipeline.
59
 
60
- ---
61
-
62
  ## 🏗️ Pipeline Overview
63
 
 
64
  Text prompt or single image
65
- → Multi-view diffusion generation
66
- Multi-view Gaussian features
67
- → LGM reconstruction module
68
- 3D Gaussian asset
69
- → PLY export / downstream rendering
70
-
71
- ---
 
 
 
 
72
 
73
  ## 🚀 Quick Start
74
 
75
  ### 1. Install dependencies
76
 
77
- ```
78
  pip install -U diffusers transformers accelerate safetensors
79
  pip install torch torchvision torchaudio
80
  pip install xformers trimesh kiui plyfile
81
  ```
82
 
 
 
83
  ### 2. Load the pipeline
84
 
85
- ```
86
  import torch
87
  from diffusers import DiffusionPipeline
88
 
@@ -99,7 +98,7 @@ pipe = pipe.to("cuda")
99
 
100
  ### 3. Text-to-3D generation
101
 
102
- ```
103
  prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"
104
 
105
  gaussians = pipe(
@@ -113,7 +112,7 @@ pipe.save_ply(gaussians, "robot.ply")
113
 
114
  ### 4. Image-to-3D generation
115
 
116
- ```
117
  import numpy as np
118
  from PIL import Image
119
 
@@ -130,11 +129,9 @@ gaussians = pipe(
130
  pipe.save_ply(gaussians, "asset_from_image.ply")
131
  ```
132
 
133
- ---
134
-
135
  ## 📦 Repository Contents
136
 
137
- ```
138
  WasabiOctopus/LGM
139
  ├── README.md
140
  ├── model_index.json
@@ -150,84 +147,74 @@ WasabiOctopus/LGM
150
  └── lgm/
151
  ```
152
 
153
- ---
154
-
155
  ## 💡 Recommended Use Cases
156
 
157
  This model release is useful for:
158
 
159
- * Fast single-image-to-3D prototyping
160
- * Text-to-3D creative asset generation
161
- * 3D generation course projects
162
- * Research demos around 3D Gaussian Splatting
163
- * Benchmarking recent 3D asset generation pipelines
164
- * Building lightweight demos for Blender, Unity, or web-based 3D viewers
165
-
166
- ---
167
 
168
  ## ⚠️ Limitations
169
 
170
  This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:
171
 
172
- * Thin structures, transparent objects, wires, fur, or complex topology
173
- * Highly reflective or texture-heavy objects
174
- * Ambiguous single-view inputs where the back side is not visible
175
- * Prompt-only generation requiring precise physical dimensions
176
- * Production workflows requiring clean quad meshes, rigging, or CAD-level topology
177
 
178
  For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.
179
 
180
- ---
181
-
182
  ## 🧪 Tips for Better Results
183
 
184
  Good prompts usually describe:
185
 
186
- **object category + style + material + lighting + geometry constraint**
 
 
187
 
188
  Examples:
189
 
190
- * `a cute robot, rounded toy design, smooth plastic material, studio lighting`
191
- * `a medieval treasure chest, golden metal details, wooden texture, clean geometry`
192
- * `a sci-fi helmet, hard-surface design, matte black material, sharp edges`
193
- * `a tiny house, stylized low-poly, warm colors, isometric game asset`
 
 
194
 
195
  For image-to-3D, use images with:
196
 
197
- * A single centered object
198
- * Clean background
199
- * Clear object silhouette
200
- * Minimal occlusion
201
- * Good lighting
202
-
203
- ---
204
 
205
  ## 🔗 Related Links
206
 
207
- * Original paper: https://arxiv.org/abs/2402.05054
208
- * Original project page: https://me.kiui.moe/lgm/
209
- * Original GitHub repository: https://github.com/3DTopia/LGM
210
- * Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full
211
-
212
- ---
213
 
214
  ## 🙏 Acknowledgements
215
 
216
- This repository is based on the LGM ecosystem and the upstream Hugging Face full pipeline release.
217
-
218
- Full credit for the original LGM method goes to the authors of:
219
 
220
  **LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**
221
 
222
  This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.
223
 
224
- ---
225
-
226
  ## 📚 Citation
227
 
228
  If you use this model or the original LGM method, please cite:
229
 
230
- ```
231
  @article{tang2024lgm,
232
  title={LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation},
233
  author={Tang, Jiaxiang and Chen, Zhaoxi and Chen, Xiaokang and Wang, Tengfei and Zeng, Gang and Liu, Ziwei},
@@ -236,12 +223,10 @@ If you use this model or the original LGM method, please cite:
236
  }
237
  ```
238
 
239
- ---
240
-
241
  <div align="center">
242
 
243
  ### 🐙 Built for fast 3D generation experiments.
244
 
245
  **From prompt or image to 3D Gaussian assets — clean, simple, and research-friendly.**
246
 
247
- </div>
 
1
  ---
 
2
  license: mit
3
  pipeline_tag: image-to-3d
4
  library_name: diffusers
5
+ base_model: "dylanebert/LGM-full"
6
+ tags: ["image-to-3d", "text-to-3d", "3d-generation", "3d-gaussian-splatting", "gaussian-splatting", "multi-view-diffusion", "lgm", "diffusers", "safetensors", "objaverse", "research", "computer-graphics"]
7
+ arxiv: "2402.05054"
8
  ---
9
 
10
  <div align="center">
 
21
  <img src="https://img.shields.io/badge/License-MIT-green">
22
  </p>
23
 
24
+ **A Diffusers-ready LGM pipeline for fast 3D content creation from text or a single image.**
25
 
26
  </div>
27
 
 
 
28
  ## ✨ Highlights
29
 
30
+ - 🚀 **Fast 3D asset generation** powered by the LGM pipeline.
31
+ - 🧊 **3D Gaussian Splatting representation** for efficient high-resolution 3D content.
32
+ - 🖼️ **Text-to-3D and image-to-3D workflows** through multi-view diffusion.
33
+ - 🧩 **Diffusers-compatible model structure** with `LGMFullPipeline`.
34
+ - 🔬 Useful for **3D generation research, creative prototyping, course projects, and rapid experimentation**.
 
 
35
 
36
  ## 🖼️ Gallery
37
 
38
+ > Upload your own generated examples to an `assets/` folder and replace the placeholders below.
 
 
 
 
 
 
39
 
40
+ | Prompt / Input | Generated 3D Asset |
41
+ |---|---|
42
+ | `a cute robot, smooth toy material, studio lighting` | Coming soon |
43
+ | `a fantasy treasure chest with golden details` | Coming soon |
44
+ | `a stylized sci-fi helmet, clean hard-surface design` | Coming soon |
45
 
46
  ## 🧠 What is LGM?
47
 
48
+ **LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework designed for high-resolution 3D content creation.
49
 
50
+ Instead of directly generating a mesh from scratch, the pipeline first produces multi-view visual information and then reconstructs a 3D Gaussian representation. This makes it suitable for fast, feed-forward 3D asset generation from either a text prompt or a single input image.
51
 
52
  This repository provides a convenient Hugging Face / Diffusers-style release of the full LGM pipeline.
53
 
 
 
54
  ## 🏗️ Pipeline Overview
55
 
56
+ ```text
57
  Text prompt or single image
58
+
59
+ Multi-view diffusion generation
60
+
61
+ Multi-view Gaussian features
62
+
63
+ LGM reconstruction module
64
+
65
+ 3D Gaussian asset
66
+
67
+ PLY export / downstream rendering
68
+ ```
69
 
70
  ## 🚀 Quick Start
71
 
72
  ### 1. Install dependencies
73
 
74
+ ```bash
75
  pip install -U diffusers transformers accelerate safetensors
76
  pip install torch torchvision torchaudio
77
  pip install xformers trimesh kiui plyfile
78
  ```
79
 
80
+ For the full environment, check the repository `requirements.txt`.
81
+
82
  ### 2. Load the pipeline
83
 
84
+ ```python
85
  import torch
86
  from diffusers import DiffusionPipeline
87
 
 
98
 
99
  ### 3. Text-to-3D generation
100
 
101
+ ```python
102
  prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"
103
 
104
  gaussians = pipe(
 
112
 
113
  ### 4. Image-to-3D generation
114
 
115
+ ```python
116
  import numpy as np
117
  from PIL import Image
118
 
 
129
  pipe.save_ply(gaussians, "asset_from_image.ply")
130
  ```
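
After exporting, it can be useful to sanity-check the saved file before loading it into a viewer. The sketch below reads only the PLY header using the standard library. It assumes the file written by `pipe.save_ply` is a standard PLY with a `vertex` element holding one entry per Gaussian, which is a common layout for Gaussian-splat exports but is not confirmed by this README. The helper name `read_ply_header` and the demo property names are illustrative.

```python
import os
import struct
import tempfile


def read_ply_header(path):
    """Return (format, {element: count}, vertex property names) from a PLY header."""
    fmt = None
    elements = {}
    vertex_props = []
    current = None
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            parts = raw.decode("ascii").split()
            if not parts:
                continue
            if parts[0] == "end_header":
                break  # stop before the (possibly binary) payload
            if parts[0] == "format":
                fmt = parts[1]
            elif parts[0] == "element":
                current = parts[1]
                elements[current] = int(parts[2])
            elif parts[0] == "property" and current == "vertex":
                vertex_props.append(parts[-1])
    return fmt, elements, vertex_props


# Demo on a tiny hand-written PLY (2 vertices, 4 float properties each).
# Replace demo_path with e.g. "asset_from_image.ply" to inspect a real export.
header = (
    "ply\nformat binary_little_endian 1.0\nelement vertex 2\n"
    "property float x\nproperty float y\nproperty float z\n"
    "property float opacity\nend_header\n"
)
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.ply")
    with open(demo_path, "wb") as f:
        f.write(header.encode("ascii"))
        f.write(struct.pack("<8f", *[float(i) for i in range(8)]))
    fmt, elements, props = read_ply_header(demo_path)

print(fmt, elements, props)
```

If the assumption holds, `elements["vertex"]` gives the number of Gaussians in the asset, and the property list shows which per-Gaussian attributes were stored.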
131
 
 
 
132
  ## 📦 Repository Contents
133
 
134
+ ```text
135
  WasabiOctopus/LGM
136
  ├── README.md
137
  ├── model_index.json
 
147
  └── lgm/
148
  ```
149
 
 
 
150
  ## 💡 Recommended Use Cases
151
 
152
  This model release is useful for:
153
 
154
+ - Fast **single-image-to-3D** prototyping
155
+ - **Text-to-3D** creative asset generation
156
+ - 3D generation course projects
157
+ - Research demos around 3D Gaussian Splatting
158
+ - Benchmarking recent 3D asset generation pipelines
159
+ - Building lightweight demos for Blender, Unity, or web-based 3D viewers
 
 
160
 
161
  ## ⚠️ Limitations
162
 
163
  This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:
164
 
165
+ - Thin structures, transparent objects, wires, fur, or complex topology
166
+ - Highly reflective or texture-heavy objects
167
+ - Ambiguous single-view inputs where the back side is not visible
168
+ - Prompt-only generation requiring precise physical dimensions
169
+ - Production workflows requiring clean quad meshes, rigging, or CAD-level topology
170
 
171
  For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.
172
 
 
 
173
  ## 🧪 Tips for Better Results
174
 
175
  Good prompts usually describe:
176
 
177
+ ```text
178
+ object category + style + material + lighting + geometry constraint
179
+ ```
180
 
181
  Examples:
182
 
183
+ ```text
184
+ a cute robot, rounded toy design, smooth plastic material, studio lighting
185
+ a medieval treasure chest, golden metal details, wooden texture, clean geometry
186
+ a sci-fi helmet, hard-surface design, matte black material, sharp edges
187
+ a tiny house, stylized low-poly, warm colors, isometric game asset
188
+ ```
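
The prompt pattern above can be assembled programmatically when generating many assets. This is a minimal sketch; `build_prompt` is a hypothetical helper, not part of this repository or of Diffusers, and it simply joins whichever parts are provided.

```python
def build_prompt(category, style=None, material=None, lighting=None, geometry=None):
    """Join the non-empty prompt parts into one comma-separated string."""
    parts = [category, style, material, lighting, geometry]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "a cute robot",
    style="rounded toy design",
    material="smooth plastic material",
    lighting="studio lighting",
)
print(prompt)
# a cute robot, rounded toy design, smooth plastic material, studio lighting
```

Keeping the parts separate makes it easy to vary one attribute at a time (for example, only the material) and compare the resulting assets.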
189
 
190
  For image-to-3D, use images with:
191
 
192
+ - A single centered object
193
+ - Clean background
194
+ - Clear object silhouette
195
+ - Minimal occlusion
196
+ - Good lighting
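
One way to meet the "single centered object" and "clean background" points is to place the cropped object on a square canvas with a uniform margin. The sketch below only computes the layout, using plain arithmetic. The function name `square_canvas_layout` and the default 10% margin are assumptions chosen for illustration, not values taken from LGM.

```python
def square_canvas_layout(width, height, margin_ratio=0.1):
    """Return (side, x_offset, y_offset) for centering a width x height crop
    on a square canvas with a margin proportional to its longest edge."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    pad = round(longest * margin_ratio)
    side = longest + 2 * pad
    return side, (side - width) // 2, (side - height) // 2


side, x_offset, y_offset = square_canvas_layout(200, 100)
print(side, x_offset, y_offset)
# 240 20 70
```

With Pillow, the result could then be used to create a blank canvas via `Image.new("RGB", (side, side), "white")` and paste the cropped object at `(x_offset, y_offset)`.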
 
 
197
 
198
  ## 🔗 Related Links
199
 
200
+ - Original paper: https://arxiv.org/abs/2402.05054
201
+ - Original project page: https://me.kiui.moe/lgm/
202
+ - Original GitHub repository: https://github.com/3DTopia/LGM
203
+ - Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full
 
 
204
 
205
  ## 🙏 Acknowledgements
206
 
207
+ This repository is based on the LGM ecosystem and the upstream Hugging Face full pipeline release. Full credit for the original LGM method goes to the authors of:
 
 
208
 
209
  **LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**
210
 
211
  This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.
212
 
 
 
213
  ## 📚 Citation
214
 
215
  If you use this model or the original LGM method, please cite:
216
 
217
+ ```bibtex
218
  @article{tang2024lgm,
219
  title={LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation},
220
  author={Tang, Jiaxiang and Chen, Zhaoxi and Chen, Xiaokang and Wang, Tengfei and Zeng, Gang and Liu, Ziwei},
 
223
  }
224
  ```
225
 
 
 
226
  <div align="center">
227
 
228
  ### 🐙 Built for fast 3D generation experiments.
229
 
230
  **From prompt or image to 3D Gaussian assets — clean, simple, and research-friendly.**
231
 
232
+ </div>