WasabiOctopus committed on
Commit d7af4f8 · verified · 1 Parent(s): 52e2add

Update README.md

  1. README.md +243 -6
README.md CHANGED
@@ -1,12 +1,249 @@
  ---
- license: mit
- pipeline_tag: image-to-3d
  ---
 
- # LGM Full
 
- This custom pipeline encapsulates the full [LGM](https://huggingface.co/ashawkey/LGM) pipeline, including [multi-view diffusion](https://huggingface.co/ashawkey/imagedream-ipmv-diffusers).
 
- It is provided as a resource for the [ML for 3D Course](https://huggingface.co/learn/ml-for-3d-course).
 
- Original LGM paper: [LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation](https://huggingface.co/papers/2402.05054).
 
+ <div align="center">
+
+ # 🐙 WasabiOctopus / LGM
+
+ ### Large Multi-View Gaussian Model for Fast 3D Asset Generation
+
+ <p>
+ <img src="https://img.shields.io/badge/Task-Image--to--3D-blueviolet">
+ <img src="https://img.shields.io/badge/Task-Text--to--3D-8A2BE2">
+ <img src="https://img.shields.io/badge/Representation-3D%20Gaussian%20Splatting-orange">
+ <img src="https://img.shields.io/badge/Library-Diffusers-yellow">
+ <img src="https://img.shields.io/badge/License-MIT-green">
+ </p>
+
+ **A Diffusers-ready LGM pipeline for fast 3D content creation from text or a single image.**
+
+ </div>
+
+ ---
+
+ ## ✨ Highlights
+
+ * 🚀 **Fast 3D asset generation** powered by the LGM pipeline.
+ * 🧊 **3D Gaussian Splatting representation** for efficient high-resolution 3D content.
+ * 🖼️ **Text-to-3D and image-to-3D workflows** through multi-view diffusion.
+ * 🧩 **Diffusers-compatible model structure** with `LGMFullPipeline`.
+ * 🔬 Useful for **3D generation research, creative prototyping, course projects, and rapid experimentation**.
+
+ ---
+
+ ## 🖼️ Gallery
+
+ > Upload your own generated examples to an `assets/` folder and replace the placeholders below.
+
+ | Prompt / Input | Generated 3D Asset |
+ | ----------------------------------------------------- | ------------------ |
+ | `a cute robot, smooth toy material, studio lighting` | Coming soon |
+ | `a fantasy treasure chest with golden details` | Coming soon |
+ | `a stylized sci-fi helmet, clean hard-surface design` | Coming soon |
+
+ ---
+
+ ## 🧠 What is LGM?
+
+ **LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework designed for high-resolution 3D content creation.
+
+ Instead of directly generating a mesh from scratch, the pipeline first produces multi-view visual information and then reconstructs a 3D Gaussian representation. This makes it suitable for fast, feed-forward 3D asset generation from either a text prompt or a single input image.
+
+ This repository provides a convenient Hugging Face / Diffusers-style release of the full LGM pipeline.
+
  ---
+
+ ## 🏗️ Pipeline Overview
+
+ ```text
+ Text prompt or single image
+
+ Multi-view diffusion generation
+
+ Multi-view Gaussian features
+
+ LGM reconstruction module
+
+ 3D Gaussian asset
+
+ PLY export / downstream rendering
+ ```
+
+ ---
+
+ ## 🚀 Quick Start
+
+ ### 1. Install dependencies
+
+ ```bash
+ pip install -U diffusers transformers accelerate safetensors
+ pip install torch torchvision torchaudio
+ pip install xformers trimesh kiui plyfile
+ ```
+
+ For the full environment, check the repository `requirements.txt`.
+
+ ### 2. Load the pipeline
+
+ ```python
+ import torch
+ from diffusers import DiffusionPipeline
+
+ repo_id = "WasabiOctopus/LGM"
+
+ pipe = DiffusionPipeline.from_pretrained(
+     repo_id,
+     torch_dtype=torch.float16,
+     trust_remote_code=True,
+ )
+
+ pipe = pipe.to("cuda")
+ ```
+
+ ### 3. Text-to-3D generation
+
+ ```python
+ prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"
+
+ gaussians = pipe(
+     prompt=prompt,
+     num_inference_steps=50,
+     guidance_scale=7.0,
+ )
+
+ pipe.save_ply(gaussians, "robot.ply")
+ ```
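After exporting, it can help to confirm that the file looks like a point-based PLY before loading it into a viewer. The sketch below parses only the text header defined by the PLY format and reports the element counts. It assumes, without confirmation from this repository, that `save_ply` writes a standard PLY header with a `vertex` element; `read_ply_header` is a hypothetical helper name, not part of the pipeline.

```python
import io


def read_ply_header(stream):
    """Parse the text header of a PLY stream opened in binary mode.

    Returns (format_name, {element_name: count}). Only the header is read;
    the binary or ASCII payload that follows is left untouched.
    """
    if stream.readline().strip() != b"ply":
        raise ValueError("not a PLY file: missing 'ply' magic line")

    fmt = None
    elements = {}
    for raw in stream:
        parts = raw.decode("ascii").split()
        if not parts:
            continue
        if parts[0] == "format":
            fmt = parts[1]
        elif parts[0] == "element":
            elements[parts[1]] = int(parts[2])
        elif parts[0] == "end_header":
            break
    return fmt, elements


# Synthetic header used only for demonstration (not real pipeline output).
sample = io.BytesIO(
    b"ply\nformat binary_little_endian 1.0\nelement vertex 3\n"
    b"property float x\nproperty float y\nproperty float z\nend_header\n"
)
fmt, elements = read_ply_header(sample)
print(fmt, elements)
```

For a real export, the same function could be called on `open("robot.ply", "rb")`; if the header assumption does not hold for this pipeline, the function raises or returns an empty element map rather than guessing.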
+
+ ### 4. Image-to-3D generation
+
+ ```python
+ import numpy as np
+ from PIL import Image
+
+ image = Image.open("input.png").convert("RGB").resize((256, 256))
+ image = np.array(image).astype(np.float32) / 255.0
+
+ gaussians = pipe(
+     prompt="",
+     image=image,
+     num_inference_steps=50,
+     guidance_scale=7.0,
+ )
+
+ pipe.save_ply(gaussians, "asset_from_image.ply")
+ ```
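Resizing straight to 256×256 stretches any input that is not already square. One possible workaround, sketched here under the assumption that a plain light background is acceptable for this pipeline, is to pad the image to a centered square first. `pad_to_square` and its `fill` parameter are hypothetical names introduced for illustration.

```python
import numpy as np


def pad_to_square(image, fill=1.0):
    """Pad an H x W x C float image to a centered square filled with `fill`.

    The original pixels are copied unchanged; only the border is new.
    """
    h, w = image.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    canvas = np.full((side, side, image.shape[2]), fill, dtype=image.dtype)
    canvas[top:top + h, left:left + w] = image
    return canvas


# A 2 x 4 black image becomes 4 x 4 with white rows above and below.
demo = np.zeros((2, 4, 3), dtype=np.float32)
squared = pad_to_square(demo)
print(squared.shape)
```

Padding before resizing keeps the object's proportions; whether a white or a different background works best with this model is not stated in the README and may need experimentation.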
+
+ ---
+
+ ## 📦 Repository Contents
+
+ ```text
+ WasabiOctopus/LGM
+ ├── README.md
+ ├── model_index.json
+ ├── pipeline.py
+ ├── requirements.txt
+ ├── feature_extractor/
+ ├── image_encoder/
+ ├── text_encoder/
+ ├── tokenizer/
+ ├── scheduler/
+ ├── vae/
+ ├── unet/
+ └── lgm/
+ ```
+
+ ---
+
+ ## 💡 Recommended Use Cases
+
+ This model release is useful for:
+
+ * Fast **single-image-to-3D** prototyping
+ * **Text-to-3D** creative asset generation
+ * 3D generation course projects
+ * Research demos around 3D Gaussian Splatting
+ * Benchmarking recent 3D asset generation pipelines
+ * Building lightweight demos for Blender, Unity, or web-based 3D viewers
+
+ ---
+
+ ## ⚠️ Limitations
+
+ This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:
+
+ * Thin structures, transparent objects, wires, fur, or complex topology
+ * Highly reflective or texture-heavy objects
+ * Ambiguous single-view inputs where the back side is not visible
+ * Prompt-only generation requiring precise physical dimensions
+ * Production workflows requiring clean quad meshes, rigging, or CAD-level topology
+
+ For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.
+
+ ---
+
+ ## 🧪 Tips for Better Results
+
+ Good prompts usually describe:
+
+ ```text
+ object category + style + material + lighting + geometry constraint
+ ```
+
+ Examples:
+
+ ```text
+ a cute robot, rounded toy design, smooth plastic material, studio lighting
+ a medieval treasure chest, golden metal details, wooden texture, clean geometry
+ a sci-fi helmet, hard-surface design, matte black material, sharp edges
+ a tiny house, stylized low-poly, warm colors, isometric game asset
+ ```
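The prompt pattern above can be assembled programmatically when generating many assets. This is a minimal sketch of that idea; `build_prompt` and its parameter names are hypothetical and simply mirror the five slots of the template, skipping any that are left empty.

```python
def build_prompt(category, style=None, material=None, lighting=None, geometry=None):
    """Join the non-empty template slots into one comma-separated prompt."""
    parts = [category, style, material, lighting, geometry]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "a cute robot",
    style="rounded toy design",
    material="smooth plastic material",
    lighting="studio lighting",
)
print(prompt)
```

Keeping the slots separate makes it easy to vary one attribute at a time, for example sweeping over materials while holding the object category fixed.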
+
+ For image-to-3D, use images with:
+
+ * A single centered object
+ * Clean background
+ * Clear object silhouette
+ * Minimal occlusion
+ * Good lighting
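Some of these criteria can be checked roughly before running the pipeline. The sketch below estimates how centered the object is and how much of the frame its bounding box covers. It assumes a near-uniform light background with value close to `1.0`, which will not hold for every image; `foreground_stats`, `background`, and `tol` are hypothetical names, and the thresholds are illustrative rather than tuned.

```python
import numpy as np


def foreground_stats(image, background=1.0, tol=0.05):
    """Estimate object placement in an H x W x 3 float image with values in [0, 1].

    Returns ((dx, dy), coverage), where dx and dy are the bounding-box center's
    distance from the image center as a fraction of width and height, and
    coverage is the bounding-box area divided by the image area.
    Returns None when no pixel differs from the background.
    """
    # A pixel counts as foreground if any channel differs from the background.
    mask = np.abs(image - background).max(axis=2) > tol
    if not mask.any():
        return None

    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    h, w = mask.shape

    center_y = (rows[0] + rows[-1] + 1) / 2 / h
    center_x = (cols[0] + cols[-1] + 1) / 2 / w
    box_area = (rows[-1] - rows[0] + 1) * (cols[-1] - cols[0] + 1)

    offset = (float(abs(center_x - 0.5)), float(abs(center_y - 0.5)))
    return offset, float(box_area / (h * w))


# Synthetic example: a dark 4 x 4 square centered on a white 8 x 8 canvas.
demo = np.ones((8, 8, 3), dtype=np.float32)
demo[2:6, 2:6] = 0.2
offset, coverage = foreground_stats(demo)
print(offset, coverage)
```

A large offset or a very small coverage value suggests cropping or re-centering the input first. This check says nothing about occlusion or lighting, which still need a visual inspection.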
+
+ ---
+
+ ## 🔗 Related Links
+
+ * Original paper: https://arxiv.org/abs/2402.05054
+ * Original project page: https://me.kiui.moe/lgm/
+ * Original GitHub repository: https://github.com/3DTopia/LGM
+ * Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full
+
+ ---
+
+ ## 🙏 Acknowledgements
+
+ This repository is based on the LGM ecosystem and the upstream Hugging Face full pipeline release. Full credit for the original LGM method goes to the authors of:
+
+ **LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**
+
+ This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.
+
+ ---
+
+ ## 📚 Citation
+
+ If you use this model or the original LGM method, please cite:
+
+ ```bibtex
+ @article{tang2024lgm,
+   title={LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation},
+   author={Tang, Jiaxiang and Chen, Zhaoxi and Chen, Xiaokang and Wang, Tengfei and Zeng, Gang and Liu, Ziwei},
+   journal={arXiv preprint arXiv:2402.05054},
+   year={2024}
+ }
+ ```
+
  ---
 
+ <div align="center">
 
+ ### 🐙 Built for fast 3D generation experiments.
 
+ **From prompt or image to 3D Gaussian assets: clean, simple, and research-friendly.**
 
+ </div>