---
license: mit
pipeline_tag: image-to-3d
library_name: diffusers
base_model: "dylanebert/LGM-full"
tags: ["image-to-3d", "text-to-3d", "3d-generation", "3d-gaussian-splatting", "gaussian-splatting", "multi-view-diffusion", "lgm", "diffusers", "safetensors", "objaverse", "research", "computer-graphics"]
arxiv: "2402.05054"
---
<div align="center">

# 🐙 WasabiOctopus / LGM

### Large Multi-View Gaussian Model for Fast 3D Asset Generation

<p>
<img src="https://img.shields.io/badge/Task-Image--to--3D-blueviolet">
<img src="https://img.shields.io/badge/Task-Text--to--3D-8A2BE2">
<img src="https://img.shields.io/badge/Representation-3D%20Gaussian%20Splatting-orange">
<img src="https://img.shields.io/badge/Library-Diffusers-yellow">
<img src="https://img.shields.io/badge/License-MIT-green">
</p>

**A Diffusers-ready LGM pipeline for fast 3D content creation from text or a single image.**

</div>
## ✨ Highlights
- 🚀 **Fast 3D asset generation** powered by the LGM pipeline.
- 🧊 **3D Gaussian Splatting representation** for efficient high-resolution 3D content.
- 🖼️ **Text-to-3D and image-to-3D workflows** through multi-view diffusion.
- 🧩 **Diffusers-compatible model structure** with `LGMFullPipeline`.
- 🔬 Useful for **3D generation research, creative prototyping, course projects, and rapid experimentation**.
## 🖼️ Gallery
> Upload your own generated examples to an `assets/` folder and replace the placeholders below.

| Prompt / Input | Generated 3D Asset |
|---|---|
| `a cute robot, smooth toy material, studio lighting` | Coming soon |
| `a fantasy treasure chest with golden details` | Coming soon |
| `a stylized sci-fi helmet, clean hard-surface design` | Coming soon |
## 🧠 What is LGM?
**LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework designed for high-resolution 3D content creation.
Instead of generating a mesh directly, the pipeline first uses multi-view diffusion to synthesize images of the object from several viewpoints, then reconstructs a 3D Gaussian representation from those views. Reconstruction is feed-forward, with no per-asset optimization loop, which makes the pipeline suitable for fast 3D asset generation from either a text prompt or a single input image.
This repository provides a convenient Hugging Face / Diffusers-style release of the full LGM pipeline.
## 🏗️ Pipeline Overview
```text
Text prompt or single image
↓
Multi-view diffusion generation
↓
Multi-view Gaussian features
↓
LGM reconstruction module
↓
3D Gaussian asset
↓
PLY export / downstream rendering
```
## 🚀 Quick Start
### 1. Install dependencies
```bash
pip install -U diffusers transformers accelerate safetensors
pip install torch torchvision torchaudio
pip install xformers trimesh kiui plyfile
```
For the full environment, check the repository `requirements.txt`.
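Before loading the model, it can help to confirm that the packages installed above are importable. The following is a minimal sketch using only the standard library; the module list mirrors the `pip install` commands in this section, and the helper name `find_missing` is hypothetical, not part of any library.

```python
import importlib.util

# Import names for the packages installed above (assumed to match the pip names).
REQUIRED_MODULES = [
    "torch", "diffusers", "transformers", "accelerate",
    "safetensors", "xformers", "trimesh", "kiui", "plyfile",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    print("Missing modules:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be found; it does not verify versions or that `torch` was built with CUDA support.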
### 2. Load the pipeline
```python
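import torch
from diffusers import DiffusionPipeline

repo_id = "WasabiOctopus/LGM"

# trust_remote_code=True allows Diffusers to run the custom pipeline code
# shipped with the repository.
pipe = DiffusionPipeline.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    trust_remote_code=True,
)
pipe = pipe.to("cuda")  # a CUDA GPU is assumed here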
```
### 3. Text-to-3D generation
```python
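prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"

# Generate a 3D Gaussian representation from the text prompt.
gaussians = pipe(
    prompt=prompt,
    num_inference_steps=50,
    guidance_scale=7.0,
)

# Export the Gaussians to a PLY file.
pipe.save_ply(gaussians, "robot.ply")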
```
### 4. Image-to-3D generation
```python
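import numpy as np
from PIL import Image

# Load the input as RGB and convert it to a float32 array in [0, 1].
# Note: resize() does not preserve aspect ratio, so a non-square input is stretched.
image = Image.open("input.png").convert("RGB").resize((256, 256))
image = np.array(image).astype(np.float32) / 255.0

# The prompt is left empty so that the image drives generation.
gaussians = pipe(
    prompt="",
    image=image,
    num_inference_steps=50,
    guidance_scale=7.0,
)

# Export the Gaussians to a PLY file.
pipe.save_ply(gaussians, "asset_from_image.ply")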
```
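To sanity-check an exported file, the PLY header can be read without extra dependencies. The sketch below parses only the ASCII header defined by the PLY format and reports element counts and property names. It assumes, without confirmation from this card, that `save_ply` writes one `vertex` element per Gaussian, as is common for Gaussian Splatting exports. The helper name `read_ply_header` is hypothetical.

```python
import os


def read_ply_header(path):
    """Parse the ASCII header of a PLY file.

    Returns (format_string, {element: count}, {element: [property names]}).
    """
    fmt = None
    counts = {}
    properties = {}
    current = None
    with open(path, "rb") as f:
        if f.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in f:
            parts = raw.decode("ascii", errors="replace").split()
            if not parts:
                continue
            if parts[0] == "end_header":
                break  # the (possibly binary) payload starts after this line
            if parts[0] == "format":
                fmt = " ".join(parts[1:])
            elif parts[0] == "element":
                current = parts[1]
                counts[current] = int(parts[2])
                properties[current] = []
            elif parts[0] == "property" and current is not None:
                # The property name is always the last token on the line.
                properties[current].append(parts[-1])
    return fmt, counts, properties


if __name__ == "__main__" and os.path.exists("asset_from_image.ply"):
    fmt, counts, props = read_ply_header("asset_from_image.ply")
    print("format:", fmt)
    print("elements:", counts)
    print("vertex properties:", props.get("vertex", []))
```

If the vertex count is zero or the property list is empty, the export likely failed before any Gaussians were written.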
## 📦 Repository Contents
```text
WasabiOctopus/LGM
├── README.md
├── model_index.json
├── pipeline.py
├── requirements.txt
├── feature_extractor/
├── image_encoder/
├── text_encoder/
├── tokenizer/
├── scheduler/
├── vae/
├── unet/
└── lgm/
```
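In Diffusers repositories, `model_index.json` maps each component folder to a `[library, class_name]` pair, which is how the loader knows what to instantiate from each subdirectory. The sketch below lists those pairs from a parsed file. The JSON excerpt is a hypothetical illustration, not the actual contents of this repository's `model_index.json`, and `list_components` is a made-up helper name.

```python
import json


def list_components(model_index):
    """Return {component: (library, class_name)} from a parsed model_index.json."""
    components = {}
    for key, value in model_index.items():
        if key.startswith("_"):
            continue  # metadata entries such as _class_name
        if isinstance(value, (list, tuple)) and len(value) == 2:
            components[key] = (value[0], value[1])
    return components


# Hypothetical excerpt, for illustration only; the real file in this
# repository may list different components and classes.
example_text = """
{
  "_class_name": "LGMFullPipeline",
  "vae": ["diffusers", "AutoencoderKL"],
  "tokenizer": ["transformers", "CLIPTokenizer"]
}
"""

components = list_components(json.loads(example_text))
for name, (library, class_name) in sorted(components.items()):
    print(f"{name}: {library}.{class_name}")
```

To inspect the real file, download `model_index.json` from the repository and pass the result of `json.load` to the same function.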
## 💡 Recommended Use Cases
This model release is useful for:
- Fast **single-image-to-3D** prototyping
- **Text-to-3D** creative asset generation
- 3D generation course projects
- Research demos around 3D Gaussian Splatting
- Benchmarking recent 3D asset generation pipelines
- Building lightweight demos for Blender, Unity, or web-based 3D viewers
## ⚠️ Limitations
This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:
- Thin structures, transparent objects, wires, fur, or complex topology
- Highly reflective or texture-heavy objects
- Ambiguous single-view inputs where the back side is not visible
- Prompt-only generation requiring precise physical dimensions
- Production workflows requiring clean quad meshes, rigging, or CAD-level topology
For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.
## 🧪 Tips for Better Results
Good prompts usually describe:
```text
object category + style + material + lighting + geometry constraint
```
Examples:
```text
a cute robot, rounded toy design, smooth plastic material, studio lighting
a medieval treasure chest, golden metal details, wooden texture, clean geometry
a sci-fi helmet, hard-surface design, matte black material, sharp edges
a tiny house, stylized low-poly, warm colors, isometric game asset
```
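The prompt template above can be expressed as a small helper that joins whichever parts are provided. This is only a convenience sketch; the function name `build_prompt` and its parameters are hypothetical and not part of the pipeline.

```python
def build_prompt(category, style=None, material=None, lighting=None, geometry=None):
    """Join the non-empty prompt parts with commas, in template order."""
    parts = [category, style, material, lighting, geometry]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "a cute robot",
    style="rounded toy design",
    material="smooth plastic material",
    lighting="studio lighting",
)
print(prompt)
```

The resulting string can be passed as the `prompt` argument in the text-to-3D example.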
For image-to-3D, use images with:
- A single centered object
- Clean background
- Clear object silhouette
- Minimal occlusion
- Good lighting
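One way to get a centered object without distorting it is to paste the image onto a square canvas before resizing. The sketch below shows this with Pillow. The white background and the 10% margin are assumptions for illustration, not requirements stated by this model card, and both helper names are hypothetical.

```python
def square_canvas(width, height, margin=0.1):
    """Return (side, (left, top)) for centering a width x height image on a square.

    `margin` is the fraction of the canvas side left empty on each edge
    along the longer image dimension.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not 0 <= margin < 0.5:
        raise ValueError("margin must be in [0, 0.5)")
    side = int(round(max(width, height) / (1.0 - 2.0 * margin)))
    return side, ((side - width) // 2, (side - height) // 2)


def pad_to_square(image, margin=0.1, fill=(255, 255, 255)):
    """Center a PIL image on a square background of color `fill` (needs Pillow)."""
    from PIL import Image  # imported lazily so square_canvas works without Pillow

    side, offset = square_canvas(image.width, image.height, margin)
    canvas = Image.new("RGB", (side, side), fill)
    canvas.paste(image.convert("RGB"), offset)
    return canvas
```

For example, `pad_to_square(Image.open("input.png")).resize((256, 256))` yields a 256×256 input in which the object keeps its original proportions.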
## 🔗 Related Links
- Original paper: https://arxiv.org/abs/2402.05054
- Original project page: https://me.kiui.moe/lgm/
- Original GitHub repository: https://github.com/3DTopia/LGM
- Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full
## 🙏 Acknowledgements
This repository is based on the LGM ecosystem and the upstream Hugging Face full pipeline release. Full credit for the original LGM method goes to the authors of:
**LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**
This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.
<div align="center">

### 🐙 Built for fast 3D generation experiments.

**From prompt or image to 3D Gaussian assets: clean, simple, and research-friendly.**

</div>