---
license: mit
pipeline_tag: image-to-3d
library_name: diffusers
base_model: "dylanebert/LGM-full"
tags: ["image-to-3d", "text-to-3d", "3d-generation", "3d-gaussian-splatting", "gaussian-splatting", "multi-view-diffusion", "lgm", "diffusers", "safetensors", "objaverse", "research", "computer-graphics"]
arxiv: "2402.05054"
---
## Highlights
- **Fast 3D asset generation** powered by the LGM pipeline.
- **3D Gaussian Splatting representation** for efficient high-resolution 3D content.
- **Text-to-3D and image-to-3D workflows** through multi-view diffusion.
- **Diffusers-compatible model structure** with `LGMFullPipeline`.
- Useful for **3D generation research, creative prototyping, course projects, and rapid experimentation**.
## Gallery
> Upload your own generated examples to an `assets/` folder and replace the placeholders below.
| Prompt / Input | Generated 3D Asset |
|---|---|
| `a cute robot, smooth toy material, studio lighting` | Coming soon |
| `a fantasy treasure chest with golden details` | Coming soon |
| `a stylized sci-fi helmet, clean hard-surface design` | Coming soon |
## What is LGM?
**LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework designed for high-resolution 3D content creation.
Rather than generating a mesh directly, the pipeline first uses a multi-view diffusion model to synthesize several views of the object and then reconstructs a 3D Gaussian representation from those views. Because the reconstruction is a single feed-forward pass, it is suited to fast 3D asset generation from either a text prompt or a single input image.
This repository provides a convenient Hugging Face / Diffusers-style release of the full LGM pipeline.
## Pipeline Overview
```text
Text prompt or single image
↓
Multi-view diffusion generation
↓
Multi-view Gaussian features
↓
LGM reconstruction module
↓
3D Gaussian asset
↓
PLY export / downstream rendering
```
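To make the data flow concrete, here is a minimal NumPy sketch of how per-view feature maps could be flattened into a table of Gaussians. The numbers (4 views, 128×128 feature maps, 14 channels per Gaussian) and the channel ordering are assumptions based on the LGM paper and reference code, not values read from this repository, so check them against the actual configuration before relying on them.

```python
import numpy as np

# Assumed values for illustration only; verify against the paper and repo config.
NUM_VIEWS = 4        # views produced by the multi-view diffusion stage
FEATURE_RES = 128    # assumed side length of each output feature map
CHANNELS = 14        # 3 position + 1 opacity + 3 scale + 4 rotation + 3 RGB


def flatten_to_gaussians(features):
    """Turn (views, channels, H, W) feature maps into an (N, channels) table."""
    views, channels, height, width = features.shape
    # Move channels last, then merge views and pixels into a single axis.
    return features.transpose(0, 2, 3, 1).reshape(views * height * width, channels)


def split_parameters(gaussians):
    """Slice each per-Gaussian vector into named groups (assumed ordering)."""
    return {
        "position": gaussians[:, 0:3],
        "opacity": gaussians[:, 3:4],
        "scale": gaussians[:, 4:7],
        "rotation": gaussians[:, 7:11],
        "rgb": gaussians[:, 11:14],
    }


# Placeholder features standing in for the network output.
features = np.zeros((NUM_VIEWS, CHANNELS, FEATURE_RES, FEATURE_RES), dtype=np.float32)
gaussians = flatten_to_gaussians(features)
params = split_parameters(gaussians)
print(gaussians.shape)  # (65536, 14)
```

Under these assumptions every output pixel becomes one Gaussian, so the asset size is fixed by the number of views and the feature-map resolution rather than by the complexity of the object.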
## Quick Start
### 1. Install dependencies
```bash
pip install -U diffusers transformers accelerate safetensors
pip install torch torchvision torchaudio
pip install xformers trimesh kiui plyfile
```
For the full environment, check the repository `requirements.txt`.
### 2. Load the pipeline
```python
import torch
from diffusers import DiffusionPipeline

repo_id = "WasabiOctopus/LGM"

# trust_remote_code=True allows Diffusers to run the custom pipeline code shipped in the repo.
pipe = DiffusionPipeline.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")
```
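The loading snippet hard-codes `"cuda"` and `float16`. As a hedged sketch, the helper below (`choose_device_and_dtype` is a hypothetical name, not part of this repository or Diffusers) picks a device and dtype name depending on whether CUDA is available. Whether the pipeline runs acceptably, or at all, on CPU is not confirmed by this README.

```python
def choose_device_and_dtype(cuda_available):
    """Return (device, dtype_name) to use when loading the pipeline."""
    if cuda_available:
        # Half precision roughly halves weight memory on the GPU.
        return "cuda", "float16"
    # Many CPU kernels lack float16 support, so fall back to float32.
    return "cpu", "float32"


try:
    import torch

    has_cuda = torch.cuda.is_available()
except ImportError:
    # Without PyTorch installed we can only report the CPU fallback.
    has_cuda = False

device, dtype_name = choose_device_and_dtype(has_cuda)
print(device, dtype_name)
```

With PyTorch installed, the result can be passed on as `torch_dtype=getattr(torch, dtype_name)` and `pipe.to(device)`.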
### 3. Text-to-3D generation
```python
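prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"

# Run multi-view diffusion and Gaussian reconstruction from the text prompt.
gaussians = pipe(
    prompt=prompt,
    num_inference_steps=50,
    guidance_scale=7.0,
)

# Export the Gaussians as a PLY file.
pipe.save_ply(gaussians, "robot.ply")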
```
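After export it can be useful to confirm what the file contains without loading it into a viewer. The sketch below reads only the text header of a PLY file using the standard library. `read_ply_header` is a hypothetical helper, and the exact property names written by `save_ply` are not documented here, so inspect the output rather than assuming a particular layout.

```python
def read_ply_header(path):
    """Return (format, {element: count}, {element: [property names]})."""
    fmt = None
    counts = {}
    properties = {}
    current = None
    with open(path, "rb") as handle:
        if handle.readline().strip() != b"ply":
            raise ValueError("not a PLY file")
        for raw in handle:
            tokens = raw.decode("ascii", errors="replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # everything after this line is payload data
            if tokens[0] == "format":
                fmt = tokens[1]
            elif tokens[0] == "element":
                current = tokens[1]
                counts[current] = int(tokens[2])
                properties[current] = []
            elif tokens[0] == "property" and current is not None:
                # The property name is the last token on the line.
                properties[current].append(tokens[-1])
    return fmt, counts, properties
```

For example, `fmt, counts, props = read_ply_header("robot.ply")` followed by `print(counts.get("vertex"), props.get("vertex"))` shows the number of Gaussians and the attributes stored for each one.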
### 4. Image-to-3D generation
```python
import numpy as np
from PIL import Image

# Load the input as a 256x256 RGB float array scaled to [0, 1].
image = Image.open("input.png").convert("RGB").resize((256, 256))
image = np.array(image).astype(np.float32) / 255.0

gaussians = pipe(
    prompt="",
    image=image,
    num_inference_steps=50,
    guidance_scale=7.0,
)
pipe.save_ply(gaussians, "asset_from_image.ply")
```
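Calling `.convert("RGB")` on an image with transparency discards the alpha channel, which can leave arbitrary colors where the background was. A possible alternative, sketched below, composites the image onto white before resizing. `prepare_image` is a hypothetical helper, and the choice of a white background is an assumption about what the pipeline expects rather than something this README states.

```python
import numpy as np
from PIL import Image


def prepare_image(image, size=256):
    """Composite onto white, resize to size x size, scale to float32 in [0, 1]."""
    rgba = image.convert("RGBA")
    # Blend the image over an opaque white canvas using its own alpha channel.
    white = Image.new("RGBA", rgba.size, (255, 255, 255, 255))
    flat = Image.alpha_composite(white, rgba).convert("RGB")
    # Note: a non-square input is stretched; crop or pad beforehand if needed.
    flat = flat.resize((size, size), Image.BICUBIC)
    return np.asarray(flat, dtype=np.float32) / 255.0
```

The result has shape `(size, size, 3)` and can be passed as the `image` argument in place of the array built above, for example `prepare_image(Image.open("input.png"))`.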
## Repository Contents
```text
WasabiOctopus/LGM
├── README.md
├── model_index.json
├── pipeline.py
├── requirements.txt
├── feature_extractor/
├── image_encoder/
├── text_encoder/
├── tokenizer/
├── scheduler/
├── vae/
├── unet/
└── lgm/
```
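In Diffusers-style repositories, `model_index.json` typically maps each component folder to a `[library, class]` pair, with metadata keys prefixed by an underscore. The sketch below lists components from such a mapping. The example contents are hypothetical and written only to mirror the folder layout above; they are not copied from this repository's actual file.

```python
import json


def list_components(model_index):
    """Map component name -> (library, class name) from a model_index dict."""
    components = {}
    for name, value in model_index.items():
        # Metadata keys such as "_class_name" start with an underscore.
        if name.startswith("_"):
            continue
        if isinstance(value, list) and len(value) == 2:
            components[name] = tuple(value)
    return components


# Hypothetical contents for illustration; class names and version are assumptions.
example = json.loads("""
{
  "_class_name": "LGMFullPipeline",
  "_diffusers_version": "0.25.0",
  "scheduler": ["diffusers", "DDIMScheduler"],
  "text_encoder": ["transformers", "CLIPTextModel"],
  "lgm": ["pipeline", "LGM"]
}
""")

for name, (library, cls) in list_components(example).items():
    print(f"{name}: {library}.{cls}")
```

To inspect the real file, load it with `json.load(open("model_index.json"))` from a local copy of the repository and pass the result to `list_components`.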
## Recommended Use Cases
This model release is useful for:
- Fast **single-image-to-3D** prototyping
- **Text-to-3D** creative asset generation
- 3D generation course projects
- Research demos around 3D Gaussian Splatting
- Benchmarking recent 3D asset generation pipelines
- Building lightweight demos for Blender, Unity, or web-based 3D viewers
## Limitations
This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:
- Thin structures, transparent objects, wires, fur, or complex topology
- Highly reflective or texture-heavy objects
- Ambiguous single-view inputs where the back side is not visible
- Prompt-only generation requiring precise physical dimensions
- Production workflows requiring clean quad meshes, rigging, or CAD-level topology
For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.
## Tips for Better Results
Good prompts usually describe:
```text
object category + style + material + lighting + geometry constraint
```
Examples:
```text
a cute robot, rounded toy design, smooth plastic material, studio lighting
a medieval treasure chest, golden metal details, wooden texture, clean geometry
a sci-fi helmet, hard-surface design, matte black material, sharp edges
a tiny house, stylized low-poly, warm colors, isometric game asset
```
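The template above can be assembled programmatically when generating many assets. This is a minimal sketch; `build_prompt` is a hypothetical helper, not part of the pipeline.

```python
def build_prompt(category, style=None, material=None, lighting=None, geometry=None):
    """Join the non-empty prompt components with commas, in template order."""
    parts = [category, style, material, lighting, geometry]
    return ", ".join(part.strip() for part in parts if part and part.strip())


prompt = build_prompt(
    "a cute robot",
    style="rounded toy design",
    material="smooth plastic material",
    lighting="studio lighting",
)
print(prompt)
# a cute robot, rounded toy design, smooth plastic material, studio lighting
```

Optional components that are left out are simply skipped, so the same function covers short and long prompts.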
For image-to-3D, use images with:
- A single centered object
- Clean background
- Clear object silhouette
- Minimal occlusion
- Good lighting
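Some of these properties can be checked automatically if a foreground mask is available, for example the alpha channel produced by a background-removal tool. The sketch below measures how much of the frame the object's bounding box fills and how far it sits from the center. `foreground_stats` and the 0.5 threshold are illustrative assumptions; this README does not specify acceptable ranges.

```python
import numpy as np


def foreground_stats(alpha, threshold=0.5):
    """Return (fill_ratio, (offset_x, offset_y)) for an (H, W) alpha mask in [0, 1].

    fill_ratio is the larger of the bounding box's relative height and width.
    Offsets are the box center minus the image center, as fractions of image size.
    """
    mask = alpha > threshold
    if not mask.any():
        return 0.0, None  # no foreground found
    rows = np.where(mask.any(axis=1))[0]
    cols = np.where(mask.any(axis=0))[0]
    top, bottom = rows[0], rows[-1]
    left, right = cols[0], cols[-1]
    height, width = alpha.shape
    fill = max((bottom - top + 1) / height, (right - left + 1) / width)
    offset_y = ((top + bottom + 1) / 2 - height / 2) / height
    offset_x = ((left + right + 1) / 2 - width / 2) / width
    return float(fill), (float(offset_x), float(offset_y))


# Synthetic example: a centered square covering half of each dimension.
alpha = np.zeros((100, 100), dtype=np.float32)
alpha[25:75, 25:75] = 1.0
print(foreground_stats(alpha))  # (0.5, (0.0, 0.0))
```

Offsets near zero suggest a centered object, while a very small fill ratio suggests the object occupies too little of the frame.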
## Related Links
- Original paper: https://arxiv.org/abs/2402.05054
- Original project page: https://me.kiui.moe/lgm/
- Original GitHub repository: https://github.com/3DTopia/LGM
- Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full
## Acknowledgements
This repository is based on the LGM ecosystem and the upstream Hugging Face full pipeline release. Full credit for the original LGM method goes to the authors of:
**LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**
This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.
### Built for fast 3D generation experiments.
**From prompt or image to 3D Gaussian assets: clean, simple, and research-friendly.**