---
license: mit
pipeline_tag: image-to-3d
library_name: diffusers
base_model: "dylanebert/LGM-full"
tags: ["image-to-3d", "text-to-3d", "3d-generation", "3d-gaussian-splatting", "gaussian-splatting", "multi-view-diffusion", "lgm", "diffusers", "safetensors", "objaverse", "research", "computer-graphics"]
arxiv: "2402.05054"
---

<div align="center">

# 🐙 WasabiOctopus / LGM

### Large Multi-View Gaussian Model for Fast 3D Asset Generation

<p>
<img src="https://img.shields.io/badge/Task-Image--to--3D-blueviolet">
<img src="https://img.shields.io/badge/Task-Text--to--3D-8A2BE2">
<img src="https://img.shields.io/badge/Representation-3D%20Gaussian%20Splatting-orange">
<img src="https://img.shields.io/badge/Library-Diffusers-yellow">
<img src="https://img.shields.io/badge/License-MIT-green">
</p>

**A Diffusers-ready LGM pipeline for fast 3D content creation from text or a single image.**

</div>

## ✨ Highlights

- 🚀 **Fast 3D asset generation** powered by the LGM pipeline.
- 🧊 **3D Gaussian Splatting representation** for efficient high-resolution 3D content.
- 🖼️ **Text-to-3D and image-to-3D workflows** through multi-view diffusion.
- 🧩 **Diffusers-compatible model structure** with `LGMFullPipeline`.
- 🔬 Useful for **3D generation research, creative prototyping, course projects, and rapid experimentation**.

## 🖼️ Gallery

> Upload your own generated examples to an `assets/` folder and replace the placeholders below.

| Prompt / Input | Generated 3D Asset |
|---|---|
| `a cute robot, smooth toy material, studio lighting` | Coming soon |
| `a fantasy treasure chest with golden details` | Coming soon |
| `a stylized sci-fi helmet, clean hard-surface design` | Coming soon |

## 🧠 What is LGM?

**LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework for high-resolution 3D content creation.

Rather than generating a mesh directly, the pipeline first synthesizes multi-view images of the object with a multi-view diffusion model, then reconstructs a 3D Gaussian representation from those views. Because reconstruction is a feed-forward pass, the approach suits fast 3D asset generation from either a text prompt or a single input image.

This repository provides a Hugging Face / Diffusers-style release of the full LGM pipeline.

## 🏗️ Pipeline Overview

```text
Text prompt or single image
        ↓
Multi-view diffusion generation
        ↓
Multi-view Gaussian features
        ↓
LGM reconstruction module
        ↓
3D Gaussian asset
        ↓
PLY export / downstream rendering
```

## 🚀 Quick Start

### 1. Install dependencies

```bash
pip install -U diffusers transformers accelerate safetensors
pip install torch torchvision torchaudio
pip install xformers trimesh kiui plyfile
```

For the full environment, see `requirements.txt` in this repository.

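Before loading the model, it can help to confirm that the packages from the install step are importable. The sketch below uses only the standard library; the module list mirrors the `pip install` commands above, and the helper name `find_missing` is illustrative rather than part of this repository.

```python
import importlib.util

# Top-level import names for the packages installed above.
REQUIRED_MODULES = [
    "torch", "diffusers", "transformers", "accelerate", "safetensors",
    "xformers", "trimesh", "kiui", "plyfile",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be located for import."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

This only checks that each module can be found; it does not verify versions or GPU support.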
### 2. Load the pipeline

```python
import torch
from diffusers import DiffusionPipeline

repo_id = "WasabiOctopus/LGM"

# trust_remote_code=True allows Diffusers to run the custom pipeline
# code shipped with this repository (pipeline.py).
pipe = DiffusionPipeline.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    trust_remote_code=True,
)

# Move all pipeline components to the GPU.
pipe = pipe.to("cuda")
```

### 3. Text-to-3D generation

```python
prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"

# Generate 3D Gaussians from the text prompt alone.
gaussians = pipe(
    prompt=prompt,
    num_inference_steps=50,
    guidance_scale=7.0,
)

# Export the Gaussians to a PLY file for viewing or downstream rendering.
pipe.save_ply(gaussians, "robot.ply")
```

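To sanity-check an exported file, you can read the ASCII header that every PLY file starts with. The following is a minimal standard-library sketch; `read_ply_header` is an illustrative helper, not part of the pipeline, and the property names you see will depend on how `save_ply` writes the file.

```python
import sys


def read_ply_header(path):
    """Parse a PLY header and return (format, vertex_count, property_names)."""
    fmt = None
    vertex_count = 0
    properties = []
    current_element = None
    with open(path, "rb") as f:
        # The first line of every PLY file is the magic string "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} is not a PLY file")
        for raw in f:
            tokens = raw.decode("ascii", errors="replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # binary or ASCII payload starts after this line
            if tokens[0] == "format":
                fmt = tokens[1]
            elif tokens[0] == "element":
                current_element = tokens[1]
                if current_element == "vertex":
                    vertex_count = int(tokens[2])
            elif tokens[0] == "property" and current_element == "vertex":
                # The property name is the last token on the line.
                properties.append(tokens[-1])
    return fmt, vertex_count, properties


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python inspect_ply.py robot.ply
    fmt, count, props = read_ply_header(sys.argv[1])
    print(f"format={fmt}, elements={count}")
    print("properties:", ", ".join(props))
```

For a Gaussian splat export, each vertex element corresponds to one Gaussian, so the vertex count gives a quick size check.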
### 4. Image-to-3D generation

```python
import numpy as np
from PIL import Image

# Load the input as RGB, resize to 256x256, and scale pixels to [0, 1].
image = Image.open("input.png").convert("RGB").resize((256, 256))
image = np.array(image).astype(np.float32) / 255.0

# The prompt is left empty so generation is conditioned on the image.
gaussians = pipe(
    prompt="",
    image=image,
    num_inference_steps=50,
    guidance_scale=7.0,
)

pipe.save_ply(gaussians, "asset_from_image.ply")
```

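Input images with transparency need a defined background before conversion to RGB, because `convert("RGB")` simply drops the alpha channel. The sketch below flattens transparency onto white before resizing. The white background and the helper name `prepare_image` are assumptions for illustration; check the upstream LGM code for the exact preprocessing it expects.

```python
import numpy as np
from PIL import Image


def prepare_image(img, size=256):
    """Flatten transparency onto white, resize, and scale to [0, 1]."""
    rgba = img.convert("RGBA")
    # Assumption: a plain white background suits the model.
    background = Image.new("RGBA", rgba.size, (255, 255, 255, 255))
    flattened = Image.alpha_composite(background, rgba).convert("RGB")
    resized = flattened.resize((size, size))
    # float32 array of shape (size, size, 3) with values in [0, 1].
    return np.array(resized).astype(np.float32) / 255.0
```

The result can be passed as `image=` in the call above, for example `prepare_image(Image.open("input.png"))`.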
## 📦 Repository Contents

```text
WasabiOctopus/LGM
├── README.md
├── model_index.json
├── pipeline.py
├── requirements.txt
├── feature_extractor/
├── image_encoder/
├── text_encoder/
├── tokenizer/
├── scheduler/
├── vae/
├── unet/
└── lgm/
```

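If you keep a local copy of the repository, a quick layout check can catch an incomplete download before loading fails. This is a small sketch using only the standard library; the expected names are taken from the tree above, and `missing_entries` is an illustrative helper rather than something shipped with the model.

```python
from pathlib import Path

# Files and component folders listed in the repository tree above.
EXPECTED_FILES = ["model_index.json", "pipeline.py", "requirements.txt"]
EXPECTED_DIRS = [
    "feature_extractor", "image_encoder", "text_encoder", "tokenizer",
    "scheduler", "vae", "unet", "lgm",
]


def missing_entries(root):
    """Return expected files and folders that are absent under `root`."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # Folders are reported with a trailing slash to tell them apart from files.
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means every listed file and folder is present; it does not validate the weights themselves.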
## 💡 Recommended Use Cases

This model release is useful for:

- Fast **single-image-to-3D** prototyping
- **Text-to-3D** creative asset generation
- 3D generation course projects
- Research demos around 3D Gaussian Splatting
- Benchmarking recent 3D asset generation pipelines
- Building lightweight demos for Blender, Unity, or web-based 3D viewers

## ⚠️ Limitations

This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:

- Thin structures, transparent objects, wires, fur, or complex topology
- Highly reflective or texture-heavy objects
- Ambiguous single-view inputs where the back side is not visible
- Prompt-only generation requiring precise physical dimensions
- Production workflows requiring clean quad meshes, rigging, or CAD-level topology

For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.

## 🧪 Tips for Better Results

Good prompts usually describe:

```text
object category + style + material + lighting + geometry constraint
```

Examples:

```text
a cute robot, rounded toy design, smooth plastic material, studio lighting
a medieval treasure chest, golden metal details, wooden texture, clean geometry
a sci-fi helmet, hard-surface design, matte black material, sharp edges
a tiny house, stylized low-poly, warm colors, isometric game asset
```

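The prompt pattern above can be assembled programmatically when generating many assets. The helper below is a minimal sketch; `build_prompt` and its parameter names are hypothetical and simply mirror the five parts of the pattern.

```python
def build_prompt(category, style=None, material=None, lighting=None, geometry=None):
    """Join the non-empty prompt parts into a comma-separated string."""
    parts = [category, style, material, lighting, geometry]
    # Skip parts that are missing or contain only whitespace.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "a cute robot",
    style="rounded toy design",
    material="smooth plastic material",
    lighting="studio lighting",
)
print(prompt)
# prints: a cute robot, rounded toy design, smooth plastic material, studio lighting
```

Keeping the parts separate makes it easy to vary one attribute, such as material, while holding the rest fixed.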
For image-to-3D, use images with:

- A single centered object
- Clean background
- Clear object silhouette
- Minimal occlusion
- Good lighting

## 🔗 Related Links

- Original paper: https://arxiv.org/abs/2402.05054
- Original project page: https://me.kiui.moe/lgm/
- Original GitHub repository: https://github.com/3DTopia/LGM
- Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full

## 🙏 Acknowledgements

This repository is based on the LGM ecosystem and the upstream Hugging Face full pipeline release. Full credit for the original LGM method goes to the authors of:

**LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**

This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.

<div align="center">

### 🐙 Built for fast 3D generation experiments.

**From prompt or image to 3D Gaussian assets: clean, simple, and research-friendly.**

</div>