---
license: mit
pipeline_tag: image-to-3d
library_name: diffusers
base_model: "dylanebert/LGM-full"
tags: ["image-to-3d", "text-to-3d", "3d-generation", "3d-gaussian-splatting", "gaussian-splatting", "multi-view-diffusion", "lgm", "diffusers", "safetensors", "objaverse", "research", "computer-graphics"]
arxiv: "2402.05054"
---

<div align="center">

# 🐙 WasabiOctopus / LGM

### Large Multi-View Gaussian Model for Fast 3D Asset Generation

<p>
  <img src="https://img.shields.io/badge/Task-Image--to--3D-blueviolet">
  <img src="https://img.shields.io/badge/Task-Text--to--3D-8A2BE2">
  <img src="https://img.shields.io/badge/Representation-3D%20Gaussian%20Splatting-orange">
  <img src="https://img.shields.io/badge/Library-Diffusers-yellow">
  <img src="https://img.shields.io/badge/License-MIT-green">
</p>

**A Diffusers-ready LGM pipeline for fast 3D content creation from text or a single image.**

</div>

## ✨ Highlights

- 🚀 **Fast 3D asset generation** powered by the LGM pipeline.
- 🧊 **3D Gaussian Splatting representation** for efficient rendering of high-resolution 3D content.
- 🖼️ **Text-to-3D and image-to-3D workflows** through multi-view diffusion.
- 🧩 **Diffusers-compatible model structure** with `LGMFullPipeline`.
- 🔬 Useful for **3D generation research, creative prototyping, course projects, and rapid experimentation**.

## 🖼️ Gallery

> Upload your own generated examples to an `assets/` folder and replace the placeholders below.

| Prompt / Input | Generated 3D Asset |
|---|---|
| `a cute robot, smooth toy material, studio lighting` | Coming soon |
| `a fantasy treasure chest with golden details` | Coming soon |
| `a stylized sci-fi helmet, clean hard-surface design` | Coming soon |

## 🧠 What is LGM?

**LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework designed for high-resolution 3D content creation.

Instead of generating a mesh directly, the pipeline first uses a multi-view diffusion model to produce several views of the object, then reconstructs a 3D Gaussian representation from those views in a feed-forward pass. This makes it suitable for fast 3D asset generation from either a text prompt or a single input image.

This repository provides a convenient Hugging Face / Diffusers-style release of the full LGM pipeline.

## 🏗️ Pipeline Overview

```text
Text prompt or single image
        ↓
Multi-view diffusion generation
        ↓
LGM reconstruction module
        ↓
Multi-view Gaussian features
        ↓
3D Gaussian asset
        ↓
PLY export / downstream rendering
```

## 🚀 Quick Start

### 1. Install dependencies

```bash
pip install -U diffusers transformers accelerate safetensors
pip install torch torchvision torchaudio
pip install xformers trimesh kiui plyfile
```

For the full environment, check the repository `requirements.txt`.
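
Before loading the pipeline, it can help to confirm that the packages from the commands above are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` is hypothetical, and the list mirrors the `pip install` lines above rather than the repository's `requirements.txt`.

```python
from importlib.util import find_spec

# Import names for the packages installed above
# (for these packages the import name matches the pip name).
REQUIRED_MODULES = [
    "diffusers", "transformers", "accelerate", "safetensors",
    "torch", "torchvision", "torchaudio",
    "xformers", "trimesh", "kiui", "plyfile",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names from `names` that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages were found.")
```

This only checks that each package can be located; it does not verify versions or CUDA support.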

### 2. Load the pipeline

```python
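import torch
from diffusers import DiffusionPipeline

repo_id = "WasabiOctopus/LGM"

# trust_remote_code=True allows Diffusers to run the custom pipeline code
# (pipeline.py) shipped in this repository.
pipe = DiffusionPipeline.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    trust_remote_code=True,
)

# Move the pipeline to the GPU (requires a CUDA-capable device).
pipe = pipe.to("cuda")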
```

### 3. Text-to-3D generation

```python
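prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"

# Generate 3D Gaussians from the text prompt alone.
gaussians = pipe(
    prompt=prompt,
    num_inference_steps=50,
    guidance_scale=7.0,
)

# Export the generated Gaussians to a PLY file.
pipe.save_ply(gaussians, "robot.ply")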
```
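
To sanity-check an exported file without a 3D viewer, the PLY header can be read directly. The sketch below assumes `save_ply` writes a standard PLY file with a `vertex` element, which is the usual layout for Gaussian splatting exports but is not confirmed here. The helper name `read_ply_header` is hypothetical.

```python
def read_ply_header(path):
    """Return (vertex_count, property_names) parsed from a PLY file header."""
    vertex_count = None
    properties = []
    in_vertex_element = False
    with open(path, "rb") as f:
        # Every PLY file starts with the magic line "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} does not look like a PLY file")
        for raw_line in f:
            tokens = raw_line.decode("ascii", errors="replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # binary or ASCII payload follows; stop reading
            if tokens[0] == "element":
                in_vertex_element = tokens[1] == "vertex"
                if in_vertex_element:
                    vertex_count = int(tokens[2])
            elif tokens[0] == "property" and in_vertex_element:
                # The property name is the last token on the line.
                properties.append(tokens[-1])
        else:
            raise ValueError(f"{path} has no end_header line")
    return vertex_count, properties
```

Calling `read_ply_header("robot.ply")` after the export above would return the number of vertex records and their property names. In a typical Gaussian splatting file each vertex record corresponds to one Gaussian.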

### 4. Image-to-3D generation

```python
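import numpy as np
from PIL import Image

# Load the input as RGB, resize it to 256x256, and scale pixel values to [0, 1].
image = Image.open("input.png").convert("RGB").resize((256, 256))
image = np.array(image).astype(np.float32) / 255.0

# The prompt is left empty so that generation is conditioned on the image.
gaussians = pipe(
    prompt="",
    image=image,
    num_inference_steps=50,
    guidance_scale=7.0,
)

# Export the generated Gaussians to a PLY file.
pipe.save_ply(gaussians, "asset_from_image.ply")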
```

## 📦 Repository Contents

```text
WasabiOctopus/LGM
├── README.md
├── model_index.json
├── pipeline.py
├── requirements.txt
├── feature_extractor/
├── image_encoder/
├── text_encoder/
├── tokenizer/
├── scheduler/
├── vae/
├── unet/
└── lgm/
```
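
If you work from a local copy of the repository, a quick layout check can catch an incomplete download before loading. This is a minimal sketch based only on the tree listed above; the helper name `missing_entries` and the idea of a local snapshot directory are assumptions for illustration.

```python
from pathlib import Path

# Entries taken from the repository tree shown above.
EXPECTED_FILES = ["model_index.json", "pipeline.py"]
EXPECTED_DIRS = [
    "feature_extractor", "image_encoder", "text_encoder", "tokenizer",
    "scheduler", "vae", "unet", "lgm",
]


def missing_entries(root):
    """Return expected files and folders that are absent under `root`."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means every listed file and component folder is present. It does not validate the weights inside them.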

## 💡 Recommended Use Cases

This model release is useful for:

- Fast **single-image-to-3D** prototyping
- **Text-to-3D** creative asset generation
- 3D generation course projects
- Research demos around 3D Gaussian Splatting
- Benchmarking recent 3D asset generation pipelines
- Building lightweight demos for Blender, Unity, or web-based 3D viewers

## ⚠️ Limitations

This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:

- Thin structures, transparent objects, wires, fur, or complex topology
- Highly reflective or texture-heavy objects
- Ambiguous single-view inputs where the back side is not visible
- Prompt-only generation requiring precise physical dimensions
- Production workflows requiring clean quad meshes, rigging, or CAD-level topology

For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.

## 🧪 Tips for Better Results

Good prompts usually describe:

```text
object category + style + material + lighting + geometry constraint
```

Examples:

```text
a cute robot, rounded toy design, smooth plastic material, studio lighting
a medieval treasure chest, golden metal details, wooden texture, clean geometry
a sci-fi helmet, hard-surface design, matte black material, sharp edges
a tiny house, stylized low-poly, warm colors, isometric game asset
```
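
The prompt pattern above can be assembled programmatically when generating many assets. The helper below is a small illustrative sketch; `build_prompt` and its parameter names are hypothetical and not part of the pipeline.

```python
def build_prompt(category, style=None, material=None, lighting=None, geometry=None):
    """Join the non-empty prompt parts with commas, in the order of the pattern above."""
    if not category or not category.strip():
        raise ValueError("category is required")
    parts = [category, style, material, lighting, geometry]
    return ", ".join(part.strip() for part in parts if part and part.strip())


prompt = build_prompt(
    "a cute robot",
    style="rounded toy design",
    material="smooth plastic material",
    lighting="studio lighting",
)
print(prompt)
```

The resulting string can be passed as the `prompt` argument in the text-to-3D example.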

For image-to-3D, use images with:

- A single centered object
- Clean background
- Clear object silhouette
- Minimal occlusion
- Good lighting
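
Non-square inputs are distorted by a direct resize to 256×256, so one option is to center the image on a square canvas first. The sketch below only computes the canvas size and paste offset; the helper name `square_canvas_layout` and the 10% default margin are assumptions, not values taken from the LGM pipeline.

```python
def square_canvas_layout(width, height, margin=0.1):
    """Return (side, (left, top)) for centering a width x height image on a square canvas.

    `margin` is the fraction of the canvas side left empty on each edge
    around the longer image dimension.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not 0 <= margin < 0.5:
        raise ValueError("margin must be in [0, 0.5)")
    longest = max(width, height)
    side = round(longest / (1 - 2 * margin))
    left = (side - width) // 2
    top = (side - height) // 2
    return side, (left, top)
```

With Pillow, the result could be applied by creating a canvas with `Image.new("RGB", (side, side), "white")` and calling `canvas.paste(image, (left, top))` before the `resize((256, 256))` step. The white background is an assumption here and may need adjusting for your inputs.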

## 🔗 Related Links

- Original paper: https://arxiv.org/abs/2402.05054
- Original project page: https://me.kiui.moe/lgm/
- Original GitHub repository: https://github.com/3DTopia/LGM
- Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full

## 🙏 Acknowledgements

This repository is based on the original LGM work and the upstream Hugging Face full pipeline release (`dylanebert/LGM-full`). Full credit for the original LGM method goes to the authors of:

**LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**

This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.


<div align="center">

### 🐙 Built for fast 3D generation experiments.

**From prompt or image to 3D Gaussian assets — clean, simple, and research-friendly.**

</div>