
---

license: mit
pipeline_tag: image-to-3d
library_name: diffusers
base_model: "dylanebert/LGM-full"
tags: ["image-to-3d", "text-to-3d", "3d-generation", "3d-gaussian-splatting", "gaussian-splatting", "multi-view-diffusion", "lgm", "diffusers", "safetensors", "objaverse", "research", "computer-graphics"]
arxiv: "2402.05054"
---


<div align="center">

# 🐙 WasabiOctopus / LGM

### Large Multi-View Gaussian Model for Fast 3D Asset Generation

<p>
  <img src="https://img.shields.io/badge/Task-Image--to--3D-blueviolet">
  <img src="https://img.shields.io/badge/Task-Text--to--3D-8A2BE2">
  <img src="https://img.shields.io/badge/Representation-3D%20Gaussian%20Splatting-orange">
  <img src="https://img.shields.io/badge/Library-Diffusers-yellow">
  <img src="https://img.shields.io/badge/License-MIT-green">
</p>

**A Diffusers-ready LGM pipeline for fast 3D content creation from text or a single image.**

</div>

## ✨ Highlights

- 🚀 **Fast 3D asset generation** powered by the LGM pipeline.
- 🧊 **3D Gaussian Splatting representation** for efficient high-resolution 3D content.
- 🖼️ **Text-to-3D and image-to-3D workflows** through multi-view diffusion.
- 🧩 **Diffusers-compatible model structure** with `LGMFullPipeline`.
- 🔬 Useful for **3D generation research, creative prototyping, course projects, and rapid experimentation**.

## 🖼️ Gallery

> Upload your own generated examples to an `assets/` folder and replace the placeholders below.

| Prompt / Input | Generated 3D Asset |
|---|---|
| `a cute robot, smooth toy material, studio lighting` | Coming soon |
| `a fantasy treasure chest with golden details` | Coming soon |
| `a stylized sci-fi helmet, clean hard-surface design` | Coming soon |

## 🧠 What is LGM?

**LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework designed for high-resolution 3D content creation.

Instead of generating a mesh directly, the pipeline first produces multiple views of the object with a multi-view diffusion model and then reconstructs a 3D Gaussian representation from those views. Because the reconstruction is feed-forward, the approach suits fast 3D asset generation from either a text prompt or a single input image.

This repository provides a convenient Hugging Face / Diffusers-style release of the full LGM pipeline.

## 🏗️ Pipeline Overview

```text
Text prompt or single image
        ↓
Multi-view diffusion generation
        ↓
Multi-view Gaussian features
        ↓
LGM reconstruction module
        ↓
3D Gaussian asset
        ↓
PLY export / downstream rendering
```

## 🚀 Quick Start

### 1. Install dependencies

```bash
pip install -U diffusers transformers accelerate safetensors
pip install torch torchvision torchaudio
pip install xformers trimesh kiui plyfile
```

For the full environment, check the repository `requirements.txt`.

### 2. Load the pipeline

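```python
import torch
from diffusers import DiffusionPipeline

repo_id = "WasabiOctopus/LGM"

pipe = DiffusionPipeline.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    trust_remote_code=True,     # allow Diffusers to run the custom pipeline code in this repo
)

# Move the pipeline to the GPU (requires a CUDA-capable device).
pipe = pipe.to("cuda")
```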

### 3. Text-to-3D generation

```python
prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"

# Generate the 3D Gaussians from the text prompt.
gaussians = pipe(
    prompt=prompt,
    num_inference_steps=50,
    guidance_scale=7.0,
)

# Export the result as a PLY file.
pipe.save_ply(gaussians, "robot.ply")
```

### 4. Image-to-3D generation

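```python
import numpy as np
from PIL import Image

# Load the input, force three channels, and resize to 256x256.
image = Image.open("input.png").convert("RGB").resize((256, 256))
# Convert to a float32 array scaled to [0, 1].
image = np.array(image).astype(np.float32) / 255.0

# The prompt is left empty; the image is the conditioning input.
gaussians = pipe(
    prompt="",
    image=image,
    num_inference_steps=50,
    guidance_scale=7.0,
)

# Export the result as a PLY file.
pipe.save_ply(gaussians, "asset_from_image.ply")
```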
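
Both workflows end with `save_ply`, so a quick sanity check on the exported file can be handy. The sketch below assumes the output follows the standard PLY layout, where the header declares an `element vertex N` line and each vertex corresponds to one Gaussian; that mapping is an assumption about this pipeline's export, not something this README confirms. The helper name `count_ply_vertices` is hypothetical.

```python
def count_ply_vertices(path):
    """Return the vertex count declared in a PLY header, or None if absent.

    Only the ASCII header is read, so this works for both ASCII and
    binary PLY bodies.
    """
    with open(path, "rb") as f:
        # Every PLY file starts with the magic line "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} does not look like a PLY file")
        for raw in f:
            line = raw.strip()
            if line.startswith(b"element vertex"):
                return int(line.split()[-1])
            if line == b"end_header":
                break
    return None
```

For example, `count_ply_vertices("robot.ply")` would report how many points the header declares, which is a cheap way to confirm that the export is not empty before opening it in a viewer.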

## 📦 Repository Contents

```text
WasabiOctopus/LGM
├── README.md
├── model_index.json
├── pipeline.py
├── requirements.txt
├── feature_extractor/
├── image_encoder/
├── text_encoder/
├── tokenizer/
├── scheduler/
├── vae/
├── unet/
└── lgm/
```
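
If a download is interrupted or a local copy is incomplete, loading can fail with errors that are hard to read. As a minimal sketch, the layout above can be checked against a local directory before loading. The entry list is copied from the tree above; the helper name `missing_entries` and the idea of checking a local snapshot are illustrative assumptions, not part of this repository.

```python
from pathlib import Path

# Entries taken from the repository tree shown above.
EXPECTED_ENTRIES = [
    "README.md",
    "model_index.json",
    "pipeline.py",
    "requirements.txt",
    "feature_extractor",
    "image_encoder",
    "text_encoder",
    "tokenizer",
    "scheduler",
    "vae",
    "unet",
    "lgm",
]


def missing_entries(snapshot_dir):
    """Return the expected files or folders that are absent from snapshot_dir."""
    root = Path(snapshot_dir)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]
```

A local copy can be obtained with `huggingface_hub.snapshot_download`, which returns the path of the downloaded folder; passing that path to `missing_entries` should yield an empty list for a complete download.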

## 💡 Recommended Use Cases

This model release is useful for:

- Fast **single-image-to-3D** prototyping
- **Text-to-3D** creative asset generation
- 3D generation course projects
- Research demos around 3D Gaussian Splatting
- Benchmarking recent 3D asset generation pipelines
- Building lightweight demos for Blender, Unity, or web-based 3D viewers

## ⚠️ Limitations

This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:

- Thin structures, transparent objects, wires, fur, or complex topology
- Highly reflective or texture-heavy objects
- Ambiguous single-view inputs where the back side is not visible
- Prompt-only generation requiring precise physical dimensions
- Production workflows requiring clean quad meshes, rigging, or CAD-level topology

For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.

## 🧪 Tips for Better Results

Good prompts usually describe:

```text
object category + style + material + lighting + geometry constraint
```

Examples:

```text
a cute robot, rounded toy design, smooth plastic material, studio lighting
a medieval treasure chest, golden metal details, wooden texture, clean geometry
a sci-fi helmet, hard-surface design, matte black material, sharp edges
a tiny house, stylized low-poly, warm colors, isometric game asset
```
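
When generating many assets, the prompt recipe above can be kept consistent with a small helper. This is only a sketch: the function name `build_prompt` and its parameters are hypothetical and simply mirror the five slots of the recipe, joined with commas as in the examples.

```python
def build_prompt(category, style=None, material=None, lighting=None, geometry=None):
    """Join the non-empty prompt slots with commas, in recipe order."""
    parts = [category, style, material, lighting, geometry]
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt(
    "a cute robot",
    style="rounded toy design",
    material="smooth plastic material",
    lighting="studio lighting",
)
print(prompt)
# a cute robot, rounded toy design, smooth plastic material, studio lighting
```

Slots that are left out are skipped, so the same helper covers short prompts such as `build_prompt("a tiny house", geometry="clean geometry")`.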

For image-to-3D, use images with:

- A single centered object
- Clean background
- Clear object silhouette
- Minimal occlusion
- Good lighting
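
Two of these points, a clean background and an undistorted silhouette, can be approximated in preprocessing when the input is an RGBA cut-out. The sketch below flattens the alpha channel onto a white background and pads the result to a square, so that a later resize to 256×256 does not stretch the object. It assumes a white background is acceptable for this pipeline, which this README does not confirm; the helper name `prepare_rgba` is hypothetical.

```python
import numpy as np


def prepare_rgba(rgba):
    """Flatten an RGBA image onto white and pad it to a square.

    rgba: array of shape (H, W, 4) with values in [0, 1].
    Returns a float32 array of shape (S, S, 3), where S = max(H, W).
    """
    rgba = np.asarray(rgba, dtype=np.float32)
    rgb, alpha = rgba[..., :3], rgba[..., 3:4]
    # Alpha-composite over a white background (white = 1.0).
    composited = rgb * alpha + (1.0 - alpha)

    h, w = composited.shape[:2]
    side = max(h, w)
    # Start from a white square and paste the image in its center.
    canvas = np.ones((side, side, 3), dtype=np.float32)
    top, left = (side - h) // 2, (side - w) // 2
    canvas[top:top + h, left:left + w] = composited
    return canvas
```

An RGBA array can be obtained with `np.array(Image.open(path).convert("RGBA")).astype(np.float32) / 255.0`; the returned square image can then be resized to the resolution used in the image-to-3D example.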

## 🔗 Related Links

- Original paper: https://arxiv.org/abs/2402.05054
- Original project page: https://me.kiui.moe/lgm/
- Original GitHub repository: https://github.com/3DTopia/LGM
- Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full

## 🙏 Acknowledgements

This repository is based on the LGM ecosystem and the upstream Hugging Face full pipeline release. Full credit for the original LGM method goes to the authors of:

**LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**

This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.


<div align="center">

### 🐙 Built for fast 3D generation experiments.

**From prompt or image to 3D Gaussian assets: clean, simple, and research-friendly.**

</div>