Duplicated from dylanebert/LGM-full


@@ -1,12 +1,249 @@
 ---
-license: mit
-pipeline_tag: image-to-3d
 ---
-# LGM Full
-This custom pipeline encapsulates the full [LGM](https://huggingface.co/ashawkey/LGM) pipeline, including [multi-view diffusion](https://huggingface.co/ashawkey/imagedream-ipmv-diffusers).
-It is provided as a resource for the [ML for 3D Course](https://huggingface.co/learn/ml-for-3d-course).
-Original LGM paper: [LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation](https://huggingface.co/papers/2402.05054).

+<div align="center">
+# 🐙 WasabiOctopus / LGM
+### Large Multi-View Gaussian Model for Fast 3D Asset Generation
+<p>
+  <img src="https://img.shields.io/badge/Task-Image--to--3D-blueviolet">
+  <img src="https://img.shields.io/badge/Task-Text--to--3D-8A2BE2">
+  <img src="https://img.shields.io/badge/Representation-3D%20Gaussian%20Splatting-orange">
+  <img src="https://img.shields.io/badge/Library-Diffusers-yellow">
+  <img src="https://img.shields.io/badge/License-MIT-green">
+</p>
+**A Diffusers-ready LGM pipeline for fast 3D content creation from text or a single image.**
+</div>
+---
+## ✨ Highlights
+* 🚀 **Fast 3D asset generation** powered by the LGM pipeline.
+* 🧊 **3D Gaussian Splatting representation** for efficient high-resolution 3D content.
+* 🖼️ **Text-to-3D and image-to-3D workflows** through multi-view diffusion.
+* 🧩 **Diffusers-compatible model structure** with `LGMFullPipeline`.
+* 🔬 Useful for **3D generation research, creative prototyping, course projects, and rapid experimentation**.
+---
+## 🖼️ Gallery
+> Upload your own generated examples to an `assets/` folder and replace the placeholders below.
+| Prompt / Input                                        | Generated 3D Asset |
+| ----------------------------------------------------- | ------------------ |
+| `a cute robot, smooth toy material, studio lighting`  | Coming soon        |
+| `a fantasy treasure chest with golden details`        | Coming soon        |
+| `a stylized sci-fi helmet, clean hard-surface design` | Coming soon        |
+---
+## 🧠 What is LGM?
+**LGM**, short for **Large Multi-View Gaussian Model**, is a 3D generation framework designed for high-resolution 3D content creation.
+Rather than generating a mesh directly, the pipeline first generates multi-view images with a diffusion model and then reconstructs a 3D Gaussian representation from those views in a feed-forward pass. This makes it suitable for fast 3D asset generation from either a text prompt or a single input image.
+This repository provides a convenient Hugging Face / Diffusers-style release of the full LGM pipeline.
 ---
+## 🏗️ Pipeline Overview
+```text
+Text prompt or single image
+        ↓
+Multi-view diffusion generation
+        ↓
+Multi-view Gaussian features
+        ↓
+LGM reconstruction module
+        ↓
+3D Gaussian asset
+        ↓
+PLY export / downstream rendering
+```
+---
+## 🚀 Quick Start
+### 1. Install dependencies
+```bash
+pip install -U diffusers transformers accelerate safetensors
+pip install torch torchvision torchaudio
+pip install xformers trimesh kiui plyfile
+```
+For the full environment, check the repository `requirements.txt`.
+### 2. Load the pipeline
+```python
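+import torch
+from diffusers import DiffusionPipeline
+
+repo_id = "WasabiOctopus/LGM"
+
+# trust_remote_code=True allows diffusers to run the repository's custom
+# pipeline code (pipeline.py), which provides LGMFullPipeline.
+# If the class is not resolved, also passing custom_pipeline=repo_id may help
+# (an assumption based on how Hub-hosted custom pipelines are usually loaded).
+pipe = DiffusionPipeline.from_pretrained(
+    repo_id,
+    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
+    trust_remote_code=True,
+)
+pipe = pipe.to("cuda")  # a CUDA-capable GPU is assumed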
+```
+### 3. Text-to-3D generation
+```python
+prompt = "a cute robot, smooth toy material, studio lighting, clean geometry"
+gaussians = pipe(
+    prompt=prompt,
+    num_inference_steps=50,
+    guidance_scale=7.0,
+)
+pipe.save_ply(gaussians, "robot.ply")
+```
+### 4. Image-to-3D generation
+```python
+import numpy as np
+from PIL import Image
+
+# Load the input as RGB at 256x256 and scale pixel values to [0, 1].
+image = Image.open("input.png").convert("RGB").resize((256, 256))
+image = np.asarray(image, dtype=np.float32) / 255.0
+
+gaussians = pipe(
+    prompt="",  # no text prompt; the image drives generation
+    image=image,
+    num_inference_steps=50,
+    guidance_scale=7.0,
+)
+pipe.save_ply(gaussians, "asset_from_image.ply")
+```
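A quick way to sanity-check an exported file is to read its PLY header, which lists the number of points and the per-point properties. The sketch below uses only the Python standard library. It assumes the file written by `save_ply` is a standard PLY with a `vertex` element, as is typical for 3D Gaussian Splatting exports; `read_ply_header` is a hypothetical helper and not part of the pipeline.

```python
def read_ply_header(path):
    """Return (vertex_count, property_names) parsed from a PLY file header.

    Hypothetical helper: assumes a standard PLY header with a `vertex` element.
    """
    vertex_count = 0
    properties = []
    in_vertex = False
    with open(path, "rb") as f:
        # Every PLY file starts with the magic line "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} does not start with the PLY magic line")
        for raw in f:
            tokens = raw.decode("ascii", errors="replace").split()
            if not tokens:
                continue
            if tokens[0] == "end_header":
                break  # binary or ASCII payload follows; stop reading here
            if tokens[0] == "element":
                in_vertex = tokens[1] == "vertex"
                if in_vertex:
                    vertex_count = int(tokens[2])
            elif tokens[0] == "property" and in_vertex:
                # The property name is the last token on the line.
                properties.append(tokens[-1])
    return vertex_count, properties
```

For example, `read_ply_header("robot.ply")` would return the number of Gaussians in the file and the names of the attributes stored for each one, which is usually enough to confirm that the export succeeded before opening it in a viewer.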
+---
+## 📦 Repository Contents
+```text
+WasabiOctopus/LGM
+├── README.md
+├── model_index.json
+├── pipeline.py
+├── requirements.txt
+├── feature_extractor/
+├── image_encoder/
+├── text_encoder/
+├── tokenizer/
+├── scheduler/
+├── vae/
+├── unet/
+└── lgm/
+```
+---
+## 💡 Recommended Use Cases
+This model release is useful for:
+* Fast **single-image-to-3D** prototyping
+* **Text-to-3D** creative asset generation
+* 3D generation course projects
+* Research demos around 3D Gaussian Splatting
+* Benchmarking recent 3D asset generation pipelines
+* Building lightweight demos for Blender, Unity, or web-based 3D viewers
+---
+## ⚠️ Limitations
+This model is a research-oriented 3D generation pipeline. It may produce imperfect geometry or artifacts in the following cases:
+* Thin structures, transparent objects, wires, fur, or complex topology
+* Highly reflective or texture-heavy objects
+* Ambiguous single-view inputs where the back side is not visible
+* Prompt-only generation requiring precise physical dimensions
+* Production workflows requiring clean quad meshes, rigging, or CAD-level topology
+For professional 3D asset production, additional post-processing may be needed, such as mesh extraction, topology cleanup, UV unwrapping, material editing, or manual refinement.
+---
+## 🧪 Tips for Better Results
+Good prompts usually describe:
+```text
+object category + style + material + lighting + geometry constraint
+```
+Examples:
+```text
+a cute robot, rounded toy design, smooth plastic material, studio lighting
+a medieval treasure chest, golden metal details, wooden texture, clean geometry
+a sci-fi helmet, hard-surface design, matte black material, sharp edges
+a tiny house, stylized low-poly, warm colors, isometric game asset
+```
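The template above can be applied mechanically. As a minimal sketch, the hypothetical `build_prompt` helper below joins the non-empty parts in the recommended order; the function and its parameter names are illustrative and not part of the pipeline.

```python
def build_prompt(category, style="", material="", lighting="", geometry=""):
    """Join prompt parts in the order: category, style, material, lighting, geometry.

    Hypothetical helper for illustration; empty parts are skipped.
    """
    if not category.strip():
        raise ValueError("an object category is required")
    parts = [category, style, material, lighting, geometry]
    return ", ".join(p.strip() for p in parts if p.strip())


prompt = build_prompt(
    "a cute robot",
    style="rounded toy design",
    material="smooth plastic material",
    lighting="studio lighting",
)
print(prompt)
# prints: a cute robot, rounded toy design, smooth plastic material, studio lighting
```

Keeping the parts separate makes it easy to vary one attribute at a time, for example swapping only the material, when comparing generations.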
+For image-to-3D, use images with:
+* A single centered object
+* Clean background
+* Clear object silhouette
+* Minimal occlusion
+* Good lighting
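For inputs that are not already square, one option is to pad the image onto a square canvas before resizing, so the object stays centered and is not distorted. The sketch below only computes the canvas size and paste offset in pure Python. `square_canvas` and its `margin` parameter are hypothetical names chosen for illustration; the pipeline does not require this step.

```python
def square_canvas(width, height, margin=0.1):
    """Return (side, (left, top)) for centering a width x height image.

    Hypothetical helper: `margin` is the fraction of the canvas side left
    empty on each edge along the image's longer dimension.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not 0 <= margin < 0.5:
        raise ValueError("margin must be in [0, 0.5)")
    # Size the canvas so the longer image side fills (1 - 2 * margin) of it.
    side = round(max(width, height) / (1 - 2 * margin))
    left = (side - width) // 2
    top = (side - height) // 2
    return side, (left, top)


side, offset = square_canvas(400, 300, margin=0.1)
print(side, offset)
# prints: 500 (50, 100)
```

With Pillow, the result could be used as `canvas = Image.new("RGB", (side, side), "white")` followed by `canvas.paste(image, offset)` and `canvas.resize((256, 256))`, which matches the 256x256 input size used in the image-to-3D example. The white background is an assumption, not a documented requirement.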
+---
+## 🔗 Related Links
+* Original paper: https://arxiv.org/abs/2402.05054
+* Original project page: https://me.kiui.moe/lgm/
+* Original GitHub repository: https://github.com/3DTopia/LGM
+* Upstream Hugging Face model: https://huggingface.co/dylanebert/LGM-full
+---
+## 🙏 Acknowledgements
+This repository is duplicated from the upstream Hugging Face full pipeline release, [dylanebert/LGM-full](https://huggingface.co/dylanebert/LGM-full). Full credit for the original LGM method goes to the authors of:
+**LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation**
+This release is intended as a convenient Hugging Face / Diffusers-compatible resource for research, education, and rapid experimentation.
+---
+## 📚 Citation
+If you use this model or the original LGM method, please cite:
+```bibtex
+@article{tang2024lgm,
+  title={LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation},
+  author={Tang, Jiaxiang and Chen, Zhaoxi and Chen, Xiaokang and Wang, Tengfei and Zeng, Gang and Liu, Ziwei},
+  journal={arXiv preprint arXiv:2402.05054},
+  year={2024}
+}
+```
 ---
+<div align="center">
+### 🐙 Built for fast 3D generation experiments.
+**From prompt or image to 3D Gaussian assets: clean, simple, and research-friendly.**
+</div>