MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
Paper: arXiv:2412.03558
import torch
from diffusers import DiffusionPipeline
# switch to "mps" for apple devices
pipe = DiffusionPipeline.from_pretrained("VAST-AI/MIDI-3D", dtype=torch.bfloat16, device_map="cuda")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipe(prompt).images[0]

MIDI is a 3D generative model that produces a compositional 3D scene from a single input image. It was introduced in the paper "MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation".
Project page: https://huanngzh.github.io/MIDI-Page/