If you encounter pipeline loading failures or unexpected output, please contact bili_sakura@zju.edu.cn.
# DiffusionSat Custom Pipelines

Custom community pipelines for loading DiffusionSat checkpoints directly with `diffusers.DiffusionPipeline.from_pretrained()`.

## Model Index

`model_index.json` is set to the default text-to-image pipeline (`DiffusionSatPipeline`) so `DiffusionPipeline.from_pretrained()` works out of the box. The ControlNet variant is loaded via `custom_pipeline` plus the `controlnet` subfolder, as shown below.
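As a quick sanity check, you can confirm which pipeline class the checkpoint resolves to by default. A minimal sketch, assuming the checkpoint path used in the examples below:

```python
import json

# Inspect the pipeline class recorded in model_index.json
# (path is illustrative; adjust to your checkpoint location).
with open("path/to/ckpt/diffusionsat/model_index.json") as f:
    index = json.load(f)

print(index["_class_name"])  # expected: "DiffusionSatPipeline"
```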
## Available Pipelines

This directory contains two custom pipelines:

- `pipeline_diffusionsat.py`: Standard text-to-image pipeline with DiffusionSat metadata support.
- `pipeline_diffusionsat_controlnet.py`: ControlNet pipeline with DiffusionSat metadata and conditional metadata support.
## Setup

The checkpoint folder (`ckpt/diffusionsat/`) should contain the standard diffusers components (`unet`, `vae`, `scheduler`, etc.). You can reference these pipeline files directly from this directory or copy them to your checkpoint folder.
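A minimal sketch to verify the expected components are present. The exact list follows the usual Stable Diffusion layout; `text_encoder` and `tokenizer` are assumed here in addition to the components named above:

```python
from pathlib import Path

# Adjust to your checkpoint location.
ckpt = Path("path/to/ckpt/diffusionsat")

# Components assumed for a Stable Diffusion-style checkpoint layout.
expected = ["unet", "vae", "text_encoder", "tokenizer", "scheduler"]
missing = [name for name in expected if not (ckpt / name).is_dir()]
print("missing components:", missing or "none")
```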
## Usage

### 1. Text-to-Image Pipeline

Use `pipeline_diffusionsat.py` for standard generation.
```python
import torch
from diffusers import DiffusionPipeline

# Load pipeline
pipe = DiffusionPipeline.from_pretrained(
    "path/to/ckpt/diffusionsat",
    custom_pipeline="./pipeline_diffusionsat.py",  # Path to this file
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")

# Optional: Metadata (normalized lat, lon, timestamp, GSD, etc.)
# metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]

# Generate
image = pipe(
    "satellite image of farmland",
    metadata=None,  # Optional
    height=512,
    width=512,
    num_inference_steps=30,
).images[0]
```
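To condition on metadata, pass the normalized values shown in the comment above. A minimal sketch with illustrative values; see `pipeline_diffusionsat.py` for the exact ordering and normalization:

```python
# Illustrative normalized metadata (lat, lon, timestamp, GSD, etc.);
# check pipeline_diffusionsat.py for the exact format.
metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]

image = pipe(
    "satellite image of farmland",
    metadata=metadata,
    height=512,
    width=512,
    num_inference_steps=30,
).images[0]
image.save("farmland.png")
```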
### 2. ControlNet Pipeline

Use `pipeline_diffusionsat_controlnet.py` for ControlNet generation.
```python
import torch
import numpy as np
from PIL import Image
from diffusers import DiffusionPipeline, ControlNetModel

# 1. Load the fMoW 2D ControlNet
controlnet = ControlNetModel.from_pretrained(
    "path/to/ckpt/diffusionsat/controlnet",
    torch_dtype=torch.float16,
    conditioning_channels=10,
)

# 2. Load pipeline with ControlNet
pipe = DiffusionPipeline.from_pretrained(
    "path/to/ckpt/diffusionsat",
    controlnet=controlnet,
    custom_pipeline="./pipeline_diffusionsat_controlnet.py",  # Path to this file
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")

# 3. Prepare 10-channel conditioning tensor (RGB + 7 zero channels)
control_image = Image.open("path/to/conditioning_image.png").convert("RGB").resize((256, 256))
rgb = torch.from_numpy(np.array(control_image)).permute(2, 0, 1).unsqueeze(0).float() / 255.0
extra = torch.zeros((1, 7, 256, 256), dtype=rgb.dtype)
control_tensor = torch.cat([rgb, extra], dim=1).to(device="cuda", dtype=torch.float16)

# 4. Generate
image = pipe(
    prompt="satellite image of farmland",
    image=control_tensor,
    metadata=None,
    cond_metadata=None,
    height=256,
    width=256,
    num_inference_steps=30,
).images[0]
```
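When metadata is available for both the generated image and the conditioning image, both arguments can be supplied. A minimal sketch with illustrative values; see `pipeline_diffusionsat_controlnet.py` for the exact format:

```python
# Illustrative normalized metadata for the target image and the conditioning
# image; check pipeline_diffusionsat_controlnet.py for the exact ordering.
metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]
cond_metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]

image = pipe(
    prompt="satellite image of farmland",
    image=control_tensor,
    metadata=metadata,
    cond_metadata=cond_metadata,
    height=256,
    width=256,
    num_inference_steps=30,
).images[0]
image.save("controlnet_farmland.png")
```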