If you encounter pipeline loading failures or unexpected output, please contact bili_sakura@zju.edu.cn.
# DiffusionSat Custom Pipelines

Custom community pipelines for loading DiffusionSat checkpoints directly with `diffusers.DiffusionPipeline.from_pretrained()`.

## Model Index

`model_index.json` is set to the default text-to-image pipeline (`DiffusionSatPipeline`) so `DiffusionPipeline.from_pretrained()` works out of the box. The ControlNet variant is loaded via `custom_pipeline` plus the `controlnet` subfolder, as shown below.
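As a quick sanity check, you can confirm which pipeline class the checkpoint resolves to by default. A minimal sketch, assuming the checkpoint path used in the examples below:

```python
import json

# Inspect the pipeline class recorded in model_index.json
# (path is illustrative; adjust to your checkpoint location).
with open("path/to/ckpt/diffusionsat/model_index.json") as f:
    index = json.load(f)

print(index["_class_name"])  # expected: "DiffusionSatPipeline"
```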
## Available Pipelines

This directory contains two custom pipelines:

- `pipeline_diffusionsat.py`: Standard text-to-image pipeline with DiffusionSat metadata support.
- `pipeline_diffusionsat_controlnet.py`: ControlNet pipeline with DiffusionSat metadata and conditional metadata support.
## Setup

The checkpoint folder (`ckpt/diffusionsat/`) should contain the standard diffusers components (`unet`, `vae`, `scheduler`, etc.). You can reference these pipeline files directly from this directory or copy them to your checkpoint folder.
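A minimal sketch to verify the expected components are present. The exact list follows the usual Stable Diffusion layout; `text_encoder` and `tokenizer` are assumed here in addition to the components named above:

```python
from pathlib import Path

# Adjust to your checkpoint location.
ckpt = Path("path/to/ckpt/diffusionsat")

# Components assumed for a Stable Diffusion-style checkpoint layout.
expected = ["unet", "vae", "text_encoder", "tokenizer", "scheduler"]
missing = [name for name in expected if not (ckpt / name).is_dir()]
print("missing components:", missing or "none")
```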
## Usage

### 1. Text-to-Image Pipeline

Use `pipeline_diffusionsat.py` for standard generation.
```python
import torch
from diffusers import DiffusionPipeline

# Load pipeline
pipe = DiffusionPipeline.from_pretrained(
    "path/to/ckpt/diffusionsat",
    custom_pipeline="./pipeline_diffusionsat.py",  # Path to this file
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")

# Optional: Metadata (normalized lat, lon, timestamp, GSD, etc.)
# metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]

# Generate
image = pipe(
    "satellite image of farmland",
    metadata=None,  # Optional
    height=512,
    width=512,
    num_inference_steps=30,
).images[0]
```
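To condition on metadata, pass the normalized values shown in the comment above. A minimal sketch with illustrative values; see `pipeline_diffusionsat.py` for the exact ordering and normalization:

```python
# Illustrative normalized metadata (lat, lon, timestamp, GSD, etc.);
# check pipeline_diffusionsat.py for the exact format.
metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]

image = pipe(
    "satellite image of farmland",
    metadata=metadata,
    height=512,
    width=512,
    num_inference_steps=30,
).images[0]
image.save("farmland.png")
```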
### 2. ControlNet Pipeline

Use `pipeline_diffusionsat_controlnet.py` for ControlNet generation.
```python
import torch
import numpy as np
from PIL import Image
from diffusers import DiffusionPipeline, ControlNetModel

# 1. Load the fMoW 2D ControlNet
controlnet = ControlNetModel.from_pretrained(
    "path/to/ckpt/diffusionsat/controlnet",
    torch_dtype=torch.float16,
    conditioning_channels=10,
)

# 2. Load pipeline with ControlNet
pipe = DiffusionPipeline.from_pretrained(
    "path/to/ckpt/diffusionsat",
    controlnet=controlnet,
    custom_pipeline="./pipeline_diffusionsat_controlnet.py",  # Path to this file
    torch_dtype=torch.float16,
    trust_remote_code=True,
)
pipe = pipe.to("cuda")

# 3. Prepare 10-channel conditioning tensor (RGB + 7 zero channels)
control_image = Image.open("path/to/conditioning_image.png").convert("RGB").resize((256, 256))
rgb = torch.from_numpy(np.array(control_image)).permute(2, 0, 1).unsqueeze(0).float() / 255.0
extra = torch.zeros((1, 7, 256, 256), dtype=rgb.dtype)
control_tensor = torch.cat([rgb, extra], dim=1).to(device="cuda", dtype=torch.float16)

# 4. Generate
image = pipe(
    prompt="satellite image of farmland",
    image=control_tensor,
    metadata=None,
    cond_metadata=None,
    height=256,
    width=256,
    num_inference_steps=30,
).images[0]
```
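When metadata is available for both the generated image and the conditioning image, both arguments can be supplied. A minimal sketch with illustrative values; see `pipeline_diffusionsat_controlnet.py` for the exact format:

```python
# Illustrative normalized metadata for the target image and the conditioning
# image; check pipeline_diffusionsat_controlnet.py for the exact ordering.
metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]
cond_metadata = [0.5, -0.3, 0.7, 0.2, 0.1, 0.0, 0.5]

image = pipe(
    prompt="satellite image of farmland",
    image=control_tensor,
    metadata=metadata,
    cond_metadata=cond_metadata,
    height=256,
    width=256,
    num_inference_steps=30,
).images[0]
image.save("controlnet_farmland.png")
```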