We do not yet have full checkpoint-conversion validation. If you encounter a pipeline loading failure or unexpected output, please contact me at bili_sakura@zju.edu.cn.

BiliSakura/CRS-Diff

Diffusers-style packaging for the CRS-Diff checkpoint, with a custom Hugging Face DiffusionPipeline implementation.

Model Details

  • Base project: CRS-Diff (Controllable Remote Sensing Image Generation with Diffusion Model)
  • Checkpoint source: /root/worksapce/models/raw/CRS-Diff/last.ckpt
  • Pipeline class: CRSDiffPipeline (in pipeline.py)
  • Scheduler: DDIMScheduler
  • Resolution: 512x512 (default in training/inference config)

Repository Structure

CRS-Diff/
  pipeline.py
  modular_pipeline.py
  crs_core/
    autoencoder.py
    text_encoder.py
    local_adapter.py
    global_adapter.py
    metadata_embedding.py
    modules/
  model_index.json
  scheduler/
    scheduler_config.json
  unet/
  vae/
  text_encoder/
  local_adapter/
  global_content_adapter/
  global_text_adapter/
  metadata_encoder/

Usage

Install dependencies first:

pip install diffusers transformers torch torchvision omegaconf einops safetensors pytorch-lightning

Load the pipeline (local path or Hub repo), then run inference:

import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "/root/worksapce/models/BiliSakura/CRS-Diff",
    custom_pipeline="pipeline.py",
    trust_remote_code=True,
    model_path="/root/worksapce/models/BiliSakura/CRS-Diff",
)
pipe = pipe.to("cuda")

# Example placeholder controls; replace with real CRS controls.
b = 1  # batch size
# 18-channel spatial control map at the 512x512 training resolution
local_control = torch.zeros((b, 18, 512, 512), device="cuda", dtype=torch.float32)
# 1536-dim global conditioning embedding
global_control = torch.zeros((b, 1536), device="cuda", dtype=torch.float32)
# 7 scalar metadata values per sample
metadata = torch.zeros((b, 7), device="cuda", dtype=torch.float32)

out = pipe(
    prompt=["a remote sensing image of an urban area"],
    negative_prompt=["blurry, distorted, overexposed"],
    local_control=local_control,
    global_control=global_control,
    metadata=metadata,
    num_inference_steps=50,
    guidance_scale=7.5,
    eta=0.0,
    strength=1.0,
    global_strength=1.0,
    output_type="pil",
)
image = out.images[0]
image.save("crs_diff_sample.png")
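How the 18 local-control channels are divided among condition types is defined by the CRS-Diff training setup. The sketch below shows only the mechanical step of concatenating per-condition maps into the (b, 18, 512, 512) tensor the pipeline expects; the six 3-channel maps are an assumption for illustration, not the actual CRS-Diff channel layout:

```python
import torch

b = 1
# Hypothetical per-condition maps, each 3 channels at 512x512 (the real
# number and meaning of conditions comes from the CRS-Diff training config).
condition_maps = [torch.zeros((b, 3, 512, 512)) for _ in range(6)]

# Concatenate along the channel axis to form the 18-channel control tensor.
local_control = torch.cat(condition_maps, dim=1)
print(tuple(local_control.shape))  # (1, 18, 512, 512)
```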

Notes

  • This repository is packaged in a diffusers-compatible layout with a custom pipeline.
  • Loading path follows the same placeholder-aware custom pipeline pattern as HSIGene.
  • Split component weights are provided in diffusers-style folders (unet/, vae/, adapters, and encoders).
  • Monolithic crs_model/last.ckpt fallback is intentionally removed; this repo is split-components only.
  • Legacy external source trees (models/, ldm/) are removed; runtime code is in lightweight crs_core/.
  • CRSDiffPipeline expects CRS-specific condition tensors (local_control, global_control, metadata).
  • If you publish to Hugging Face Hub, keep trust_remote_code=True when loading.
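The guidance_scale=7.5 used in the usage example is standard classifier-free guidance: the denoiser runs on unconditional and conditional inputs and the two noise predictions are blended. A minimal sketch of that combination step, with toy tensors standing in for the real UNet outputs:

```python
import torch

guidance_scale = 7.5
# Toy stand-ins for the UNet's unconditional and conditional noise predictions.
noise_uncond = torch.full((1, 4, 64, 64), 0.1)
noise_cond = torch.full((1, 4, 64, 64), 0.3)

# Classifier-free guidance: push the prediction toward the conditional branch.
noise_pred = noise_uncond + guidance_scale * (noise_cond - noise_uncond)
print(round(noise_pred.mean().item(), 4))  # 1.6
```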

Citation

@article{tang2024crs,
  title={Crs-diff: Controllable remote sensing image generation with diffusion model},
  author={Tang, Datao and Cao, Xiangyong and Hou, Xingsong and Jiang, Zhongyuan and Liu, Junmin and Meng, Deyu},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  year={2024},
  publisher={IEEE}
}