---
license: apache-2.0
library_name: diffusers
pipeline_tag: text-to-image
tags:
- remote-sensing
- diffusion
- controlnet
- custom-pipeline
language:
- en
---

> [!WARNING]
> Checkpoint conversion has not been fully validated. If you encounter a pipeline loading failure or unexpected output, please contact me at bili_sakura@zju.edu.cn.

# BiliSakura/CRS-Diff

Diffusers-style packaging of the CRS-Diff checkpoint, with a custom Hugging Face `DiffusionPipeline` implementation.

## Model Details

- **Base project**: `CRS-Diff` (Controllable Remote Sensing Image Generation with Diffusion Model)
- **Checkpoint source**: `/root/worksapce/models/raw/CRS-Diff/last.ckpt`
- **Pipeline class**: `CRSDiffPipeline` (in `pipeline.py`)
- **Scheduler**: `DDIMScheduler`
- **Resolution**: 512x512 (the default in the training/inference config)

## Repository Structure

```text
CRS-Diff/
  pipeline.py
  modular_pipeline.py
  crs_core/
    autoencoder.py
    text_encoder.py
    local_adapter.py
    global_adapter.py
    metadata_embedding.py
    modules/
  model_index.json
  scheduler/
    scheduler_config.json
  unet/
  vae/
  text_encoder/
  local_adapter/
  global_content_adapter/
  global_text_adapter/
  metadata_encoder/
```
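
In a diffusers layout, `model_index.json` maps each component folder above to the `(library, class)` pair that `DiffusionPipeline.from_pretrained` will instantiate. A minimal sketch for inspecting a local snapshot (the helper name is ours, not part of the repository):

```python
import json
import pathlib

def list_components(repo_dir):
    """Return (name, library, class) triples declared in model_index.json."""
    index = json.loads((pathlib.Path(repo_dir) / "model_index.json").read_text())
    return [
        (name, value[0], value[1])
        for name, value in index.items()
        # Keys like "_class_name" hold pipeline-level metadata (plain strings),
        # not loadable components, so keep only the [library, class] pairs.
        if isinstance(value, list) and len(value) == 2
    ]
```

Keys beginning with `_` (e.g. `_class_name`) are pipeline-level metadata rather than loadable components, which is why the sketch filters on list-valued entries.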

## Usage

Install dependencies first:

```bash
pip install diffusers transformers torch torchvision omegaconf einops safetensors pytorch-lightning
```

Load the pipeline (local path or Hub repo), then run inference:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "/root/worksapce/models/BiliSakura/CRS-Diff",
    custom_pipeline="pipeline.py",
    trust_remote_code=True,
    model_path="/root/worksapce/models/BiliSakura/CRS-Diff",
)
pipe = pipe.to("cuda")

# Example placeholder controls; replace with real CRS controls.
b = 1
local_control = torch.zeros((b, 18, 512, 512), device="cuda", dtype=torch.float32)
global_control = torch.zeros((b, 1536), device="cuda", dtype=torch.float32)
metadata = torch.zeros((b, 7), device="cuda", dtype=torch.float32)

out = pipe(
    prompt=["a remote sensing image of an urban area"],
    negative_prompt=["blurry, distorted, overexposed"],
    local_control=local_control,
    global_control=global_control,
    metadata=metadata,
    num_inference_steps=50,
    guidance_scale=7.5,
    eta=0.0,
    strength=1.0,
    global_strength=1.0,
    output_type="pil",
)
image = out.images[0]
image.save("crs_diff_sample.png")
```
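
The placeholder tensors above are all zeros; real runs need actual condition maps. As an illustration only (the layout of the 18 `local_control` channels is an assumption here, not documented behavior; consult the CRS-Diff configs for the real channel order), a helper that scales a `(512, 512, 3)` uint8 map into `[0, 1]` and places it in the first three channels of a zero-initialized stack:

```python
import numpy as np

def pack_condition_map(rgb_uint8, num_channels=18, size=512):
    """Pack an (H, W, 3) uint8 condition map into a (1, C, H, W) float32 stack.

    Channels 0-2 hold the scaled map; the remaining channels stay zero.
    The channel assignment is illustrative, not the pipeline's documented layout.
    """
    if rgb_uint8.shape != (size, size, 3):
        raise ValueError(f"expected ({size}, {size}, 3), got {rgb_uint8.shape}")
    stack = np.zeros((num_channels, size, size), dtype=np.float32)
    # HWC uint8 -> CHW float32 in [0, 1]
    stack[:3] = rgb_uint8.astype(np.float32).transpose(2, 0, 1) / 255.0
    return stack[None]  # (1, 18, 512, 512), ready for torch.from_numpy(...)
```

The result can be moved to the GPU with `torch.from_numpy(packed).to("cuda")` and passed as `local_control`.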

## Notes

- This repository is packaged in a diffusers-compatible layout with a custom pipeline.
- The loading path follows the same placeholder-aware custom-pipeline pattern as HSIGene.
- Split component weights are provided in diffusers-style folders (`unet/`, `vae/`, adapters, and encoders).
- The monolithic `crs_model/last.ckpt` fallback is intentionally removed; this repo ships split components only.
- Legacy external source trees (`models/`, `ldm/`) are removed; runtime code lives in the lightweight `crs_core/` package.
- `CRSDiffPipeline` expects CRS-specific condition tensors (`local_control`, `global_control`, `metadata`).
- If you publish this repo to the Hugging Face Hub, keep `trust_remote_code=True` when loading it.
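
Since shape mismatches in the condition tensors otherwise surface only deep inside the UNet, a small check before calling the pipeline fails fast with a readable message. A hypothetical helper (not part of the repository), using the per-sample shapes from the usage example above:

```python
# Expected per-sample shapes, taken from the usage example above.
EXPECTED_SHAPES = {
    "local_control": (18, 512, 512),  # 18 condition channels at 512x512
    "global_control": (1536,),        # pooled global embedding
    "metadata": (7,),                 # 7 metadata scalars
}

def check_controls(batch_size, **tensors):
    """Raise ValueError on the first control tensor with an unexpected shape."""
    for name, per_sample in EXPECTED_SHAPES.items():
        got = tuple(tensors[name].shape)
        want = (batch_size,) + per_sample
        if got != want:
            raise ValueError(f"{name}: expected shape {want}, got {got}")
```

Call it as `check_controls(b, local_control=..., global_control=..., metadata=...)` right before invoking the pipeline.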

## Citation

```bibtex
@article{tang2024crs,
  title={{CRS-Diff}: Controllable remote sensing image generation with diffusion model},
  author={Tang, Datao and Cao, Xiangyong and Hou, Xingsong and Jiang, Zhongyuan and Liu, Junmin and Meng, Deyu},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  year={2024},
  publisher={IEEE}
}
```