---
license: apache-2.0
library_name: diffusers
pipeline_tag: text-to-image
tags:
- remote-sensing
- diffusion
- controlnet
- custom-pipeline
language:
- en
---

> [!WARNING]
> Checkpoint conversion has not been fully validated. If you encounter a pipeline loading failure or unexpected output, please contact me at bili_sakura@zju.edu.cn.

# BiliSakura/CRS-Diff

Diffusers-style packaging of the CRS-Diff checkpoint, with a custom Hugging Face `DiffusionPipeline` implementation.

## Model Details

- **Base project**: `CRS-Diff` (Controllable Remote Sensing Image Generation with Diffusion Model)
- **Checkpoint source**: `/root/worksapce/models/raw/CRS-Diff/last.ckpt`
- **Pipeline class**: `CRSDiffPipeline` (in `pipeline.py`)
- **Scheduler**: `DDIMScheduler`
- **Resolution**: 512x512 (the default in the training/inference config)

## Repository Structure

```text
CRS-Diff/
  pipeline.py
  modular_pipeline.py
  crs_core/
    autoencoder.py
    text_encoder.py
    local_adapter.py
    global_adapter.py
    metadata_embedding.py
    modules/
  model_index.json
  scheduler/
    scheduler_config.json
  unet/
  vae/
  text_encoder/
  local_adapter/
  global_content_adapter/
  global_text_adapter/
  metadata_encoder/
```
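
In a diffusers layout, `model_index.json` maps each component folder above to the `(library, class)` pair that `DiffusionPipeline.from_pretrained` will instantiate. A minimal sketch for inspecting a local snapshot (the helper name is ours, not part of the repository):

```python
import json
import pathlib

def list_components(repo_dir):
    """Return (name, library, class) triples declared in model_index.json."""
    index = json.loads((pathlib.Path(repo_dir) / "model_index.json").read_text())
    return [
        (name, value[0], value[1])
        for name, value in index.items()
        # Keys like "_class_name" hold pipeline-level metadata (plain strings),
        # not loadable components, so keep only the [library, class] pairs.
        if isinstance(value, list) and len(value) == 2
    ]
```

Keys beginning with `_` (e.g. `_class_name`) are pipeline-level metadata rather than loadable components, which is why the sketch filters on list-valued entries.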

## Usage

Install dependencies first:

```bash
pip install diffusers transformers torch torchvision omegaconf einops safetensors pytorch-lightning
```

Load the pipeline (local path or Hub repo), then run inference:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "/root/worksapce/models/BiliSakura/CRS-Diff",
    custom_pipeline="pipeline.py",
    trust_remote_code=True,
    model_path="/root/worksapce/models/BiliSakura/CRS-Diff",
)
pipe = pipe.to("cuda")

# Example placeholder controls; replace with real CRS controls.
b = 1
local_control = torch.zeros((b, 18, 512, 512), device="cuda", dtype=torch.float32)
global_control = torch.zeros((b, 1536), device="cuda", dtype=torch.float32)
metadata = torch.zeros((b, 7), device="cuda", dtype=torch.float32)

out = pipe(
    prompt=["a remote sensing image of an urban area"],
    negative_prompt=["blurry, distorted, overexposed"],
    local_control=local_control,
    global_control=global_control,
    metadata=metadata,
    num_inference_steps=50,
    guidance_scale=7.5,
    eta=0.0,
    strength=1.0,
    global_strength=1.0,
    output_type="pil",
)
image = out.images[0]
image.save("crs_diff_sample.png")
```
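
The placeholder tensors above are all zeros; real runs need actual condition maps. As an illustration only (the layout of the 18 `local_control` channels is an assumption here, not documented behavior; consult the CRS-Diff configs for the real channel order), a helper that scales a `(512, 512, 3)` uint8 map into `[0, 1]` and places it in the first three channels of a zero-initialized stack:

```python
import numpy as np

def pack_condition_map(rgb_uint8, num_channels=18, size=512):
    """Pack an (H, W, 3) uint8 condition map into a (1, C, H, W) float32 stack.

    Channels 0-2 hold the scaled map; the remaining channels stay zero.
    The channel assignment is illustrative, not the pipeline's documented layout.
    """
    if rgb_uint8.shape != (size, size, 3):
        raise ValueError(f"expected ({size}, {size}, 3), got {rgb_uint8.shape}")
    stack = np.zeros((num_channels, size, size), dtype=np.float32)
    # HWC uint8 -> CHW float32 in [0, 1]
    stack[:3] = rgb_uint8.astype(np.float32).transpose(2, 0, 1) / 255.0
    return stack[None]  # (1, 18, 512, 512), ready for torch.from_numpy(...)
```

The result can be moved to the GPU with `torch.from_numpy(packed).to("cuda")` and passed as `local_control`.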

## Notes

- This repository is packaged in a diffusers-compatible layout with a custom pipeline.
- The loading path follows the same placeholder-aware custom-pipeline pattern as HSIGene.
- Split component weights are provided in diffusers-style folders (`unet/`, `vae/`, adapters, and encoders).
- The monolithic `crs_model/last.ckpt` fallback is intentionally removed; this repo ships split components only.
- Legacy external source trees (`models/`, `ldm/`) are removed; runtime code lives in the lightweight `crs_core/` package.
- `CRSDiffPipeline` expects CRS-specific condition tensors (`local_control`, `global_control`, `metadata`).
- If you publish this repo to the Hugging Face Hub, keep `trust_remote_code=True` when loading it.
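
Since shape mismatches in the condition tensors otherwise surface only deep inside the UNet, a small check before calling the pipeline fails fast with a readable message. A hypothetical helper (not part of the repository), using the per-sample shapes from the usage example above:

```python
# Expected per-sample shapes, taken from the usage example above.
EXPECTED_SHAPES = {
    "local_control": (18, 512, 512),  # 18 condition channels at 512x512
    "global_control": (1536,),        # pooled global embedding
    "metadata": (7,),                 # 7 metadata scalars
}

def check_controls(batch_size, **tensors):
    """Raise ValueError on the first control tensor with an unexpected shape."""
    for name, per_sample in EXPECTED_SHAPES.items():
        got = tuple(tensors[name].shape)
        want = (batch_size,) + per_sample
        if got != want:
            raise ValueError(f"{name}: expected shape {want}, got {got}")
```

Call it as `check_controls(b, local_control=..., global_control=..., metadata=...)` right before invoking the pipeline.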

## Citation

```bibtex
@article{tang2024crs,
  title={{CRS-Diff}: Controllable remote sensing image generation with diffusion model},
  author={Tang, Datao and Cao, Xiangyong and Hou, Xingsong and Jiang, Zhongyuan and Liu, Junmin and Meng, Deyu},
  journal={IEEE Transactions on Geoscience and Remote Sensing},
  year={2024},
  publisher={IEEE}
}
```