Note: the checkpoint conversion has not been fully validated. If you encounter pipeline loading failures or undesired output, please contact me at bili_sakura@zju.edu.cn.
Diffusers-format BRCA variant of ZoomLDM with a bundled custom pipeline and local ldm modules.
Model summary:
- Components: UNet + VAE + conditioning encoder
- Magnification levels: 0..4
- Loading: DiffusionPipeline.from_pretrained(...)

Use this model for conditional multi-scale BRCA patch generation when you have compatible pre-extracted SSL features.

Repository contents:
- unet/, vae/, conditioning_encoder/, scheduler/
- model_index.json
- pipeline_zoomldm.py
- ldm/ (bundled dependency modules)

Usage:
import torch
from diffusers import DiffusionPipeline
pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/ZoomLDM-brca",
    custom_pipeline="pipeline_zoomldm.py",
    trust_remote_code=True,
).to("cuda")

# ssl_feat_tensor is not defined here: supply your own pre-extracted SSL features.
out = pipe(
    ssl_features=ssl_feat_tensor.to("cuda"),  # BRCA UNI-style SSL embeddings
    magnification=torch.tensor([0]).to("cuda"),  # 0..4
    num_inference_steps=50,
    guidance_scale=2.0,
)
images = out.images
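
The call above passes magnification as an integer level in the range 0..4. This card does not say how the pipeline handles values outside that range, so a caller-side guard can make mistakes visible before any GPU work starts. The following is a minimal sketch; the helper name `check_magnification` is hypothetical and is not part of this repo or of diffusers.

```python
from typing import Iterable, List

# Range stated in this card: magnification levels 0..4.
MIN_MAGNIFICATION = 0
MAX_MAGNIFICATION = 4


def check_magnification(levels: Iterable[int]) -> List[int]:
    """Return the levels as a list, raising ValueError on any value outside 0..4.

    Hypothetical caller-side helper; the pipeline itself is not modified.
    """
    checked = []
    for level in levels:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(level, bool) or not isinstance(level, int):
            raise ValueError(f"magnification must be an int, got {level!r}")
        if not MIN_MAGNIFICATION <= level <= MAX_MAGNIFICATION:
            raise ValueError(
                f"magnification {level} is outside "
                f"{MIN_MAGNIFICATION}..{MAX_MAGNIFICATION}"
            )
        checked.append(level)
    return checked
```

The validated list can then be wrapped the same way as in the call above, for example `torch.tensor(check_magnification([0])).to("cuda")`.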
This repo includes run_demo_inference.py, which uses only local repo assets:
- demo_images/input.jpeg
- demo_data/0_ssl_feat.npy
- magnification 0

Run:
python run_demo_inference.py
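
Because the demo script reads only local assets, a missing file is a likely cause of failure in a partial download. The sketch below checks for the two asset paths named in this card before the demo is run. The function name `missing_demo_assets` is hypothetical, and the check assumes the paths are relative to the repository root; it does not inspect file contents.

```python
from pathlib import Path
from typing import List

# Asset paths named in this card, assumed relative to the repository root.
DEMO_ASSETS = ("demo_images/input.jpeg", "demo_data/0_ssl_feat.npy")


def missing_demo_assets(repo_root: str = ".") -> List[str]:
    """Return the demo asset paths that are not regular files under repo_root."""
    root = Path(repo_root)
    return [rel for rel in DEMO_ASSETS if not (root / rel).is_file()]
```

Calling `missing_demo_assets()` from the repository root returns an empty list when both files are present; any returned path should be re-downloaded before running the demo.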
@InProceedings{Yellapragada_2025_CVPR,
author = {Yellapragada, Srikar and Graikos, Alexandros and Triaridis, Kostas and Prasanna, Prateek and Gupta, Rajarsi and Saltz, Joel and Samaras, Dimitris},
title = {ZoomLDM: Latent Diffusion Model for Multi-scale Image Generation},
booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
month = {June},
year = {2025},
pages = {23453-23463}
}