---
license: apache-2.0
library_name: diffusers
pipeline_tag: unconditional-image-generation
tags:
- zoomldm
- cdm
- dit
- histopathology
- brca
- custom-pipeline
widget:
- src: demo_images/input.jpeg
  prompt: Sample BRCA conditioning embedding (magnification class 0)
  output:
    url: demo_images/output.jpeg
---

> [!WARNING] The checkpoint conversion has not been fully validated. If the pipeline fails to load or produces unexpected output, please contact me at bili_sakura@zju.edu.cn.

# BiliSakura/ZoomLDM-CDM-brca

A **CDM (DiT)** checkpoint for BRCA, wrapped as a Diffusers-style pipeline and converted from ZoomLDM `cdm_dit` training outputs.
|
## Model Description

- **Architecture:** DiT-B-style conditioning diffusion model (CDM)
- **Domain:** BRCA conditioning space used by ZoomLDM
- **Output:** conditioning tokens/embeddings of shape `(B, 512, 65)`
- **Format:** custom diffusers pipeline (`pipeline.py`)
|
## Intended Use

Use this model to sample BRCA conditioning embeddings that can be consumed by downstream ZoomLDM workflows.
|
## Out-of-Scope Use

- Not a complete pixel-space generator by itself.
- Not intended for clinical or diagnostic use.
- Not validated for non-BRCA domains without adaptation.
|
## Files

- `pipeline.py`: custom `DiffusionPipeline` implementation (`CDMDiTPipeline`)
- `model_index.json`: diffusers metadata
- `cdm/`: active model weights/config used by pipeline
- `scheduler/`: DDIM scheduler config
- `model_raw.safetensors`: non-EMA training weights (optional)
- `optimizer.pt`: optimizer state (optional)
- `config.json`: conversion metadata

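
Since the conversion has not been fully validated, it may help to confirm that a local copy of the repository contains the entries listed above before loading it. The following is a minimal sketch, not part of the repository: `REQUIRED_ENTRIES`, `OPTIONAL_ENTRIES`, and `missing_entries` are hypothetical names, and treating every entry not marked "(optional)" as required is an assumption.

```python
from pathlib import Path

# Entry names are taken from the file list above. Treating everything that is
# not marked "(optional)" as required is an assumption of this sketch.
REQUIRED_ENTRIES = [
    "pipeline.py",
    "model_index.json",
    "cdm",
    "scheduler",
    "config.json",
]
OPTIONAL_ENTRIES = ["model_raw.safetensors", "optimizer.pt"]


def missing_entries(snapshot_dir):
    """Return the required entries that are absent from a local snapshot."""
    root = Path(snapshot_dir)
    return [name for name in REQUIRED_ENTRIES if not (root / name).exists()]
```

Calling `missing_entries("/path/to/local/copy")` returns an empty list when all required entries are present. The path here is a placeholder; a local copy could be obtained, for example, with `huggingface_hub.snapshot_download`.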
|
## Usage

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "BiliSakura/ZoomLDM-CDM-brca",
    custom_pipeline="pipeline.py",
    trust_remote_code=True,
).to("cuda")

out = pipe(
    batch_size=2,
    magnification=torch.tensor([0, 0], device="cuda"),  # class labels 0..7
    num_inference_steps=50,
    guidance_scale=1.0,
)

samples = out.samples  # (B, 512, 65)
```
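
Because the conversion has not been fully validated, a quick shape check on the sampled output can catch obvious problems early. The sketch below is an illustration only: `check_sample_shape` is a hypothetical helper, and it assumes only that the returned object exposes a `.shape` attribute, as tensors do.

```python
def check_sample_shape(samples, batch_size, expected_tail=(512, 65)):
    """Raise ValueError unless samples.shape == (batch_size, 512, 65)."""
    shape = tuple(samples.shape)
    expected = (batch_size, *expected_tail)
    if shape != expected:
        raise ValueError(f"expected shape {expected}, got {shape}")
    return shape
```

With the usage example above, this would be called as `check_sample_shape(out.samples, batch_size=2)`.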
|
## Limitations

- Produces conditioning embeddings, not final images.
- Requires correct class/magnification label conventions.
- Inherits data biases and quality limits from the original training data.

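
To guard against label mistakes, the magnification labels can be checked before they are passed to the pipeline. This is a minimal sketch under stated assumptions: the range 0..7 comes from the comment in the usage example, the requirement of one label per batch element is inferred from that example rather than documented, and `validate_magnification_labels` is a hypothetical helper. Which physical magnification each class index corresponds to is not specified here.

```python
NUM_MAGNIFICATION_CLASSES = 8  # class labels 0..7, per the usage example


def validate_magnification_labels(labels, batch_size):
    """Return labels as a list of ints, or raise ValueError if they look wrong."""
    labels = [int(x) for x in labels]
    # Assumption: the pipeline expects exactly one label per batch element.
    if len(labels) != batch_size:
        raise ValueError(f"expected {batch_size} labels, got {len(labels)}")
    out_of_range = [x for x in labels if not 0 <= x < NUM_MAGNIFICATION_CLASSES]
    if out_of_range:
        raise ValueError(f"labels outside 0..7: {out_of_range}")
    return labels
```

The validated list can then be wrapped in `torch.tensor(...)` as in the usage example.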
|
## Citation

```bibtex
@InProceedings{Yellapragada_2025_CVPR,
    author    = {Yellapragada, Srikar and Graikos, Alexandros and Triaridis, Kostas and Prasanna, Prateek and Gupta, Rajarsi and Saltz, Joel and Samaras, Dimitris},
    title     = {ZoomLDM: Latent Diffusion Model for Multi-scale Image Generation},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {23453-23463}
}

@inproceedings{Peebles2023DiT,
    title     = {Scalable Diffusion Models with Transformers},
    author    = {Peebles, William and Xie, Saining},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    year      = {2023}
}
```
|