# Latent Diffusion Model – LoDoChallenge (DM4CT)
This repository contains the pretrained latent-space diffusion model used in the
DM4CT: Benchmarking Diffusion Models for CT Reconstruction (ICLR 2026) benchmark.
- 📄 Paper: https://openreview.net/forum?id=YE5scJekg5
- 📄 arXiv: https://arxiv.org/abs/2602.18589
- 💻 Codebase: https://github.com/DM4CT/DM4CT
## 🔬 Model Overview
This model learns a prior over CT reconstruction images in a compressed latent space using a denoising diffusion probabilistic model (DDPM).
Unlike its pixel-space counterpart, diffusion here is performed in the latent space of a pretrained autoencoder.
- Architecture:
  - VQ-VAE (image encoder/decoder)
  - 2D UNet operating in latent space
- Input resolution (image space): 512 × 512
- Latent resolution: (insert latent size, e.g., 64 × 64)
- Channels: 1 (grayscale CT slice)
- Training objective: ε-prediction (standard DDPM formulation)
- Noise schedule: linear beta schedule
- Training dataset: Low Dose Grand Challenge (LoDoChallenge)
- Intensity normalization: rescaled to (-1, 1)
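The ε-prediction objective with a linear beta schedule can be sketched as follows. This is a minimal NumPy illustration of the forward noising process, not the repository's training code; the latent shape used here is an assumption.

```python
# Sketch of the DDPM forward process and epsilon-prediction target.
# The UNet would be trained to predict `eps` from (z_t, t).
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear beta schedule (assumed endpoints)
alphas_cumprod = np.cumprod(1.0 - betas)  # cumulative product \bar{alpha}_t

def add_noise(z0, t, rng):
    """Forward process: z_t = sqrt(abar_t) * z0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(z0.shape)
    zt = np.sqrt(alphas_cumprod[t]) * z0 + np.sqrt(1.0 - alphas_cumprod[t]) * eps
    return zt, eps

rng = np.random.default_rng(0)
z0 = rng.standard_normal((1, 3, 64, 64))  # hypothetical latent (channels/size assumed)
zt, eps = add_noise(z0, t=500, rng=rng)
```

At small `t` the noised latent stays close to the data; by `t = T - 1` it is nearly pure Gaussian noise, which is what the sampler starts from.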
The diffusion model operates purely in latent space and relies on the autoencoder for encoding and decoding.
This model is intended to be combined with data-consistency correction for CT reconstruction.
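One common form of data-consistency correction is a gradient step that pulls the current estimate toward agreement with the measured sinogram. The sketch below uses a stand-in dense matrix `A` as the forward operator; a real CT pipeline would use a Radon transform with the scan geometry, and this is not the DM4CT implementation.

```python
# Illustrative data-consistency step: one gradient step on 0.5 * ||A x - y||^2,
# where A is a hypothetical linear forward operator and y the measured data.
import numpy as np

def data_consistency_step(x, y, A, step_size=0.01):
    """Move x toward consistency with measurements y under operator A."""
    residual = A @ x - y
    return x - step_size * (A.T @ residual)

rng = np.random.default_rng(0)
A = rng.standard_normal((32, 16))   # stand-in forward operator
x_true = rng.standard_normal(16)
y = A @ x_true                      # simulated noiseless measurements

x0 = rng.standard_normal(16)        # initial (e.g., denoised) estimate
x = x0.copy()
for _ in range(500):
    x = data_consistency_step(x, y, A)
```

In a diffusion-based reconstruction loop, such a step is typically interleaved with the denoising updates so the prior and the measurements are balanced at every timestep.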
## 📊 Dataset: Low Dose Grand Challenge
Source: https://www.aapm.org/grandchallenge/lowdosect/
Preprocessing steps:
- Train/test split
- Rescale reconstructed slices to (-1, 1)
- No geometry information is embedded in the model
The model learns an unconditional latent prior over CT slices.
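The (-1, 1) rescaling step can be sketched as below. This is a per-slice min/max version for illustration; the actual preprocessing script may instead use fixed HU windows shared across the dataset.

```python
# Hypothetical intensity normalization: map a CT slice to the (-1, 1) range
# expected by the diffusion model (per-slice min/max assumed here).
import numpy as np

def rescale_to_unit_range(slice_img):
    """Linearly map slice intensities to [-1, 1]."""
    lo, hi = slice_img.min(), slice_img.max()
    return 2.0 * (slice_img - lo) / (hi - lo) - 1.0

img = np.random.default_rng(0).uniform(0.0, 3000.0, size=(512, 512))
out = rescale_to_unit_range(img)
```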
## 🔧 Training Details
- Optimizer: AdamW
- Learning rate: 1e-4
- Batch size: (insert your batch size)
- Training steps: (insert number of steps)
- Hardware: NVIDIA A100 GPU
Training scripts:
- Latent diffusion: https://github.com/DM4CT/DM4CT/blob/main/train_latent.py
- Autoencoder training: (insert if separate)
## 🚀 Usage
```python
from diffusers import LDMPipeline

pipeline = LDMPipeline.from_pretrained(
    "jiayangshi/lodochallenge_latent_diffusion"
)
pipeline.to("cuda")
```