---
library_name: diffusers
license: mit
pipeline_tag: image-to-image
tags:
- computed-tomography
- ct-reconstruction
- diffusion-model
- inverse-problems
- dm4ct
- sparse-view-ct
---

# Pixel Diffusion UNet - LoDoInd (DM4CT)

This repository contains the pretrained **pixel-space diffusion UNet** used in the benchmark study **DM4CT: Benchmarking Diffusion Models for CT Reconstruction (ICLR 2026)**.

- **Paper:** [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589)
- **ArXiv:** [https://arxiv.org/abs/2602.18589](https://arxiv.org/abs/2602.18589)
- **Project Page:** [https://dm4ct.github.io/DM4CT/](https://dm4ct.github.io/DM4CT/)
- **Codebase:** [https://github.com/DM4CT/DM4CT](https://github.com/DM4CT/DM4CT)

---

## 🔬 Model Overview

This model learns a **prior over CT reconstruction images** using a denoising diffusion probabilistic model (DDPM). It operates directly in **pixel space** (not latent space).

- **Architecture**: 2D UNet (Diffusers `UNet2DModel`)
- **Input resolution**: 512 × 512
- **Channels**: 1 (grayscale CT slice)
- **Training objective**: ε-prediction (standard DDPM formulation)
- **Noise schedule**: Linear beta schedule
- **Training dataset**: Industrial CT dataset (LoDoInd)
- **Intensity normalization**: Rescaled to (-1, 1)

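To make the noise schedule and training objective listed above concrete, here is a minimal NumPy sketch of the DDPM forward (noising) step under a linear beta schedule. The values `beta_start=1e-4`, `beta_end=0.02`, and `num_steps=1000` are the commonly used DDPM defaults and are assumptions here, not settings confirmed by this model card; the function names are illustrative only.

```python
import numpy as np

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced betas; defaults are the usual DDPM values (assumed)."""
    return np.linspace(beta_start, beta_end, num_steps, dtype=np.float64)

def add_noise(x0, t, alphas_cumprod, rng):
    """Forward process: x_t = sqrt(abar_t) * x0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(x0.shape)
    abar_t = alphas_cumprod[t]
    x_t = np.sqrt(abar_t) * x0 + np.sqrt(1.0 - abar_t) * eps
    return x_t, eps

betas = linear_beta_schedule()
alphas_cumprod = np.cumprod(1.0 - betas)

rng = np.random.default_rng(0)
# Random stand-in for a normalized 1 x 512 x 512 CT slice (not real data)
x0 = rng.uniform(-1.0, 1.0, size=(1, 512, 512))
x_t, eps = add_noise(x0, t=500, alphas_cumprod=alphas_cumprod, rng=rng)

# An epsilon-prediction network receives (x_t, t) and is trained so that its
# output matches eps, typically by minimizing the mean squared error.
```

With ε-prediction, the network never outputs the clean image directly; a clean estimate can be recovered from `x_t` and the predicted noise by inverting the formula in `add_noise`.
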
This model is intended to be combined with data-consistency correction for CT reconstruction tasks.

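As a rough illustration of what a data-consistency correction can look like, the sketch below applies gradient steps on a least-squares data term to an image estimate. It is not a specific method from the DM4CT benchmark: the dense random matrix `A` is a toy stand-in for a CT projection operator, and the function name and step-size choice are assumptions made for this example.

```python
import numpy as np

def data_consistency_step(x, A, y, step_size):
    """One gradient step on 0.5 * ||A x - y||^2 with respect to x."""
    residual = A @ x - y
    return x - step_size * (A.T @ residual)

rng = np.random.default_rng(0)
A = rng.standard_normal((32, 64))       # toy stand-in for a sparse-view projector
x_true = rng.uniform(-1.0, 1.0, 64)     # toy stand-in for a flattened CT slice
y = A @ x_true                          # noiseless toy measurements

x = np.zeros(64)                        # in practice: the current denoised estimate
step_size = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / (largest singular value squared)

before = np.linalg.norm(A @ x - y)
for _ in range(50):
    x = data_consistency_step(x, A, y, step_size)
after = np.linalg.norm(A @ x - y)
```

In a reconstruction loop, a step like this would typically be interleaved with the diffusion model's denoising updates, so that the sample stays plausible under the learned prior while agreeing with the measured projections.
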
---

## Dataset: LoDoInd

Source: [LoDoInd on Zenodo](https://zenodo.org/records/10391412)

Preprocessing steps:
- Train/test split
- Rescale reconstructed slices to (-1, 1)
- No geometry information is embedded in the model

The model learns an unconditional image prior over CT slices.

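The rescaling step can be sketched as a linear min-max map. Whether the original preprocessing normalizes per slice or with dataset-wide bounds is not stated here, so the optional `lo`/`hi` arguments and both function names below are assumptions for illustration.

```python
import numpy as np

def rescale_to_unit_range(slice_2d, lo=None, hi=None):
    """Map intensities linearly so that lo -> -1 and hi -> 1."""
    x = np.asarray(slice_2d, dtype=np.float32)
    lo = float(x.min()) if lo is None else float(lo)
    hi = float(x.max()) if hi is None else float(hi)
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def rescale_back(x_norm, lo, hi):
    """Inverse map from the normalized range back to [lo, hi]."""
    return (np.asarray(x_norm, dtype=np.float32) + 1.0) / 2.0 * (hi - lo) + lo

# Tiny synthetic example (not real CT data)
ct_slice = np.array([[0.0, 0.5], [1.0, 2.0]], dtype=np.float32)
normalized = rescale_to_unit_range(ct_slice)
restored = rescale_back(normalized, lo=0.0, hi=2.0)
```

Keeping the `lo` and `hi` values used during normalization makes it possible to map generated or reconstructed slices back to the original intensity scale.
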
---

## 🔧 Training Details

- **Optimizer:** AdamW
- **Learning rate:** 1e-4
- **Hardware:** NVIDIA A100 GPU
- **Training script:** [train_pixel.py](https://github.com/DM4CT/DM4CT/blob/main/train_pixel.py)

---

## Usage

```python
import torch
from diffusers import DDPMPipeline

# Load the pretrained pipeline (UNet + DDPM scheduler) from the Hub
pipeline = DDPMPipeline.from_pretrained("jiayangshi/lodoind_pixel_diffusion")

# Use a GPU when available; sampling on CPU works but is slow
device = "cuda" if torch.cuda.is_available() else "cpu"
pipeline.to(device)

# Draw one unconditional sample from the learned CT slice prior
image = pipeline().images[0]
image.save("generated_ct_slice.png")
```

---

## Citation

```bibtex
@inproceedings{shi2026dmct,
  title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
  author={Shi, Jiayang and Pelt, Dani{\"{e}}l M and Batenburg, K Joost},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=YE5scJekg5}
}
```