Add pipeline tag and improve model card
#1
by nielsr (HF Staff) - opened

README.md CHANGED
@@ -1,30 +1,30 @@
 ---
-license: mit
 library_name: diffusers
+license: mit
+pipeline_tag: image-to-image
 tags:
-
-
-
-
-
-
+- computed-tomography
+- ct-reconstruction
+- diffusion-model
+- inverse-problems
+- dm4ct
+- sparse-view-ct
 ---
 
 # Pixel Diffusion UNet – LoDoInd (DM4CT)
 
-This repository contains the pretrained **pixel-space diffusion UNet** used in the
-**DM4CT: Benchmarking Diffusion Models for CT Reconstruction (ICLR 2026)** benchmark.
+This repository contains the pretrained **pixel-space diffusion UNet** used in the benchmark study **DM4CT: Benchmarking Diffusion Models for CT Reconstruction (ICLR 2026)**.
 
-
-
-
+- **Paper:** [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589)
+- **ArXiv:** [https://arxiv.org/abs/2602.18589](https://arxiv.org/abs/2602.18589)
+- **Project Page:** [https://dm4ct.github.io/DM4CT/](https://dm4ct.github.io/DM4CT/)
+- **Codebase:** [https://github.com/DM4CT/DM4CT](https://github.com/DM4CT/DM4CT)
 
 ---
 
 ## 🔬 Model Overview
 
-This model learns a **prior over CT reconstruction images** using a denoising diffusion probabilistic model (DDPM).
-It operates directly in **pixel space** (not latent space).
+This model learns a **prior over CT reconstruction images** using a denoising diffusion probabilistic model (DDPM). It operates directly in **pixel space** (not latent space).
 
 - **Architecture**: 2D UNet (Diffusers `UNet2DModel`)
 - **Input resolution**: 512 × 512
@@ -34,14 +34,13 @@ It operates directly in **pixel space** (not latent space).
 - **Training dataset**: Industry CT dataset (LoDoInd)
 - **Intensity normalization**: Rescaled to (-1, 1)
 
-This model is intended to be combined with data-consistency correction for CT reconstruction.
+This model is intended to be combined with data-consistency correction for CT reconstruction tasks.
 
 ---
 
 ## 📊 Dataset: LoDoInd
 
-Source:
-https://www.aapm.org/grandchallenge/lowdosect/
+Source: [LoDoInd on Zenodo](https://zenodo.org/records/10391412)
 
 Preprocessing steps:
 - Train/test split
@@ -54,14 +53,10 @@ The model learns an unconditional image prior over CT slices.
 
 ## 🔧 Training Details
 
-- Optimizer
-- Learning rate
--
-- Training
-- Hardware: NVIDIA A100 GPU
-
-Training script:
-https://github.com/DM4CT/DM4CT/blob/main/train_pixel.py
+- **Optimizer:** AdamW
+- **Learning rate:** 1e-4
+- **Hardware:** NVIDIA A100 GPU
+- **Training script:** [train_pixel.py](https://github.com/DM4CT/DM4CT/blob/main/train_pixel.py)
 
 ---
 
@@ -69,7 +64,26 @@ https://github.com/DM4CT/DM4CT/blob/main/train_pixel.py
 
 ```python
 from diffusers import DDPMPipeline
+
+# Load the pipeline
 pipeline = DDPMPipeline.from_pretrained("jiayangshi/lodoind_pixel_diffusion")
-)
+pipeline.to("cuda")
+
+# Generate a CT slice prior
+image = pipeline().images[0]
+image.save("generated_ct_slice.png")
+```
+
+---
 
-
+## Citation
+
+```bibtex
+@inproceedings{shi2026dmct,
+  title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
+  author={Shi, Jiayang and Pelt, Dani{\"{e}}l M and Batenburg, K Joost},
+  booktitle={The Fourteenth International Conference on Learning Representations},
+  year={2026},
+  url={https://openreview.net/forum?id=YE5scJekg5}
+}
+```
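A note on the card's **Intensity normalization: Rescaled to (-1, 1)** step: the diff does not show the exact scheme, so the per-slice min/max linear rescaling below (and the helper name `rescale_to_unit_range`) is only an illustrative assumption, not the repository's actual preprocessing code.

```python
import numpy as np

def rescale_to_unit_range(slice_values: np.ndarray) -> np.ndarray:
    """Linearly rescale a CT slice to the (-1, 1) range expected by the
    pixel-space DDPM (min/max scaling assumed; the real scheme may differ)."""
    lo, hi = float(slice_values.min()), float(slice_values.max())
    return 2.0 * (slice_values - lo) / (hi - lo) - 1.0

# Toy 2x2 "slice" with arbitrary intensity values.
demo = np.array([[0.0, 500.0], [1000.0, 2000.0]])
out = rescale_to_unit_range(demo)
print(out.min(), out.max())  # → -1.0 1.0
```

A dataset-wide fixed window (rather than per-slice min/max) would also satisfy the card's description; which one the benchmark uses is only visible in the training script linked above.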