---
library_name: diffusers
license: mit
pipeline_tag: image-to-image
tags:
- computed-tomography
- ct-reconstruction
- diffusion-model
- inverse-problems
- dm4ct
- sparse-view-ct
---

# Pixel Diffusion UNet – Real-world Synchrotron Dataset (DM4CT)

This repository contains the pretrained **pixel-space diffusion UNet** presented in the paper [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589).

🔗 **Project Page:** [https://dm4ct.github.io/DM4CT/](https://dm4ct.github.io/DM4CT/)  
🔗 **Arxiv:** [https://arxiv.org/abs/2602.18589](https://arxiv.org/abs/2602.18589)  
🔗 **Codebase:** [https://github.com/DM4CT/DM4CT](https://github.com/DM4CT/DM4CT)  

---

## 🔬 Model Overview

This model learns a **prior over CT reconstruction images** using a denoising diffusion probabilistic model (DDPM).  
It operates directly in **pixel space** (not latent space).

- **Architecture**: 2D UNet (Diffusers `UNet2DModel`)
- **Input resolution**: 768 × 768
- **Channels**: 1 (grayscale CT slice)
- **Training objective**: ε-prediction (standard DDPM formulation)
- **Noise schedule**: Linear beta schedule
- **Training dataset**: Real-world Synchrotron Dataset of rocks 
- **Intensity normalization**: Rescaled to (-1, 1)
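
The training objective and noise schedule listed above can be illustrated with a minimal NumPy sketch. The number of timesteps (1000) and the beta range (1e-4 to 0.02) are assumptions taken from common DDPM defaults, since this card only states that the schedule is linear; `add_noise` and `epsilon_loss` are hypothetical helper names, not part of the DM4CT codebase.

```python
import numpy as np

# Assumed values: the card only says "linear beta schedule".
NUM_TRAIN_TIMESTEPS = 1000
betas = np.linspace(1e-4, 0.02, NUM_TRAIN_TIMESTEPS)
alphas_cumprod = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)


def add_noise(x0, noise, t):
    """Forward diffusion q(x_t | x_0): blend the clean image with Gaussian noise."""
    a_bar = alphas_cumprod[t]
    return np.sqrt(a_bar) * x0 + np.sqrt(1.0 - a_bar) * noise


def epsilon_loss(predicted_noise, noise):
    """Epsilon-prediction objective: MSE between predicted and true noise."""
    return float(np.mean((predicted_noise - noise) ** 2))


rng = np.random.default_rng(0)
# Random stand-in for a normalized single-channel 768 x 768 CT slice.
x0 = rng.uniform(-1.0, 1.0, size=(1, 768, 768))
noise = rng.standard_normal(x0.shape)
t = 500
x_t = add_noise(x0, noise, t)

# A perfect noise predictor would return `noise` itself, giving zero loss.
perfect_loss = epsilon_loss(noise, noise)
```

During training, the UNet would take `x_t` and `t` as input and its output would replace the first argument of `epsilon_loss`.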

This model is intended to be combined with data-consistency correction for CT reconstruction.
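
The card does not say which data-consistency scheme is meant, so the following is only a generic sketch of the idea: one gradient step on the least-squares data term, which pulls an image estimate toward agreement with the measurements. The matrix `A` is a hypothetical random stand-in for a CT forward projector, and `data_consistency_step` is an illustrative name rather than a function from the DM4CT codebase.

```python
import numpy as np


def data_consistency_step(x, y, A, step_size):
    """One gradient step on 0.5 * ||A x - y||^2 with respect to x."""
    residual = A @ x - y
    return x - step_size * (A.T @ residual)


rng = np.random.default_rng(0)
n_pixels, n_measurements = 64, 32  # toy sizes; fewer measurements than unknowns
# Hypothetical linear operator standing in for a real CT projector.
A = rng.standard_normal((n_measurements, n_pixels)) / np.sqrt(n_pixels)
x_true = rng.uniform(-1.0, 1.0, n_pixels)
y = A @ x_true  # simulated noise-free measurements

x = np.zeros(n_pixels)  # stand-in for a denoised estimate from the diffusion prior
step_size = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / (largest singular value)^2
residual_before = np.linalg.norm(A @ x - y)
x = data_consistency_step(x, y, A, step_size)
residual_after = np.linalg.norm(A @ x - y)
```

In a reconstruction loop, a step like this would typically be interleaved with the reverse diffusion updates, so that the prior and the measurements both shape the result.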

---

## 📊 Dataset: Real-world Synchrotron Dataset

Source: [Zenodo](https://zenodo.org/records/15420527)

Preprocessing steps:
- Train/test split
- Rescale reconstructed slices to (-1, 1)
- No geometry information is embedded in the model
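
As an illustration of the rescaling step, here is a minimal sketch assuming per-slice min-max normalization. The card does not state whether the minimum and maximum are computed per slice, per volume, or over the whole dataset, so treat that choice and the helper name `rescale_to_unit_range` as assumptions.

```python
import numpy as np


def rescale_to_unit_range(slice_2d):
    """Linearly map a 2D slice so its minimum becomes -1 and its maximum +1."""
    lo = float(slice_2d.min())
    hi = float(slice_2d.max())
    if hi == lo:
        # Constant slice: no dynamic range, so map everything to 0.
        return np.zeros_like(slice_2d, dtype=np.float32)
    scaled = (slice_2d - lo) / (hi - lo)  # now in [0, 1]
    return (2.0 * scaled - 1.0).astype(np.float32)  # now in [-1, 1]


rng = np.random.default_rng(0)
raw_slice = rng.uniform(0.0, 3.5, size=(768, 768))  # stand-in for a reconstructed slice
normalized = rescale_to_unit_range(raw_slice)
```

Inputs passed to the model at inference time should be normalized the same way as the training data.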

The model learns an unconditional image prior over CT slices.

---

## 🧠 Training Details

- **Optimizer**: AdamW
- **Learning rate**: 1e-4
- **Hardware**: NVIDIA A100 GPU
- **Training script**: [train_pixel.py](https://github.com/DM4CT/DM4CT/blob/main/train_pixel.py)

---

## 🚀 Usage

You can use this model with the `diffusers` library as follows:

```python
from diffusers import DDPMPipeline

# Load the pipeline
pipeline = DDPMPipeline.from_pretrained("jiayangshi/synchrotron_pixel_diffusion")

# Access the UNet model
model = pipeline.unet
model.eval()
```

---

## Citation

```bibtex
@inproceedings{
shi2026dmct,
title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
author={Shi, Jiayang and Pelt, Dani{\"e}l M and Batenburg, K Joost},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=YE5scJekg5}
}
```