---
license: mit
---

# Conditional Diffusion Model for Medical Image Generation

This repository contains a conditional diffusion model trained to generate **3D medical CT scan images** based on segmentation masks.
The model uses a **U-Net architecture with score-based diffusion** for high-quality medical image synthesis.

---

## Real or Fake Image?

<p>
<img src="assets/real_fake.png" alt="Sample real vs fake medical CT" width="600"/>
</p>

---

## Training Dataset

The model was trained on **3,346 CT scan examples** with corresponding segmentation masks (80/20 train–validation split).

<p>
<img src="assets/dataset.png" alt="Sample dataset" width="600"/>
</p>

**Sources:**
1. [Kaggle Pancreas CT](https://www.kaggle.com/datasets/salihayesilyurt/pancreas-ct)
2. [Cancer Imaging Archive Pancreatic CT](https://nbia.cancerimagingarchive.net/nbia-search/)
3. [Annotated Medical Image Dataset for Segmentation Algorithms](https://drive.google.com/drive/folders/1HqEgzS8BV2c7xYNrZdEAnrHk7osJJ--2)
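
With 3,346 examples, an 80/20 split comes to roughly 2,676 training and 670 validation examples. The sketch below shows one way to produce such a split. The helper name `split_indices`, the seeded shuffle, and the floor rounding are assumptions for illustration, not details taken from the training code.

```python
import random


def split_indices(n, train_fraction=0.8, seed=0):
    """Shuffle the indices 0..n-1 and split them into train/validation lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)   # seeded so the split is reproducible
    n_train = int(n * train_fraction)      # floor rounding (an assumption)
    return indices[:n_train], indices[n_train:]


train_idx, val_idx = split_indices(3346)
print(len(train_idx), len(val_idx))  # 2676 670
```

Splitting by index keeps each CT scan paired with its segmentation mask, since both can be looked up with the same index.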

---

## <a href="https://archietan.com/synthetic-ct-demo" style="color:blue; text-decoration:underline;">Live Interactive Demo</a>

<p>
<a href="https://archietan.com/synthetic-ct-demo">
<img src="assets/livedemo.png" alt="Sample input and output" width="600"/>
</a>
</p>

---

## Model Architecture

- **Base Model**: U-Net with 5-level encoder–decoder
- **Input**: 4-channel 256×256 CT scan images
- **Conditioning**: Segmentation masks (4-channel 256×256)
- **Output**: 4-channel 256×256 generated images
- **Sampling**: Euler–Maruyama sampler (250 steps)
- **Training**: Score matching loss with conditional generation
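
To make the sampling step concrete, here is a toy one-dimensional sketch of a reverse-time Euler–Maruyama loop with 250 steps. It is not the repository's `Euler_Maruyama_sampler`: the names `em_step` and `sample_toy` are hypothetical, the SDE form `dx = lam**t dw` is an assumption, and the U-Net is replaced by the exact score of a Gaussian toy distribution so the code runs without a trained model.

```python
import math
import random


def em_step(x, score, g, dt, z):
    """One reverse-time Euler-Maruyama update for a scalar state.

    Drift term: g^2 * score * dt. Noise term: g * sqrt(dt) * z.
    """
    return x + (g ** 2) * score * dt + g * math.sqrt(dt) * z


def sample_toy(num_steps=250, lam=25.0, data_std=1.0, eps=1e-3, seed=0):
    """Draw one sample, approximately from N(0, data_std^2), by stepping t from 1 to eps."""
    rng = random.Random(seed)

    def std(t):
        # Perturbation-kernel std for the assumed SDE dx = lam**t dw
        return math.sqrt((lam ** (2.0 * t) - 1.0) / (2.0 * math.log(lam)))

    # Start from the Gaussian marginal at t = 1
    x = rng.gauss(0.0, math.sqrt(data_std ** 2 + std(1.0) ** 2))
    dt = (1.0 - eps) / num_steps
    t = 1.0
    for _ in range(num_steps):
        g = lam ** t
        # Exact score of N(0, data_std^2 + std(t)^2); a trained network
        # would supply this value in the real model
        score = -x / (data_std ** 2 + std(t) ** 2)
        x = em_step(x, score, g, dt, rng.gauss(0.0, 1.0))
        t -= dt
    return x


print(sample_toy(seed=0))
```

In the real model the state is a 4-channel 256×256 tensor and the score comes from the U-Net conditioned on the segmentation mask, but each update has the same shape as `em_step`.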

---

## Model Details

- **Training Data**: 3,346 CT scan examples
- **Lambda Parameter**: 25.0 (diffusion coefficient)
- **Embedding Dimension**: 256
- **Channels**: [32, 64, 128, 256, 512]
- **Activation**: SiLU (Swish)
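
The Lambda parameter sets how quickly noise grows over diffusion time. The sketch below shows scalar versions of the two functions that typically accompany it, assuming the variance-exploding SDE `dx = Lambda**t dw` that is commonly paired with the names `marginal_prob_std` and `diffusion_coeff`. The definitions in this repository's `model.py` may differ (for example, they take a `device` argument and operate on tensors).

```python
import math

LAMBDA = 25.0  # value listed under Model Details


def diffusion_coeff(t, lam=LAMBDA):
    """Diffusion coefficient g(t) = lam**t of the assumed SDE dx = lam**t dw."""
    return lam ** t


def marginal_prob_std(t, lam=LAMBDA):
    """Standard deviation of x(t) given x(0) under the same SDE."""
    return math.sqrt((lam ** (2.0 * t) - 1.0) / (2.0 * math.log(lam)))


for t in (0.0, 0.5, 1.0):
    print(t, diffusion_coeff(t), marginal_prob_std(t))
```

Under these assumptions the noise level starts at zero for `t = 0` and reaches a standard deviation of about 9.85 at `t = 1`, which is large relative to normalized image intensities.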

---

## Usage

This model can be used to **add more diversity to your CT-scan dataset**, especially when:
- You have a **limited dataset size** (e.g., only a few hundred scans).
- You want to **balance underrepresented anatomical variations** or rare conditions.
- You need **synthetic augmentation** for training deep learning models in segmentation, detection, or classification.

**Example Applications**
- Generate training samples from segmentation masks to **reduce overfitting**.
- Create synthetic CT images with controlled variations to **test robustness**.
- Improve representation of minority cases to **reduce bias in medical AI**.

### Using the Hugging Face API

```python
import torch
from huggingface_hub import hf_hub_download

# transformers has no AutoModelForImageGeneration class, so fetch the weights directly.
# Assumption: the Hub repo hosts the same "ckpt_3D_v2.pth" file used under Local Usage.
ckpt_path = hf_hub_download(
    repo_id="your-username/your-model-name",  # replace with the actual repo id
    filename="ckpt_3D_v2.pth",
)
state_dict = torch.load(ckpt_path, map_location="cpu")
# Load state_dict into the UNet and sample as shown under Local Usage.
```

### Local Usage

```python
import torch
from model import UNet, marginal_prob_std, diffusion_coeff, Euler_Maruyama_sampler

# Use the GPU when available; model, checkpoint, and mask must share one device
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load model
Lambda = 25.0
marginal_prob_std_fn = lambda t: marginal_prob_std(t, Lambda=Lambda, device=device)
diffusion_coeff_fn = lambda t: diffusion_coeff(t, Lambda=Lambda, device=device)
score_model = UNet(marginal_prob_std=marginal_prob_std_fn).to(device)
score_model.load_state_dict(torch.load("ckpt_3D_v2.pth", map_location=device))
score_model.eval()

# Generate sample
# Placeholder tensor; replace with your own 4-channel 256×256 segmentation mask
conditioning_mask = torch.randn(1, 4, 256, 256, device=device)
samples = Euler_Maruyama_sampler(
    score_model,
    marginal_prob_std_fn,
    diffusion_coeff_fn,
    batch_size=1,
    x_shape=(4, 256, 256),
    num_steps=250,
    device=device,
    y=conditioning_mask
)
```

## Training

The model was trained for 5,000 epochs with:
- Learning rate: 2e-4 (with decay)
- Batch size: 1
- Optimizer: Adam
- Loss: Score matching loss
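
As an illustration of the score matching loss, here is a scalar sketch of the denoising score matching objective in its usual weighted form: perturb a clean sample with Gaussian noise `z` at a random time `t`, then penalize `(score * std + z)**2`. The function names, the SDE form, and the uniform sampling of `t` are assumptions. The actual training code works on image tensors and passes the segmentation mask to the U-Net as conditioning.

```python
import math
import random


def marginal_prob_std(t, lam=25.0):
    """Noise std at time t for the assumed SDE dx = lam**t dw."""
    return math.sqrt((lam ** (2.0 * t) - 1.0) / (2.0 * math.log(lam)))


def dsm_loss(score_fn, data, lam=25.0, eps=1e-3, seed=0):
    """Monte Carlo estimate of the denoising score matching loss for scalar data."""
    rng = random.Random(seed)
    total = 0.0
    for x0 in data:
        t = rng.uniform(eps, 1.0)               # random diffusion time
        z = rng.gauss(0.0, 1.0)                 # standard normal noise
        std = marginal_prob_std(t, lam)
        x_t = x0 + std * z                      # perturbed sample
        residual = score_fn(x_t, t) * std + z   # zero when score == -z / std
        total += residual ** 2
    return total / len(data)


# Toy check: for data concentrated at x0 = 3.0 the exact score is known
exact_score = lambda x, t: -(x - 3.0) / marginal_prob_std(t) ** 2
print(dsm_loss(exact_score, [3.0] * 100))
```

With the exact score the loss is zero up to floating-point error, while a score function that always returns 0 gives a loss near 1, which is the gap training tries to close.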

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{conditional_diffusion_medical,
  title={Conditional Diffusion Model for Medical Image Generation},
  author={Archie Tan and Scott Spurlock},
  year={2025},
  url={https://huggingface.co/tan200224}
}
```

## Publications

- **Archie Tan, Scott Spurlock.**
*Learning to generate realistic medical images to improve pancreatic cancer segmentation.*
Accepted for presentation at the [39th Annual Consortium for Computing Sciences in Colleges: Southeastern Conference (CCSC-SE 2025)](http://ccscse.org/), Mercer University, Macon, GA, November 7–8, 2025.
To appear in the *Journal of the Consortium for Computing Sciences in Colleges*.
[Conference Site](https://www.conftool.org/ccsc-se/) | [Formatting Guidelines](https://lubaochuan.github.io/ccsc-editor/authors.html)

## License

This project is open-source under the MIT License.

Copyright (c) 2025 Archie Tan, Scott Spurlock

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

## Contact

For questions or issues, please open an issue on this repository.