# Text-Conditioned Latent Diffusion for Contrast-Enhanced CT Synthesis
**Model Name**: `mlii0117/sd1.5_MPECT`
**Model Type**: Fine-tuned `Stable Diffusion v1.5` for medical image-to-image translation
**Paper**: _Text-Conditioned Latent Diffusion Model for Synthesis of Contrast-Enhanced CT from Non-Contrast CT_
**Conference**: AAPM 2025 (Oral)
**Authors**: Mingjie Li, Yizheng Chen, Lei Xing, Michael F. Gensheimer
**Affiliation**: Department of Radiation Oncology - Medical Physics Division, Stanford University
---
## Model Description
This model is a fine-tuned version of **Stable Diffusion v1.5**, specialized for converting **non-contrast CT images** into **contrast-enhanced CT images**, guided by **textual phase prompts** (e.g., *venous phase*, *arterial phase*). It uses the `InstructPix2Pix` framework for prompt-conditioned generation, giving control over contrast timing without requiring explicit paired data.
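
For orientation, InstructPix2Pix-style pipelines combine three noise predictions at each denoising step using two guidance scales, one for the text prompt and one for the input image. The sketch below illustrates that combination with NumPy arrays standing in for the U-Net outputs. The function name `combine_guidance` and the toy inputs are illustrative assumptions, not part of this model's code.

```python
import numpy as np


def combine_guidance(eps_uncond, eps_image, eps_full,
                     guidance_scale, image_guidance_scale):
    """Combine three noise predictions as in InstructPix2Pix guidance.

    eps_uncond: prediction with neither image nor text conditioning
    eps_image:  prediction with image conditioning only
    eps_full:   prediction with both image and text conditioning
    """
    return (
        eps_uncond
        + image_guidance_scale * (eps_image - eps_uncond)
        + guidance_scale * (eps_full - eps_image)
    )


# Toy stand-ins for U-Net outputs (hypothetical values, latent-shaped)
rng = np.random.default_rng(0)
eps_uncond, eps_image, eps_full = rng.normal(size=(3, 4, 64, 64))

# With both scales at 1 the result equals the fully conditioned prediction
same = combine_guidance(eps_uncond, eps_image, eps_full, 1.0, 1.0)

# Scales matching the pipeline example in this card
guided = combine_guidance(eps_uncond, eps_image, eps_full, 10.0, 1.5)
```

A larger `guidance_scale` pushes the output toward the text prompt, while a larger `image_guidance_scale` keeps it closer to the input slice.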
---
## Key Features
- **Text-guided control** over contrast phase (arterial vs. venous)
- Processes **2D CT slices** in image format (converted from DICOM)
- Focused on **clinical realism and anatomical fidelity**
- Reconstructs full 3D volume with NIfTI output support
- Evaluated and presented as **Oral at AAPM 2025**
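
The pipeline example further down writes the NIfTI file with an identity affine, so voxel spacing is not recorded. As a hedged sketch of one alternative, a diagonal affine can be built from the DICOM `PixelSpacing` and the spacing between slices. The function name and the example spacings are assumptions, and patient orientation and origin are deliberately ignored here.

```python
import numpy as np


def spacing_affine(slice_spacing, row_spacing, col_spacing):
    """Diagonal affine for a volume stacked as (slice, row, column).

    Encodes voxel size in millimetres only; it does not encode patient
    orientation or position.
    """
    return np.diag([slice_spacing, row_spacing, col_spacing, 1.0])


# Hypothetical spacings: 2.5 mm between slices, 0.8 mm in-plane
affine = spacing_affine(2.5, 0.8, 0.8)

# Voxel index (slice=10, row=100, col=200) mapped to millimetre offsets
offset_mm = affine @ np.array([10, 100, 200, 1.0])
```

Passing such a matrix to `nib.Nifti1Image(volume, affine)` in place of `np.eye(4)` would let viewers display the volume with correct proportions.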
---
## Usage
### Requirements
```bash
pip install diffusers==0.25.0 torch transformers numpy nibabel pydicom tqdm pillow
```
### Load the Model
```python
from diffusers import StableDiffusionInstructPix2PixPipeline
import torch
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
"mlii0117/sd1.5_MPECT", torch_dtype=torch.float16
).to("cuda")
generator = torch.Generator("cuda").manual_seed(0)
```
### Example Prompts
- **Arterial Phase**
```
Convert this non-contrast CT slice to mimic an arterial-phase contrast-enhanced CT.
Brighten and enhance the aorta, major arteries, and adjacent organ boundaries to emphasize arterial flow,
focusing on clarity and contrast in these areas while maintaining other features unchanged.
```
- **Venous Phase**
```
Convert this non-contrast CT slice to mimic a venous-phase contrast-enhanced CT.
Brighten and enhance the veins, especially the portal and hepatic veins,
and emphasize organ boundaries to mimic venous flow, focusing on brightness and contrast in these areas while maintaining other features unchanged.
```
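
When scripting over both phases, it can help to keep the two prompts in a lookup table keyed by phase name. A minimal sketch, where the `PROMPTS` dictionary and the `get_prompt` helper are illustrative names rather than part of the released code:

```python
# Prompt text copied from the examples above; the keys are hypothetical labels
PROMPTS = {
    "arterial": (
        "Convert this non-contrast CT slice to mimic an arterial-phase "
        "contrast-enhanced CT. Brighten and enhance the aorta, major arteries, "
        "and adjacent organ boundaries to emphasize arterial flow, focusing on "
        "clarity and contrast in these areas while maintaining other features "
        "unchanged."
    ),
    "venous": (
        "Convert this non-contrast CT slice to mimic a venous-phase "
        "contrast-enhanced CT. Brighten and enhance the veins, especially the "
        "portal and hepatic veins, and emphasize organ boundaries to mimic "
        "venous flow, focusing on brightness and contrast in these areas while "
        "maintaining other features unchanged."
    ),
}


def get_prompt(phase):
    """Return the prompt for a phase name, ignoring case and whitespace."""
    key = phase.strip().lower()
    if key not in PROMPTS:
        raise ValueError(
            f"Unknown phase {phase!r}; expected one of {sorted(PROMPTS)}"
        )
    return PROMPTS[key]
```

Failing loudly on an unknown phase avoids silently running the model with an empty or mistyped prompt.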
### Full Pipeline Example
```python
import os
from glob import glob

import nibabel as nib
import numpy as np
import torch
from PIL import Image
from pydicom import dcmread
from tqdm import tqdm
from diffusers import StableDiffusionInstructPix2PixPipeline

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "mlii0117/sd1.5_MPECT", torch_dtype=torch.float16
).to("cuda")
generator = torch.Generator("cuda").manual_seed(0)

prompt_art = "Convert this non-contrast CT slice to mimic an arterial-phase contrast-enhanced CT. Brighten and enhance the aorta, major arteries, and adjacent organ boundaries to emphasize arterial flow, focusing on clarity and contrast in these areas while maintaining other features unchanged."
prompt_ven = "Convert this non-contrast CT slice to mimic a venous-phase contrast-enhanced CT. Brighten and enhance the veins, especially the portal and hepatic veins, and emphasize organ boundaries to mimic venous flow, focusing on brightness and contrast in these areas while maintaining other features unchanged."


# Read all DICOM slices from <dicom_folder>/DICOM and normalize them to [0, 1]
def load_dicom_folder(dicom_folder):
    dicom_folder = os.path.join(dicom_folder, "DICOM")
    # Assumes the file names sort in anatomical slice order
    dicom_files = sorted(glob(os.path.join(dicom_folder, "*")))
    slices = []
    for dicom_file in dicom_files:
        ds = dcmread(dicom_file)
        # Convert stored values to Hounsfield units using each slice's own rescale tags
        slope = float(getattr(ds, "RescaleSlope", 1.0))
        intercept = float(getattr(ds, "RescaleIntercept", 0.0))
        slices.append(ds.pixel_array.astype(np.float32) * slope + intercept)
    dicom_array = np.stack(slices, axis=0)
    dicom_array = np.clip(dicom_array, -1000, 1000)  # HU window
    dicom_array = (dicom_array + 1000) / 2000.0  # map to [0, 1]
    return dicom_array


# Convert each slice to RGB and run it through the diffusion pipeline
def process_slices(dicom_array, prompt):
    outputs = []
    for i in tqdm(range(dicom_array.shape[0])):
        slice_img = (dicom_array[i] * 255).astype(np.uint8)
        pil_img = Image.fromarray(slice_img).convert("RGB")
        edited_image = pipe(
            prompt,  # prompt_art or prompt_ven
            image=pil_img,
            num_inference_steps=20,
            image_guidance_scale=1.5,
            guidance_scale=10,
            generator=generator,
        ).images[0]
        gray = edited_image.convert("L")
        gray_np = np.array(gray).astype(np.float32) / 255.0
        gray_np = gray_np * 2000 - 1000  # scale back to [-1000, 1000]
        outputs.append(gray_np)
    volume = np.stack(outputs, axis=0)
    return volume


# Save the volume as .nii.gz (identity affine: voxel spacing is not preserved)
def save_nifti(volume, output_path):
    affine = np.eye(4)
    nii = nib.Nifti1Image(volume, affine)
    nib.save(nii, output_path)


# Main function: choose the phase by passing prompt_art or prompt_ven
def main(input_dicom_path, output_nifti_path, prompt=prompt_art):
    dicom_array = load_dicom_folder(input_dicom_path)
    edited_volume = process_slices(dicom_array, prompt)
    save_nifti(edited_volume, output_nifti_path)


# DEMO
# main("/path/to/dicom_folder", "/path/to/output.nii.gz", prompt=prompt_ven)
```
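
One consequence of the pipeline above is that intensities pass through 8-bit images, so the reconstructed volume is quantized. The sketch below reproduces only the normalization and rescaling arithmetic from the example (no diffusion model involved) to estimate the resulting step size. The helper names are assumptions made for illustration.

```python
import numpy as np

HU_MIN, HU_MAX = -1000.0, 1000.0


def hu_to_uint8(hu):
    """Clip to the HU window, normalize to [0, 1], and quantize to 8 bits."""
    norm = (np.clip(hu, HU_MIN, HU_MAX) - HU_MIN) / (HU_MAX - HU_MIN)
    return (norm * 255).astype(np.uint8)


def uint8_to_hu(img):
    """Map 8-bit gray levels back to the [-1000, 1000] HU window."""
    return img.astype(np.float32) / 255.0 * (HU_MAX - HU_MIN) + HU_MIN


# Sample values, including two that fall outside the window
hu = np.array([-1200.0, -1000.0, 0.0, 40.0, 1000.0, 3000.0], dtype=np.float32)
restored = uint8_to_hu(hu_to_uint8(hu))

# Width of one gray level in HU
step = (HU_MAX - HU_MIN) / 255.0

# Largest round-trip error among values that were inside the window
in_window = (hu >= HU_MIN) & (hu <= HU_MAX)
max_error = np.abs(restored - hu)[in_window].max()
```

Values outside the window are clipped, and values inside it are recovered to within one gray level (roughly 8 HU) before any change introduced by the model itself.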
---
## Intended Use
- Medical research and simulation
- Data augmentation for contrast-enhanced imaging
- Exploratory analysis in non-contrast → contrast CT enhancement
> **Disclaimer**: This model is for research purposes only. It is not intended for clinical decision-making or diagnostic use.
---
## Citation
```
@inproceedings{li2025text,
title={Text-Conditioned Latent Diffusion Model for Synthesis of Contrast-Enhanced CT from Non-Contrast CT},
author={Li, Mingjie and Chen, Yizheng and Xing, Lei and Gensheimer, Michael},
booktitle={AAPM Annual Meeting (Oral)},
year={2025}
}
```
---
## License
This model is released for **non-commercial research purposes only**. Please contact the authors if you wish to use it in clinical or commercial settings.