---
language:
- en
tags:
- 3d
- medical
- image-generation
- diffusion-model
pipeline_tag: image-to-3d
arxiv: 2412.13059
license: mit
---
# 3D MedDiffusion: A 3D Medical Latent Diffusion Model for Controllable and High-quality Medical Image Generation
This is the official model repository of the paper "3D MedDiffusion: A 3D Medical Latent Diffusion Model for Controllable and High-quality Medical Image Generation".
3D MedDiffusion is a 3D medical image synthesis framework capable of generating high-quality medical images across multiple modalities and organs. It incorporates a novel, highly efficient Patch-Volume Autoencoder for latent-space compression and a new noise estimator that captures both local details and global structural information during diffusion denoising. This enables the generation of finely detailed, high-resolution images (up to 512×512×512) and provides strong generalizability across downstream tasks such as sparse-view CT reconstruction, fast MRI reconstruction, and data augmentation.
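As a back-of-the-envelope illustration of why latent-space compression makes 512³ generation tractable, the sketch below (not the official code; float32 storage and a single channel are assumptions) compares the raw volume with the latent grid produced by 8× spatial downsampling:

```python
# Hypothetical size comparison: full-resolution volume vs. its latent grid.
# The 8x factor matches the repo's "8x downsampling" checkpoint; float32
# single-channel storage is an assumption for illustration.

def n_bytes(shape, bytes_per_voxel=4):
    """Raw size of a single-channel volume in bytes (float32 by default)."""
    n = bytes_per_voxel
    for d in shape:
        n *= d
    return n

image_shape = (512, 512, 512)
factor = 8
latent_shape = tuple(d // factor for d in image_shape)

print(latent_shape)                                    # (64, 64, 64)
print(n_bytes(image_shape) // 2**20)                   # 512 MiB per channel
print(n_bytes(image_shape) // n_bytes(latent_shape))   # 512x fewer positions to denoise
```

The diffusion model therefore denoises a grid with 512× fewer spatial positions than the output volume, which is what keeps 512³ generation within a single GPU's memory budget.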
For more information, please refer to our paper (arXiv:2412.13059) and our GitHub repository.
## Installation
```bash
# Clone this repo
git clone https://github.com/ShanghaiTech-IMPACT/3D-MedDiffusion.git
cd 3D-MedDiffusion

# Set up the environment
conda create -n 3DMedDiffusion python=3.11.11
conda activate 3DMedDiffusion
pip install -r requirements.txt
```
## Pretrained Models
The pretrained checkpoints are provided here.
Please download the checkpoints and place them in `./checkpoints`.
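After downloading, the `checkpoints` directory should look roughly as follows (the filenames are taken from the inference commands below; adjust if the released files are named differently):

```shell
mkdir -p checkpoints
# Expected layout after copying the downloaded files:
#   checkpoints/PatchVolume_8x_s2.ckpt
#   checkpoints/BiFlowNet_0453500.pt
#   checkpoints/PatchVolume_4x_s2.ckpt
#   checkpoints/BiFlowNet_4x.pt
```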
## Inference
Make sure your GPU has at least 40 GB of memory available to run inference at all supported resolutions.
### Generation using 8× downsampling

```bash
python evaluation/class_conditional_generation.py --AE-ckpt checkpoints/PatchVolume_8x_s2.ckpt --model-ckpt checkpoints/BiFlowNet_0453500.pt --output-dir input/your/save/dir
```
### Generation using 4× downsampling

```bash
python evaluation/class_conditional_generation_4x.py --AE-ckpt checkpoints/PatchVolume_4x_s2.ckpt --model-ckpt checkpoints/BiFlowNet_4x.pt --output-dir input/your/save/dir
```
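The two entry points differ in the autoencoder's spatial compression. A rough sketch of the trade-off (latent grid sizes are inferred from the checkpoint names, not taken from the official code):

```python
# Hypothetical comparison of the latent grids produced by the 8x and 4x
# autoencoder variants for a 512^3 input volume.

def latent_grid(volume_shape, factor):
    """Spatial shape of the latent grid for a given downsampling factor."""
    if any(d % factor for d in volume_shape):
        raise ValueError("volume shape must be divisible by the factor")
    return tuple(d // factor for d in volume_shape)

vol = (512, 512, 512)
g8 = latent_grid(vol, 8)   # (64, 64, 64)
g4 = latent_grid(vol, 4)   # (128, 128, 128)

# The 4x model denoises 8x as many latent positions, trading memory and
# compute for a less aggressive compression of fine detail.
ratio = (g4[0] * g4[1] * g4[2]) // (g8[0] * g8[1] * g8[2])
print(g8, g4, ratio)       # (64, 64, 64) (128, 128, 128) 8
```

In practice this means the 4× pipeline needs noticeably more GPU memory per volume, which is consistent with the 40 GB recommendation above.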