NoMAISI / README.md
ft42's picture
Update README.md
6c62721 verified
---
license: cc-by-nc-nd-3.0
base_model:
- MONAI/maisi_ct_generative
pipeline_tag: image-to-image
tags:
- medical-imaging
- 3d-synthesis
- diffusion-models
- controlnet
- monai
- pytorch
- rectified-flow
- ct-scans
- mri-imaging
- healthcare-ai
datasets:
- medical-images
- ct-scans
- anatomical-masks
---
# NoMAISI: Nodule-Oriented Medical AI for Synthetic Imaging
<div align="center">
<p align="center">
<img src="NoMAISI_logo.png" alt="PiNS Logo" width="500">
</p>
**Nodule-Oriented Medical AI for Synthetic Imaging and Augmentation in Chest CT**
[![License: CC BY-NC 4.0](https://img.shields.io/badge/License-CC%20BY--NC%204.0-lightgrey.svg)](https://creativecommons.org/licenses/by-nc/4.0/)
[![Docker](https://img.shields.io/badge/Docker-ft42%2Fpins%3Alatest-2496ED?logo=docker)](https://hub.docker.com/r/ft42/pins)
[![Python](https://img.shields.io/badge/Python-3.9+-green.svg)](https://python.org)
[![Medical Imaging](https://img.shields.io/badge/Medical-Imaging-red.svg)](https://simpleitk.org)
[![PyTorch](https://img.shields.io/badge/PyTorch-2.8.0-orange.svg)](https://pytorch.org)
[![MONAI](https://img.shields.io/badge/MONAI-1.4.0-blue.svg)](https://monai.io)
[![PiNS](https://img.shields.io/badge/PiNS-1.0.0-blue.svg)](https://github.com/fitushar/PiNS)
[![CaNA](https://img.shields.io/badge/CaNA-1.0.0-cyan.svg)](https://github.com/fitushar/CaNA)
</div>
# Abstract
Medical imaging datasets are increasingly available, yet abnormal and annotation-intensive cases such as lung nodules remain underrepresented. We Introduced NoMAISI (Nodule-Oriented Medical AI for Synthetic Imaging), a generative framework built on foundational backbones with flow-based diffusion and ControlNet conditioning. Using NoMAISI, we curated a large multi-cohort lung nodule dataset and applied context-aware nodule volume augmentation, including relocation, shrinkage to simulate early-stage disease, and expansion to model progression. Each case was rendered into multiple synthetic variants, producing a diverse and anatomically consistent dataset. Fidelity was evaluated with cross-cohort similarity metrics, and downstream integration into lung nodule detection, and classification tasks demonstrated improved external test performance, particularly in underrepresented lesion categories. These results show that nodule-oriented synthetic imaging and curated augmentation can complement clinical data, reduce annotation demands, and expand the availability of training resources for healthcare AI.
## 🧩 Workflow Overview
The overall pipeline for organ, body, and nodule segmentation with alignment is shown below:
<p align="center">
<img src="doc/images/workflow.png" alt="Segmentation Pipeline"/>
</p>
**Workflow** for constructing the **NoMAISI** development dataset. The pipeline includes **(1)** organ segmentation using AI models, **(2)** body segmentation with algorithmic methods, **(3)** nodule segmentation through AI-assisted and ML-based refinement, and **(4)** segmentation alignment to integrate organs, body, and nodules segmentations into anatomically consistent volumes.
<p align="center">
<img src="doc/images/NoMAISI_train_and_infer.png" alt="NoMAISI_train_and_infer"/>
</p>
**Overview** of our flow-based latent diffusion model with ControlNet conditioning for AI-based CT generation. The pipeline consists of three stages: **(top) Pretrained VAE** for image compression, where CT images are encoded into latent features using a frozen VAE; **(middle)** Model fine-tuning, where a **Rectified Flow ODE sampler**, conditioned on segmentation masks and voxel spacing through a **fine-tuned ControlNet**, predicts velocity fields in latent space and is optimized with a region-specific contrastive loss emphasizing ROI sensitivity and background consistency; and **(bottom) Inference**, where segmentation masks and voxel spacing guide latent sampling along the ODE trajectory to obtain a clean latent representation, which is then decoded by the VAE into full-resolution AI-generated CT images conditioned by body and lesion masks.
## πŸ“Š Dataset Composition
The table below summarizes the datasets included in this project, with their split sizes (Patients, CT scans, and Nodules) and the annotation types available.
| Dataset | Patients <br>n (%) | CT Scans <br>n (%) | Nodules <br>n (%) | Organ Seg | Nodule Seg | Nodule CCC | Nodule Box |
|------------------|---------------------|---------------------|-------------------|-----------|------------|------------|------------|
| **LNDbv4** | 223 (3.17) | 223 (2.52) | 1132 (7.84) | βœ— | βœ“ | βœ— | βœ“ |
| **NSCLC-R** | 415 (5.89) | 415 (4.69) | 415 (2.87) | βœ— | βœ“ | βœ— | βœ“ |
| **LIDC-IDRI** | 870 (12.35) | 870 (9.84) | 2584 (17.89) | βœ— | βœ“ | βœ“ | βœ“ |
| **DLCS-24** | 1605 (22.79) | 1605 (18.15) | 2478 (17.16) | βœ— | βœ“ | βœ— | βœ“ |
| **Intgmultiomics** | 1936 (27.49) | 1936 (21.90) | 1936 (13.40) | βœ— | βœ“ | βœ— | βœ— |
| **LUNA-25** | 1993 (28.30) | 3792 (42.89) | 5899 (40.84) | βœ— | βœ“ | βœ— | βœ“ |
| **TOTAL** | 7042 (100) | 8841 (100) | 14444 (100) | β€” | β€” | β€” | β€” |
---
**Notes**
- Percentages indicate proportion relative to the total for each column.
- βœ”οΈŽ = annotation available, βœ— = annotation not available.
- β€œNodule CCC” = nodule center coordinates.
- β€œNodule Box” = bounding-box annotations.
### πŸ“š Dataset citations References
* LNDbv4 : [https://zenodo.org/records/8348419](https://zenodo.org/records/8348419)
* NSCLC-Radiomics : [https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/](https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/)
* LIDC-IDRI: [https://ieee-dataport.org/documents/lung-image-database-consortium-image-collection-lidc-idri](https://ieee-dataport.org/documents/lung-image-database-consortium-image-collection-lidc-idri)
* DLCS24: [https://zenodo.org/records/13799069](https://zenodo.org/records/13799069)
* Intgmultiomics: [M Zhao et. al, Nat.Commun(2025).](https://www.nature.com/articles/s41467-024-55594-z#citeas)
* LUNA25: [https://luna25.grand-challenge.org/](https://luna25.grand-challenge.org/)
# AI-Generated CT Evaluations
### πŸ“‰ FrΓ©chet Inception Distance (FID) Results
FrΓ©chet Inception Distance (FID) of the **MAISI-v2** baseline and **NoMAISI** models with multiple public clinical datasets (test dataset) as the references (Lower is better).
| **FID (Avg.)** | **LNDbv4** | **NSCLC-R** | **LIDC-IDRI** | **DLCS-24** | **Intgmultiomics** | **LUNA-25** |
|-------------------|------------|-------------|---------------|-------------|--------------------|-------------|
| **Real** LNDbv4 | β€” | 5.13 | 1.49 | 1.05 | 2.40 | 1.98 |
| **Real** NSCLC-R | 5.13 | β€” | 3.12 | 3.66 | 1.56 | 2.65 |
| **Real** LIDC-IDRI | 1.49 | 3.12 | β€” | 0.79 | 1.44 | 0.75 |
| **Real** DLCS-24 | 1.05 | 3.66 | 0.79 | β€” | 1.56 | 1.00 |
| **Real** Intgmultiomics| 2.40 | 1.56 | 1.44 | 1.56 | β€” | 1.57 |
| **Real** LUNA-25 | 1.98 | 2.65 | 0.75 | 1.00 | 1.57 | β€” |
| **AI-Generated** MAISI-V2 | 3.15 | 5.21 | 2.70 | 2.32 | 2.82 | 1.69 |
| **AI-Generated** NoMAISI (ours) | 2.99 | 3.05 | 2.31 | 2.27 | 2.62 | 1.18 |
### πŸ“‰ FID Parity Plot
<p align="left">
<img src="doc/images/GanAI_fid_scatter_marker_legend.png" alt="Parity comparison of FID for real↔real vs AI-generated CT across datasets" width="500">
</p>
**Comparison of FrΓ©chet Inception Distance (FID) between real↔real and AI-generated CT datasets.** Each point represents a clinical dataset (**LNDbv4, NSCLC-R, LIDC-IDRI, DLCS24, Intgmultiomics, LUNA25**) under different generative models (**MAISI-V2, NoMAISI**).The x-axis shows the **median FID** computed between real datasets, while the y-axis shows the **FID of AI-generated data** compared to real.
The dashed diagonal line denotes **parity (y = x)**, where AI-generated fidelity would match real↔real fidelity.
### πŸ–ΌοΈ Example Results
**Comparison of CT generation from anatomical masks.**
- **Left:** Input organ/body segmentation mask.
- **Middle:** Generated CT slice using **MAISI-V2**.
- **Right:** Generated CT slice using **NoMAISI (ours)**.
- **Yellow boxes** highlight lung nodule regions for comparison.
<p align="center">
<img src="doc/images/DLCS_1419_ann0_slice134_triple.png" alt="Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks" width="1000">
</p>
<p align="center">
<img src="doc/images/DLCS_1508_ann0_slice46_triple.png" alt="Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks" width="1000">
</p>
<p align="center">
<img src="doc/images/DLCS_1453_ann0_slice204_triple.png" alt="Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks" width="1000">
</p>
# Inference Guide
1. [Project Structure](###project-structure)
2. [Configuration Files](###configuration-files)
### Model Weights
Model weights are available upon request. Please email the authors: <tushar.ece@duke.edu>.
### πŸ“ Project Structure
```
NoMAISI/
β”œβ”€β”€ configs/ # Configuration files
β”‚ β”œβ”€β”€ config_maisi3d-rflow.json # Main model configuration
β”‚ β”œβ”€β”€ infr_env_NoMAISI_DLCSD24_demo.json # Environment settings
β”‚ └── infr_config_NoMAISI_controlnet.json # ControlNet inference config
β”œβ”€β”€ scripts/ # Python inference scripts
β”‚ β”œβ”€β”€ infer_testV2_controlnet.py # Main inference script
β”‚ β”œβ”€β”€ infer_controlnet.py # ControlNet inference
β”‚ └── utils.py # Utility functions
β”œβ”€β”€ models/ # Pre-trained model weights
β”œβ”€β”€ data/ # Input data directory
β”œβ”€β”€ outputs/ # Generated results
β”œβ”€β”€ logs/ # Execution logs
└── inference.sub # SLURM job script
```
## βš™οΈ Configuration Files
#### 1. Main Model Configuration (`config_maisi3d-rflow.json`): Controls the core diffusion model parameters:
- Model architecture settings; Sampling parameters; Image dimensions and spacing
#### 2. Environment Configuration (`infr_env_NoMAISI_DLCSD24_demo.json`): Defines runtime environment
- Data paths and directories; GPU settings; Memory allocation
#### 3. ControlNet Configuration (`infr_config_NoMAISI_controlnet.json`): ControlNet-specific settings
- Conditioning parameters; Generation controls; Output specifications
## πŸš€ Running Inference
```bash
cd /path/NoMAISI/
# Create logs directory if it doesn't exist
mkdir -p logs
# Submit job to SLURM
sbatch inference.sub
```
```bash
# Run inference directly
cd /path/NoMAISI/
python -m scripts.infer_testV2_controlnet \
-c ./configs/config_maisi3d-rflow.json \
-e ./configs/infr_env_NoMAISI_DLCSD24_demo.json \
-t ./configs/infr_config_NoMAISI_controlnet.json
```
# Downstream Task:
* **Cancer vs. No-Cancer Classification**
* **Nodule Detection** ![Coming Soon](https://img.shields.io/badge/Status-Coming%20Soon-orange).
* **Nodule Segmentation** ![Coming Soon](https://img.shields.io/badge/Status-Coming%20Soon-orange).
---
---
## πŸ”¬ Downstream Task: Cancer vs. No-Cancer Classification
![Cancer/No-Cancer Classification Results](doc/images/TaskCls.png)
**Shown.** AUC vs. the **% of clinical data retained** (x-axis: **100%**, **50%**, **20%**, **10%**).
**Curves (additive augmentation β€” we **add** AI-generated nodules; we never replace clinical samples):**
- **Clinical (LUNA25)** β€” baseline using only the retained clinical data.
- **Clinical + AI-gen. (n%)** β€” at each point, add AI-generated data equal to the **same percentage as the retained clinical fraction**.
*Examples:* at **50% clinical β†’ +50% AI-gen**; **20% β†’ +20%**; **10% β†’ +10%**.
- **Clinical + AI-gen. (100%)** β€” at each point, add AI-generated data equal to **100% of the full clinical dataset size**, regardless of the retained fraction.
*Example:* at **10% clinical β†’ +100% AI-gen**.
**Takeaways**
- **AI-generated nodules improve data-efficiency:** at **low clinical fractions (50%β†’10%)**, *Clinical + AI-gen. (n%)* typically **matches or exceeds** clinical-only AUC.
- **Bigger synthetic boosts (100%)** can help in some regimes but may underperform the matched *n%* mix depending on cohort β†’ **ratio-balanced augmentation** is often safer.
- Trends **generalize to external cohorts**, indicating **usability** beyond the development data.
---
# Acknowledgements
We gratefully acknowledge the open-source projects that directly informed this repository: the [MAISI tutorial](https://github.com/Project-MONAI/tutorials/tree/main/generation/maisi) from the Project MONAI tutorials, the broader [Project MONAI ecosystem](https://github.com/Project-MONAI),
our related benchmark repo [AI in Lung Health – Benchmarking](https://github.com/fitushar/AI-in-Lung-Health-Benchmarking-Detection-and-Diagnostic-Models-Across-Multiple-CT-Scan-Datasets),
and our companion toolkits [PiNS – Point-driven Nodule Segmentation](https://github.com/fitushar/PiNS)
and [CaNA – Context-Aware Nodule Augmentation](https://github.com/fitushar/CaNA). We thank these communities and contributors for their exceptional open-source efforts. If you use our models or code, please also consider citing these works (alongside this repository) to acknowledge their contributions.
# References
* [1] [MAISI-V2; Guo, Pengfei, et al.(2025)](https://arxiv.org/abs/2508.05772)
* [2] [AI in Lung Health- Benchmarking; Tushar et al.(2024)](https://arxiv.org/abs/2405.04605)
* [3] [SYN-LUNGS; Tushar et al.(2025)](https://arxiv.org/abs/2502.21187)