---
license: cc-by-nc-nd-3.0
base_model:
- MONAI/maisi_ct_generative
pipeline_tag: image-to-image
tags:
- medical-imaging
- 3d-synthesis
- diffusion-models
- controlnet
- monai
- pytorch
- rectified-flow
- ct-scans
- mri-imaging
- healthcare-ai
datasets:
- medical-images
- ct-scans
- anatomical-masks
---
# NoMAISI: Nodule-Oriented Medical AI for Synthetic Imaging
<div align="center">
<p align="center">
  <img src="NoMAISI_logo.png" alt="NoMAISI Logo" width="500">
</p>
**Nodule-Oriented Medical AI for Synthetic Imaging and Augmentation in Chest CT**
[License: CC BY-NC 4.0](https://creativecommons.org/licenses/by-nc/4.0/)
[Docker: ft42/pins](https://hub.docker.com/r/ft42/pins)
[Python](https://python.org)
[SimpleITK](https://simpleitk.org)
[PyTorch](https://pytorch.org)
[MONAI](https://monai.io)
[PiNS](https://github.com/fitushar/PiNS)
[CaNA](https://github.com/fitushar/CaNA)
</div>
# Abstract
Medical imaging datasets are increasingly available, yet abnormal and annotation-intensive cases such as lung nodules remain underrepresented. We introduce NoMAISI (Nodule-Oriented Medical AI for Synthetic Imaging), a generative framework built on foundation-model backbones with flow-based diffusion and ControlNet conditioning. Using NoMAISI, we curated a large multi-cohort lung nodule dataset and applied context-aware nodule volume augmentation, including relocation, shrinkage to simulate early-stage disease, and expansion to model progression. Each case was rendered into multiple synthetic variants, producing a diverse and anatomically consistent dataset. Fidelity was evaluated with cross-cohort similarity metrics, and downstream integration into lung nodule detection and classification tasks demonstrated improved external test performance, particularly in underrepresented lesion categories. These results show that nodule-oriented synthetic imaging and curated augmentation can complement clinical data, reduce annotation demands, and expand the availability of training resources for healthcare AI.
## 🧩 Workflow Overview
The overall pipeline for organ, body, and nodule segmentation with alignment is shown below:
<p align="center">
<img src="doc/images/workflow.png" alt="Segmentation Pipeline"/>
</p>
**Workflow** for constructing the **NoMAISI** development dataset. The pipeline includes **(1)** organ segmentation using AI models, **(2)** body segmentation with algorithmic methods, **(3)** nodule segmentation through AI-assisted and ML-based refinement, and **(4)** segmentation alignment to integrate organ, body, and nodule segmentations into anatomically consistent volumes.
<p align="center">
<img src="doc/images/NoMAISI_train_and_infer.png" alt="NoMAISI_train_and_infer"/>
</p>
**Overview** of our flow-based latent diffusion model with ControlNet conditioning for AI-based CT generation. The pipeline consists of three stages: **(top) Pretrained VAE** for image compression, where CT images are encoded into latent features using a frozen VAE; **(middle)** Model fine-tuning, where a **Rectified Flow ODE sampler**, conditioned on segmentation masks and voxel spacing through a **fine-tuned ControlNet**, predicts velocity fields in latent space and is optimized with a region-specific contrastive loss emphasizing ROI sensitivity and background consistency; and **(bottom) Inference**, where segmentation masks and voxel spacing guide latent sampling along the ODE trajectory to obtain a clean latent representation, which is then decoded by the VAE into full-resolution AI-generated CT images conditioned by body and lesion masks.
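The Rectified Flow sampling described above can be illustrated numerically. The toy sketch below (NumPy stand-ins for the latent tensors; the real pipeline predicts the velocity with a ControlNet-conditioned network and decodes with the VAE) shows why integrating the ODE backwards along the velocity field recovers the clean latent: on the straight rectified-flow path, the ground-truth velocity is simply the noise minus the clean latent.

```python
import numpy as np

# Toy rectified-flow setup: the straight-path interpolation z_t = (1 - t) * z0 + t * z1
# has constant velocity v = z1 - z0. A model trained to predict v lets us recover the
# clean latent z0 from noise z1 by integrating the ODE dz/dt = v from t = 1 down to t = 0.
rng = np.random.default_rng(0)
z0 = rng.normal(size=(4, 4))          # "clean" latent (stand-in for the VAE encoding)
z1 = rng.normal(size=(4, 4))          # Gaussian noise sample

def velocity(z_t, t):
    # Stand-in for the ControlNet-conditioned velocity network; on the straight
    # rectified-flow path the ground-truth velocity is constant: z1 - z0.
    return z1 - z0

def sample(z_noise, num_steps=10):
    """Euler integration of the flow ODE from t = 1 (noise) to t = 0 (clean latent)."""
    z, dt = z_noise.copy(), 1.0 / num_steps
    for i in range(num_steps):
        t = 1.0 - i * dt
        z = z - dt * velocity(z, t)   # step backwards along the velocity field
    return z

print(np.allclose(sample(z1), z0))    # True: the straight path is integrated exactly
```

In the actual model the velocity depends on the current latent, the timestep, and the mask/spacing conditioning, so more integration steps are needed; the constant-velocity toy is only meant to make the ODE-sampler mechanics concrete.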
## 📊 Dataset Composition
The table below summarizes the datasets included in this project, with their split sizes (Patients, CT scans, and Nodules) and the annotation types available.
| Dataset | Patients <br>n (%) | CT Scans <br>n (%) | Nodules <br>n (%) | Organ Seg | Nodule Seg | Nodule CCC | Nodule Box |
|------------------|---------------------|---------------------|-------------------|-----------|------------|------------|------------|
| **LNDbv4** | 223 (3.17) | 223 (2.52) | 1132 (7.84) | ✔️ | ✔️ | ✔️ | ✔️ |
| **NSCLC-R** | 415 (5.89) | 415 (4.69) | 415 (2.87) | ✔️ | ✔️ | ✔️ | ✔️ |
| **LIDC-IDRI** | 870 (12.35) | 870 (9.84) | 2584 (17.89) | ✔️ | ✔️ | ✔️ | ✔️ |
| **DLCS-24** | 1605 (22.79) | 1605 (18.15) | 2478 (17.16) | ✔️ | ✔️ | ✔️ | ✔️ |
| **Intgmultiomics** | 1936 (27.49) | 1936 (21.90) | 1936 (13.40) | ✔️ | ✔️ | ✔️ | ✔️ |
| **LUNA-25** | 1993 (28.30) | 3792 (42.89) | 5899 (40.84) | ✔️ | ✔️ | ✔️ | ✔️ |
| **TOTAL** | 7042 (100) | 8841 (100) | 14444 (100) | ✔️ | ✔️ | ✔️ | ✔️ |
---
**Notes**
- Percentages indicate proportion relative to the total for each column.
- ✔️ = annotation available, ✖ = annotation not available.
- "Nodule CCC" = nodule center coordinates.
- "Nodule Box" = bounding-box annotations.
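As a quick sanity check, the percentage columns can be reproduced from the raw counts. A minimal sketch, using the patient counts from the table above:

```python
# Patient counts per cohort, taken from the Dataset Composition table;
# each percentage is n / total * 100, rounded to two decimals.
patients = {"LNDbv4": 223, "NSCLC-R": 415, "LIDC-IDRI": 870,
            "DLCS-24": 1605, "Intgmultiomics": 1936, "LUNA-25": 1993}
total = sum(patients.values())                        # 7042
shares = {k: round(100 * v / total, 2) for k, v in patients.items()}
print(total, shares["LNDbv4"], shares["LUNA-25"])     # 7042 3.17 28.3
```

The same computation applied to the CT-scan and nodule columns reproduces their percentage entries as well.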
### 📚 Dataset References
* LNDbv4 : [https://zenodo.org/records/8348419](https://zenodo.org/records/8348419)
* NSCLC-Radiomics : [https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/](https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/)
* LIDC-IDRI: [https://ieee-dataport.org/documents/lung-image-database-consortium-image-collection-lidc-idri](https://ieee-dataport.org/documents/lung-image-database-consortium-image-collection-lidc-idri)
* DLCS24: [https://zenodo.org/records/13799069](https://zenodo.org/records/13799069)
* Intgmultiomics: [M. Zhao et al., Nat. Commun. (2025)](https://www.nature.com/articles/s41467-024-55594-z#citeas)
* LUNA25: [https://luna25.grand-challenge.org/](https://luna25.grand-challenge.org/)
# AI-Generated CT Evaluations
### 📈 Fréchet Inception Distance (FID) Results
Fréchet Inception Distance (FID) of the **MAISI-V2** baseline and **NoMAISI** models, computed with multiple public clinical datasets (test sets) as references (lower is better).
| **FID (Avg.)** | **LNDbv4** | **NSCLC-R** | **LIDC-IDRI** | **DLCS-24** | **Intgmultiomics** | **LUNA-25** |
|-------------------|------------|-------------|---------------|-------------|--------------------|-------------|
| **Real** LNDbv4 | – | 5.13 | 1.49 | 1.05 | 2.40 | 1.98 |
| **Real** NSCLC-R | 5.13 | – | 3.12 | 3.66 | 1.56 | 2.65 |
| **Real** LIDC-IDRI | 1.49 | 3.12 | – | 0.79 | 1.44 | 0.75 |
| **Real** DLCS-24 | 1.05 | 3.66 | 0.79 | – | 1.56 | 1.00 |
| **Real** Intgmultiomics | 2.40 | 1.56 | 1.44 | 1.56 | – | 1.57 |
| **Real** LUNA-25 | 1.98 | 2.65 | 0.75 | 1.00 | 1.57 | – |
| **AI-Generated** MAISI-V2 | 3.15 | 5.21 | 2.70 | 2.32 | 2.82 | 1.69 |
| **AI-Generated** NoMAISI (ours) | 2.99 | 3.05 | 2.31 | 2.27 | 2.62 | 1.18 |
### 📊 FID Parity Plot
<p align="left">
<img src="doc/images/GanAI_fid_scatter_marker_legend.png" alt="Parity comparison of FID for realβreal vs AI-generated CT across datasets" width="500">
</p>
**Comparison of Fréchet Inception Distance (FID) between real–real and AI-generated CT datasets.** Each point represents a clinical dataset (**LNDbv4, NSCLC-R, LIDC-IDRI, DLCS24, Intgmultiomics, LUNA25**) under different generative models (**MAISI-V2, NoMAISI**). The x-axis shows the **median FID** computed between real datasets, while the y-axis shows the **FID of AI-generated data** compared to real.
The dashed diagonal line denotes **parity (y = x)**, where AI-generated fidelity would match real–real fidelity.
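The parity comparison can be reproduced directly from the FID table above: for each reference dataset, take the median of its off-diagonal real–real FIDs as the x-axis baseline and compare the AI-generated FID against it. A minimal sketch:

```python
import statistics

# Off-diagonal FID values between real datasets, read column-wise from the FID table;
# the median of each column is the "real-real" baseline on the parity x-axis.
real_real = {
    "LNDbv4":         [5.13, 1.49, 1.05, 2.40, 1.98],
    "NSCLC-R":        [5.13, 3.12, 3.66, 1.56, 2.65],
    "LIDC-IDRI":      [1.49, 3.12, 0.79, 1.44, 0.75],
    "DLCS-24":        [1.05, 3.66, 0.79, 1.56, 1.00],
    "Intgmultiomics": [2.40, 1.56, 1.44, 1.56, 1.57],
    "LUNA-25":        [1.98, 2.65, 0.75, 1.00, 1.57],
}
nomaisi = {"LNDbv4": 2.99, "NSCLC-R": 3.05, "LIDC-IDRI": 2.31,
           "DLCS-24": 2.27, "Intgmultiomics": 2.62, "LUNA-25": 1.18}

for name, fids in real_real.items():
    baseline = statistics.median(fids)
    # Points below the y = x line mean the synthetic data is closer to the
    # reference than the typical real cohort is.
    status = "below parity" if nomaisi[name] < baseline else "above parity"
    print(f"{name}: real-real median {baseline:.2f}, NoMAISI {nomaisi[name]:.2f} ({status})")
```

For example, NoMAISI falls below parity on NSCLC-R (3.05 vs. a 3.12 real–real median) and on LUNA-25 (1.18 vs. 1.57).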
### 🖼️ Example Results
**Comparison of CT generation from anatomical masks.**
- **Left:** Input organ/body segmentation mask.
- **Middle:** Generated CT slice using **MAISI-V2**.
- **Right:** Generated CT slice using **NoMAISI (ours)**.
- **Yellow boxes** highlight lung nodule regions for comparison.
<p align="center">
<img src="doc/images/DLCS_1419_ann0_slice134_triple.png" alt="Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks" width="1000">
</p>
<p align="center">
<img src="doc/images/DLCS_1508_ann0_slice46_triple.png" alt="Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks" width="1000">
</p>
<p align="center">
<img src="doc/images/DLCS_1453_ann0_slice204_triple.png" alt="Comparison of MAISI-V2 vs NoMAISI on lung CT with input masks" width="1000">
</p>
# Inference Guide
1. [Project Structure](#project-structure)
2. [Configuration Files](#configuration-files)
### Model Weights
Model weights are available upon request. Please email the authors: <tushar.ece@duke.edu>.
### 📁 Project Structure
```
NoMAISI/
├── configs/                                # Configuration files
│   ├── config_maisi3d-rflow.json           # Main model configuration
│   ├── infr_env_NoMAISI_DLCSD24_demo.json  # Environment settings
│   └── infr_config_NoMAISI_controlnet.json # ControlNet inference config
├── scripts/                                # Python inference scripts
│   ├── infer_testV2_controlnet.py          # Main inference script
│   ├── infer_controlnet.py                 # ControlNet inference
│   └── utils.py                            # Utility functions
├── models/                                 # Pre-trained model weights
├── data/                                   # Input data directory
├── outputs/                                # Generated results
├── logs/                                   # Execution logs
└── inference.sub                           # SLURM job script
```
## ⚙️ Configuration Files
#### 1. Main Model Configuration (`config_maisi3d-rflow.json`)
Controls the core diffusion model parameters:
- Model architecture settings
- Sampling parameters
- Image dimensions and spacing
#### 2. Environment Configuration (`infr_env_NoMAISI_DLCSD24_demo.json`)
Defines the runtime environment:
- Data paths and directories
- GPU settings
- Memory allocation
#### 3. ControlNet Configuration (`infr_config_NoMAISI_controlnet.json`)
ControlNet-specific settings:
- Conditioning parameters
- Generation controls
- Output specifications
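Splitting the configuration into model, environment, and ControlNet layers means the layers have to be combined at load time. The sketch below shows one plausible way to do that; the keys (`steps`, `spacing`, `output_dir`) are illustrative placeholders, not the actual fields of the repository's JSON files, and the throwaway temp files stand in for the three configs listed above.

```python
import json
import os
import tempfile

def load_configs(*paths):
    """Merge JSON config layers; later files override earlier ones."""
    merged = {}
    for path in paths:
        with open(path) as f:
            merged.update(json.load(f))
    return merged

# Demo with throwaway files standing in for the three config layers.
tmp = tempfile.mkdtemp()
layers = {"model.json": {"spacing": [1.0, 1.0, 1.0], "steps": 30},
          "env.json":   {"output_dir": "./outputs"},
          "ctrl.json":  {"steps": 10}}                # overrides the model default
for name, data in layers.items():
    with open(os.path.join(tmp, name), "w") as f:
        json.dump(data, f)

cfg = load_configs(*(os.path.join(tmp, n) for n in ("model.json", "env.json", "ctrl.json")))
print(cfg["steps"], cfg["output_dir"])                # 10 ./outputs
```

Last-layer-wins merging keeps task-specific ControlNet settings from silently inheriting stale model defaults; whether the actual inference script merges this way or keeps the three dictionaries separate is an implementation detail of `infer_testV2_controlnet.py`.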
## 🚀 Running Inference
```bash
cd /path/NoMAISI/
# Create logs directory if it doesn't exist
mkdir -p logs
# Submit job to SLURM
sbatch inference.sub
```
```bash
# Run inference directly
cd /path/NoMAISI/
python -m scripts.infer_testV2_controlnet \
-c ./configs/config_maisi3d-rflow.json \
-e ./configs/infr_env_NoMAISI_DLCSD24_demo.json \
-t ./configs/infr_config_NoMAISI_controlnet.json
```
# Downstream Tasks
* **Cancer vs. No-Cancer Classification**
* **Nodule Detection**
* **Nodule Segmentation**
---
## 🔬 Downstream Task: Cancer vs. No-Cancer Classification

**Shown.** AUC vs. the **% of clinical data retained** (x-axis: **100%**, **50%**, **20%**, **10%**).
**Curves** (additive augmentation: we *add* AI-generated nodules; we never replace clinical samples):
- **Clinical (LUNA25)**: baseline using only the retained clinical data.
- **Clinical + AI-gen. (n%)**: at each point, add AI-generated data equal to the **same percentage as the retained clinical fraction**.
  *Examples:* at **50% clinical → +50% AI-gen**; **20% → +20%**; **10% → +10%**.
- **Clinical + AI-gen. (100%)**: at each point, add AI-generated data equal to **100% of the full clinical dataset size**, regardless of the retained fraction.
  *Example:* at **10% clinical → +100% AI-gen**.
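The two mixing rules above are easy to make concrete. A minimal sketch, assuming a hypothetical full clinical set of 1,000 scans (the actual cohort sizes are given in the Dataset Composition table):

```python
# Additive augmentation rules from the curves above, with a hypothetical
# full clinical set of 1000 scans. Clinical samples are only added to, never replaced.
FULL = 1000

def mix(retained_frac, matched=True):
    """Return (clinical scans kept, AI-generated scans added) for one curve point."""
    clinical = int(FULL * retained_frac)
    # matched ("n%"): AI data equal to the retained clinical fraction;
    # unmatched ("100%"): AI data equal to the full clinical set size.
    ai = clinical if matched else FULL
    return clinical, ai

for frac in (1.0, 0.5, 0.2, 0.1):
    c, a_n = mix(frac, matched=True)
    _, a_full = mix(frac, matched=False)
    print(f"{int(frac * 100):>3}% clinical: {c} clinical + {a_n} AI (n%) or + {a_full} AI (100%)")
```

So at the 10% point, the n% curve trains on 100 clinical + 100 synthetic scans, while the 100% curve trains on 100 clinical + 1,000 synthetic scans.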
**Takeaways**
- **AI-generated nodules improve data efficiency:** at **low clinical fractions (50%–10%)**, *Clinical + AI-gen. (n%)* typically **matches or exceeds** clinical-only AUC.
- **Bigger synthetic boosts (100%)** can help in some regimes but may underperform the matched *n%* mix depending on cohort; **ratio-balanced augmentation** is often safer.
- Trends **generalize to external cohorts**, indicating **utility** beyond the development data.
---
# Acknowledgements
We gratefully acknowledge the open-source projects that directly informed this repository: the [MAISI tutorial](https://github.com/Project-MONAI/tutorials/tree/main/generation/maisi) from the Project MONAI tutorials, the broader [Project MONAI ecosystem](https://github.com/Project-MONAI),
our related benchmark repo [AI in Lung Health β Benchmarking](https://github.com/fitushar/AI-in-Lung-Health-Benchmarking-Detection-and-Diagnostic-Models-Across-Multiple-CT-Scan-Datasets),
and our companion toolkits [PiNS β Point-driven Nodule Segmentation](https://github.com/fitushar/PiNS)
and [CaNA β Context-Aware Nodule Augmentation](https://github.com/fitushar/CaNA). We thank these communities and contributors for their exceptional open-source efforts. If you use our models or code, please also consider citing these works (alongside this repository) to acknowledge their contributions.
# References
* [1] [MAISI-V2; Guo, Pengfei, et al. (2025)](https://arxiv.org/abs/2508.05772)
* [2] [AI in Lung Health – Benchmarking; Tushar et al. (2024)](https://arxiv.org/abs/2405.04605)
* [3] [SYN-LUNGS; Tushar et al. (2025)](https://arxiv.org/abs/2502.21187)