---
license: cc-by-nc-nd-3.0
base_model:
- MONAI/maisi_ct_generative
pipeline_tag: image-to-image
tags:
- medical-imaging
- 3d-synthesis
- diffusion-models
- controlnet
- monai
- pytorch
- rectified-flow
- ct-scans
- mri-imaging
- healthcare-ai
datasets:
- medical-images
- ct-scans
- anatomical-masks
---
# NoMAISI: Nodule-Oriented Medical AI for Synthetic Imaging
**Nodule-Oriented Medical AI for Synthetic Imaging and Augmentation in Chest CT**
# Abstract
Medical imaging datasets are increasingly available, yet abnormal and annotation-intensive cases such as lung nodules remain underrepresented. We introduce NoMAISI (Nodule-Oriented Medical AI for Synthetic Imaging), a generative framework built on foundational backbones with flow-based diffusion and ControlNet conditioning. Using NoMAISI, we curated a large multi-cohort lung nodule dataset and applied context-aware nodule volume augmentation, including relocation, shrinkage to simulate early-stage disease, and expansion to model progression. Each case was rendered into multiple synthetic variants, producing a diverse and anatomically consistent dataset. Fidelity was evaluated with cross-cohort similarity metrics, and downstream integration into lung nodule detection and classification tasks demonstrated improved external test performance, particularly in underrepresented lesion categories. These results show that nodule-oriented synthetic imaging and curated augmentation can complement clinical data, reduce annotation demands, and expand the availability of training resources for healthcare AI.
## Workflow Overview
The overall pipeline for organ, body, and nodule segmentation with alignment is shown below:
**Workflow** for constructing the **NoMAISI** development dataset. The pipeline includes **(1)** organ segmentation using AI models, **(2)** body segmentation with algorithmic methods, **(3)** nodule segmentation through AI-assisted and ML-based refinement, and **(4)** segmentation alignment to integrate the organ, body, and nodule segmentations into anatomically consistent volumes.
**Overview** of our flow-based latent diffusion model with ControlNet conditioning for AI-based CT generation. The pipeline consists of three stages: **(top) Pretrained VAE** for image compression, where CT images are encoded into latent features using a frozen VAE; **(middle)** Model fine-tuning, where a **Rectified Flow ODE sampler**, conditioned on segmentation masks and voxel spacing through a **fine-tuned ControlNet**, predicts velocity fields in latent space and is optimized with a region-specific contrastive loss emphasizing ROI sensitivity and background consistency; and **(bottom) Inference**, where segmentation masks and voxel spacing guide latent sampling along the ODE trajectory to obtain a clean latent representation, which is then decoded by the VAE into full-resolution AI-generated CT images conditioned by body and lesion masks.
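
The inference stage described above integrates a velocity field along an ODE trajectory. As a rough illustration only, the sketch below replaces the trained, ControlNet-conditioned network with a hand-written toy velocity field and uses a plain Euler integrator. The function names (`toy_velocity`, `euler_sample`), the time convention (t running from 0 to 1), and the step count are assumptions made for this example, not details of the NoMAISI code.

```python
from typing import Callable, List

Vector = List[float]


def toy_velocity(x: Vector, t: float, target: Vector) -> Vector:
    """Hypothetical stand-in for the learned velocity network.

    Points straight from the current state toward `target`, scaled so that
    following it until t = 1 lands on the target (a straight-line flow).
    """
    remaining = 1.0 - t
    return [(tgt - xi) / remaining for xi, tgt in zip(x, target)]


def euler_sample(
    x0: Vector,
    velocity: Callable[[Vector, float], Vector],
    num_steps: int = 10,
) -> Vector:
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt
        v = velocity(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy "noise" latent and toy "clean" latent (made-up numbers).
noise_latent = [0.8, -1.2, 0.3]
clean_latent = [0.1, 0.5, -0.4]

sample = euler_sample(noise_latent, lambda x, t: toy_velocity(x, t, clean_latent))
print(sample)
```

In the actual pipeline the velocity would come from the fine-tuned network conditioned on segmentation masks and voxel spacing, and the final latent would be decoded by the VAE; none of that is modeled here.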
## Dataset Composition
The table below summarizes the datasets included in this project, with their split sizes (Patients, CT scans, and Nodules) and the annotation types available.
| Dataset | Patients, n (%) | CT Scans, n (%) | Nodules, n (%) | Organ Seg | Nodule Seg | Nodule CCC | Nodule Box |
|------------------|---------------------|---------------------|-------------------|-----------|------------|------------|------------|
| **LNDbv4** | 223 (3.17) | 223 (2.52) | 1132 (7.84) | β | β | β | β |
| **NSCLC-R** | 415 (5.89) | 415 (4.69) | 415 (2.87) | β | β | β | β |
| **LIDC-IDRI** | 870 (12.35) | 870 (9.84) | 2584 (17.89) | β | β | β | β |
| **DLCS-24** | 1605 (22.79) | 1605 (18.15) | 2478 (17.16) | β | β | β | β |
| **Intgmultiomics** | 1936 (27.49) | 1936 (21.90) | 1936 (13.40) | β | β | β | β |
| **LUNA-25** | 1993 (28.30) | 3792 (42.89) | 5899 (40.84) | β | β | β | β |
| **TOTAL** | 7042 (100) | 8841 (100) | 14444 (100) | β | β | β | β |
---
**Notes**
- Percentages indicate proportion relative to the total for each column.
- ✔️ = annotation available, ❌ = annotation not available.
- "Nodule CCC" = nodule center coordinates.
- "Nodule Box" = bounding-box annotations.
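
As a small worked example of how the percentages are derived, the sketch below recomputes the Patients column from the counts in the table above. The helper name `column_shares` is introduced here purely for illustration.

```python
# Patient counts per dataset, copied from the table above.
patients = {
    "LNDbv4": 223,
    "NSCLC-R": 415,
    "LIDC-IDRI": 870,
    "DLCS-24": 1605,
    "Intgmultiomics": 1936,
    "LUNA-25": 1993,
}


def column_shares(counts: dict) -> dict:
    """Return each entry's share of the column total, in percent (2 decimals)."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}


shares = column_shares(patients)
for name, pct in shares.items():
    print(f"{name}: {patients[name]} ({pct:.2f})")
```

The same function can be applied to the CT Scans and Nodules columns by swapping in their counts.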
### Dataset References
* LNDbv4: [https://zenodo.org/records/8348419](https://zenodo.org/records/8348419)
* NSCLC-Radiomics: [https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/](https://www.cancerimagingarchive.net/collection/nsclc-radiogenomics/)
* LIDC-IDRI: [https://ieee-dataport.org/documents/lung-image-database-consortium-image-collection-lidc-idri](https://ieee-dataport.org/documents/lung-image-database-consortium-image-collection-lidc-idri)
* DLCS24: [https://zenodo.org/records/13799069](https://zenodo.org/records/13799069)
* Intgmultiomics: [M. Zhao et al., Nat. Commun. (2025)](https://www.nature.com/articles/s41467-024-55594-z#citeas)
* LUNA25: [https://luna25.grand-challenge.org/](https://luna25.grand-challenge.org/)
# AI-Generated CT Evaluations
### Fréchet Inception Distance (FID) Results
Fréchet Inception Distance (FID) of the **MAISI-V2** baseline and **NoMAISI** models, using the test splits of multiple public clinical datasets as references (lower is better).
| **FID (Avg.)** | **LNDbv4** | **NSCLC-R** | **LIDC-IDRI** | **DLCS-24** | **Intgmultiomics** | **LUNA-25** |
|-------------------|------------|-------------|---------------|-------------|--------------------|-------------|
| **Real** LNDbv4 | - | 5.13 | 1.49 | 1.05 | 2.40 | 1.98 |
| **Real** NSCLC-R | 5.13 | - | 3.12 | 3.66 | 1.56 | 2.65 |
| **Real** LIDC-IDRI | 1.49 | 3.12 | - | 0.79 | 1.44 | 0.75 |
| **Real** DLCS-24 | 1.05 | 3.66 | 0.79 | - | 1.56 | 1.00 |
| **Real** Intgmultiomics | 2.40 | 1.56 | 1.44 | 1.56 | - | 1.57 |
| **Real** LUNA-25 | 1.98 | 2.65 | 0.75 | 1.00 | 1.57 | - |
| **AI-Generated** MAISI-V2 | 3.15 | 5.21 | 2.70 | 2.32 | 2.82 | 1.69 |
| **AI-Generated** NoMAISI (ours) | 2.99 | 3.05 | 2.31 | 2.27 | 2.62 | 1.18 |
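
For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to feature embeddings of the two image sets. The formula below is the standard definition rather than something stated in this card; the feature extractor and the way slices or volumes were sampled for the averages above are not specified here.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
$$

Here $\mu_r, \Sigma_r$ are the mean and covariance of the reference-set features, and $\mu_g, \Sigma_g$ are those of the set being compared (another real dataset or an AI-generated one).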
### FID Parity Plot
**Comparison of Fréchet Inception Distance (FID) between real-real and AI-generated CT datasets.** Each point represents a clinical dataset (**LNDbv4, NSCLC-R, LIDC-IDRI, DLCS24, Intgmultiomics, LUNA25**) under different generative models (**MAISI-V2, NoMAISI**). The x-axis shows the **median FID** computed between real datasets, while the y-axis shows the **FID of AI-generated data** compared to real.
The dashed diagonal line denotes **parity (y = x)**, where AI-generated fidelity would match real-real fidelity.
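
The coordinates of such a parity plot can be approximated from the FID table above. The sketch below assumes the x-value for each dataset is the median of its FID against the other five real datasets; that reading of "median FID" is an assumption, and the variable names are illustrative.

```python
from statistics import median

datasets = ["LNDbv4", "NSCLC-R", "LIDC-IDRI", "DLCS-24", "Intgmultiomics", "LUNA-25"]

# Real-vs-real FID values from the table above; None marks the diagonal.
real_real = [
    [None, 5.13, 1.49, 1.05, 2.40, 1.98],
    [5.13, None, 3.12, 3.66, 1.56, 2.65],
    [1.49, 3.12, None, 0.79, 1.44, 0.75],
    [1.05, 3.66, 0.79, None, 1.56, 1.00],
    [2.40, 1.56, 1.44, 1.56, None, 1.57],
    [1.98, 2.65, 0.75, 1.00, 1.57, None],
]

# FID of each generated set against each real dataset (last two table rows).
generated = {
    "MAISI-V2": [3.15, 5.21, 2.70, 2.32, 2.82, 1.69],
    "NoMAISI": [2.99, 3.05, 2.31, 2.27, 2.62, 1.18],
}

# x-coordinate: median FID of each dataset against the other real datasets.
real_median = {
    name: median(row[j] for row in real_real if row[j] is not None)
    for j, name in enumerate(datasets)
}

# y - x: how far each generated set sits above (+) or below (-) the parity line.
gap_to_parity = {
    model: {name: round(fid - real_median[name], 2) for name, fid in zip(datasets, fids)}
    for model, fids in generated.items()
}

for model, gaps in gap_to_parity.items():
    print(model, gaps)
```

Under this reading, a negative gap places a point below the dashed parity line and a positive gap places it above.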
### Example Results
**Comparison of CT generation from anatomical masks.**
- **Left:** Input organ/body segmentation mask.
- **Middle:** Generated CT slice using **MAISI-V2**.
- **Right:** Generated CT slice using **NoMAISI (ours)**.
- **Yellow boxes** highlight lung nodule regions for comparison.
# Inference Guide
1. [Project Structure](#project-structure)
2. [Configuration Files](#configuration-files)
### Model Weights
Model weights are available upon request; please email the authors.
### Project Structure
```
NoMAISI/
├── configs/                                   # Configuration files
│   ├── config_maisi3d-rflow.json              # Main model configuration
│   ├── infr_env_NoMAISI_DLCSD24_demo.json     # Environment settings
│   └── infr_config_NoMAISI_controlnet.json    # ControlNet inference config
├── scripts/                                   # Python inference scripts
│   ├── infer_testV2_controlnet.py             # Main inference script
│   ├── infer_controlnet.py                    # ControlNet inference
│   └── utils.py                               # Utility functions
├── models/                                    # Pre-trained model weights
├── data/                                      # Input data directory
├── outputs/                                   # Generated results
├── logs/                                      # Execution logs
└── inference.sub                              # SLURM job script
```
## Configuration Files
#### 1. Main Model Configuration (`config_maisi3d-rflow.json`): Controls the core diffusion model parameters:
- Model architecture settings; Sampling parameters; Image dimensions and spacing
#### 2. Environment Configuration (`infr_env_NoMAISI_DLCSD24_demo.json`): Defines runtime environment
- Data paths and directories; GPU settings; Memory allocation
#### 3. ControlNet Configuration (`infr_config_NoMAISI_controlnet.json`): ControlNet-specific settings
- Conditioning parameters; Generation controls; Output specifications
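
Before launching a long job, it can help to confirm that all three configuration files exist and parse as JSON. The sketch below is a generic pre-flight check, not part of the NoMAISI scripts; `load_json_configs` and the role names in `CONFIG_PATHS` are hypothetical, and it assumes each file holds a JSON object. No key names are assumed, since they depend on the actual files.

```python
import json
from pathlib import Path

# File names taken from the project structure; the role labels are illustrative.
CONFIG_PATHS = {
    "model": "./configs/config_maisi3d-rflow.json",
    "environment": "./configs/infr_env_NoMAISI_DLCSD24_demo.json",
    "controlnet": "./configs/infr_config_NoMAISI_controlnet.json",
}


def load_json_configs(paths: dict) -> dict:
    """Load each JSON config and fail early with a readable message."""
    loaded = {}
    for role, path in paths.items():
        file_path = Path(path)
        if not file_path.is_file():
            raise FileNotFoundError(f"{role} config not found: {file_path}")
        with file_path.open("r", encoding="utf-8") as handle:
            loaded[role] = json.load(handle)
    return loaded
```

Calling `load_json_configs(CONFIG_PATHS)` from the repository root returns one parsed dictionary per role, or raises before any GPU time is spent.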
## Running Inference
```bash
cd /path/NoMAISI/
# Create logs directory if it doesn't exist
mkdir -p logs
# Submit job to SLURM
sbatch inference.sub
```
```bash
# Run inference directly
cd /path/NoMAISI/
python -m scripts.infer_testV2_controlnet \
-c ./configs/config_maisi3d-rflow.json \
-e ./configs/infr_env_NoMAISI_DLCSD24_demo.json \
-t ./configs/infr_config_NoMAISI_controlnet.json
```
# Downstream Tasks
* **Cancer vs. No-Cancer Classification**
* **Nodule Detection**
* **Nodule Segmentation**

---
## Downstream Task: Cancer vs. No-Cancer Classification
**Shown:** AUC vs. the **percentage of clinical data retained** (x-axis: **100%**, **50%**, **20%**, **10%**).
**Curves** (additive augmentation: we **add** AI-generated nodules and never replace clinical samples):
- **Clinical (LUNA25)**: baseline using only the retained clinical data.
- **Clinical + AI-gen. (n%)**: at each point, add AI-generated data equal to the **same percentage as the retained clinical fraction**.
  *Examples:* at **50% clinical → +50% AI-gen**; **20% → +20%**; **10% → +10%**.
- **Clinical + AI-gen. (100%)**: at each point, add AI-generated data equal to **100% of the full clinical dataset size**, regardless of the retained fraction.
  *Example:* at **10% clinical → +100% AI-gen**.
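
The three mixing schemes can be made concrete with a little arithmetic. The sketch below assumes that both augmentation amounts are sized relative to the full clinical dataset, which is one reading of the description above; the function name and the dataset size of 1,000 are hypothetical.

```python
def augmentation_counts(full_clinical_size: int, retained_fraction: float) -> dict:
    """Sample counts for the three curves at one x-axis point.

    Assumes both augmentation schemes are sized relative to the FULL clinical
    dataset, which is one reading of the description above.
    """
    if not 0.0 < retained_fraction <= 1.0:
        raise ValueError("retained_fraction must be in (0, 1]")
    clinical = round(full_clinical_size * retained_fraction)
    return {
        "clinical_only": {"clinical": clinical, "ai_generated": 0},
        "clinical_plus_matched": {"clinical": clinical, "ai_generated": clinical},
        "clinical_plus_full": {"clinical": clinical, "ai_generated": full_clinical_size},
    }


# Hypothetical dataset of 1,000 clinical nodules (not a figure from this card).
for fraction in (1.0, 0.5, 0.2, 0.1):
    print(fraction, augmentation_counts(1000, fraction))
```

In every scheme the clinical samples are kept as they are; only the amount of added AI-generated data changes.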
**Takeaways**
- **AI-generated nodules improve data efficiency:** at **low clinical fractions (50% to 10%)**, *Clinical + AI-gen. (n%)* typically **matches or exceeds** clinical-only AUC.
- **Larger synthetic additions (100%)** can help in some regimes but may underperform the matched *n%* mix depending on the cohort; **ratio-balanced augmentation** is often the safer choice.
- Trends **generalize to external cohorts**, indicating **usability** beyond the development data.
---
# Acknowledgements
We gratefully acknowledge the open-source projects that directly informed this repository: the [MAISI tutorial](https://github.com/Project-MONAI/tutorials/tree/main/generation/maisi) from the Project MONAI tutorials, the broader [Project MONAI ecosystem](https://github.com/Project-MONAI), our related benchmark repo [AI in Lung Health: Benchmarking](https://github.com/fitushar/AI-in-Lung-Health-Benchmarking-Detection-and-Diagnostic-Models-Across-Multiple-CT-Scan-Datasets), and our companion toolkits [PiNS: Point-driven Nodule Segmentation](https://github.com/fitushar/PiNS) and [CaNA: Context-Aware Nodule Augmentation](https://github.com/fitushar/CaNA). We thank these communities and contributors for their exceptional open-source efforts. If you use our models or code, please also consider citing these works (alongside this repository) to acknowledge their contributions.
# References
* [1] [MAISI-V2; Guo, Pengfei, et al. (2025)](https://arxiv.org/abs/2508.05772)
* [2] [AI in Lung Health: Benchmarking; Tushar et al. (2024)](https://arxiv.org/abs/2405.04605)
* [3] [SYN-LUNGS; Tushar et al. (2025)](https://arxiv.org/abs/2502.21187)