---
license: cc-by-sa-4.0
tags:
- nnunet
- nnunetv2
- medical-imaging
- segmentation
- 3d-segmentation
- ct
- lung
- lung-cancer
- tumor-segmentation
library_name: nnunetv2
pipeline_tag: image-segmentation
datasets:
- MSD-Task06-Lung
language:
- en
---
# CLN-Segmenter: MSD Task06 Lung Tumor Segmentation (fold 0)
A 3D U-Net (nnU-Net v2 `3d_fullres`) trained on the **Medical Segmentation Decathlon Task06: Lung Tumor** dataset, fold 0 of 5-fold cross-validation. Released as part of the CLN-Segmenter project at the Rasool Lab, Moffitt Cancer Center.
This is a single-fold pretraining checkpoint, intended as a starting point for downstream lung-lesion segmentation work. It is not a clinical-grade tool.
## Quick stats
| | |
|--|--|
| **Architecture** | nnU-Net v2 `3d_fullres` (PlainConvUNet, 6 stages, features `[32, 64, 128, 256, 320, 320]`) |
| **Training data** | MSD Task06 Lung: 63 cases (50 train / 13 val for fold 0) |
| **Loss** | Dice + Cross-Entropy (nnU-Net default), `batch_dice=True` |
| **Schedule** | 1000 epochs, polynomial LR decay 0.01 → 0, batch size 2, patch `[80, 192, 160]` |
| **Hardware** | 1× NVIDIA H100 80GB, ~6h wall-time |
| **Mean Validation Dice** (per-case, sliding-window) | **0.7161** |
| **Best EMA Pseudo Dice** (in-training proxy) | 0.8155 (epoch ~755) |
| **Foreground IoU** (per-case avg) | ~0.59 (from `validation_summary.json`) |
| **Comparison** | Within published nnU-Net Task06 range (0.69–0.78 across various reports) |
## Files in this repo
| File | Role |
|------|------|
| `checkpoint_best.pth` | Model weights β€” saved at the EMA Pseudo Dice peak (~epoch 755), *before* the late-epoch overfitting plateau |
| `nnUNetPlans.json` | Architecture spec + preprocessing plans. **Required** for inference. |
| `dataset.json` | Channel names, label names, file ending (nnU-Net v2 schema). **Required** for inference. |
| `dataset_fingerprint.json` | HU intensity stats from training data |
| `splits_final.json` | Train/val case ID splits for fold 0 (reproducibility) |
| `progress.png` | Training curves: loss, Pseudo Dice, epoch duration, learning rate |
| `validation_summary.json` | Per-case validation results (Dice, IoU, TP/FP/FN counts) for the 13 fold-0 validation cases |
## Training data and provenance
This model was trained **only on the publicly available MSD Task06 Lung dataset** (Antonelli et al. 2022, *Nature Communications*, CC-BY-SA 4.0). It contains expert pixel-level lung tumor annotations from 63 diagnostic CT scans.
**No patient-identifiable or institutional data was used.** This checkpoint contains no information derived from any non-public source.
## Intended use
- **Pretrained starting point** for finetuning on related lung-lesion segmentation tasks (smaller datasets, domain shift, etc.)
- **Reference baseline** for comparison against published Task06 numbers
- **Input to ensembling** with other folds (when 5-fold runs are available)
## How NOT to use it
- ❌ Not validated for clinical diagnosis or treatment decisions
- ❌ Not validated on low-dose screening CT (LDCT) β€” see Limitations
- ❌ Single fold, not an ensemble β€” paper-grade results require all 5 folds
- ❌ Not validated outside the MSD Task06 case distribution
## How to use
### 1. Download the checkpoint and metadata
```python
from huggingface_hub import snapshot_download
local_dir = snapshot_download(repo_id="Lab-Rasool/CLN-Segmenter-MSD-fold0")
print("Files at:", local_dir)
```
### 2. Set up an nnU-Net inference directory
nnU-Net expects a specific directory structure for results:
```
nnUNet_results/
└── Dataset502_MSDLung/
    └── nnUNetTrainer__nnUNetPlans__3d_fullres/
        ├── dataset.json
        ├── plans.json                (rename from nnUNetPlans.json)
        ├── dataset_fingerprint.json
        └── fold_0/
            ├── checkpoint_best.pth
            └── splits_final.json
```
You can build this with:
```bash
# LOCAL_DIR is the path printed by the Python download step (snapshot_download's return value)
LOCAL_DIR=/path/to/downloaded/snapshot
DST=/path/to/nnUNet_results/Dataset502_MSDLung/nnUNetTrainer__nnUNetPlans__3d_fullres
mkdir -p "$DST/fold_0"
cp "$LOCAL_DIR/dataset.json"             "$DST/dataset.json"
cp "$LOCAL_DIR/nnUNetPlans.json"         "$DST/plans.json"   # renamed on copy
cp "$LOCAL_DIR/dataset_fingerprint.json" "$DST/dataset_fingerprint.json"
cp "$LOCAL_DIR/checkpoint_best.pth"      "$DST/fold_0/checkpoint_best.pth"
cp "$LOCAL_DIR/splits_final.json"        "$DST/fold_0/splits_final.json"
```
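If you would rather stay in Python, the same layout can be built directly from the `local_dir` value returned by `snapshot_download`. The following is a minimal sketch, not part of nnU-Net: `build_results_tree`, `FILE_LAYOUT`, and `results_root` are hypothetical names introduced here for illustration.

```python
import shutil
from pathlib import Path

# Repo file name -> destination relative to the trainer directory,
# mirroring the directory tree shown above.
FILE_LAYOUT = {
    "dataset.json": "dataset.json",
    "nnUNetPlans.json": "plans.json",  # renamed on copy
    "dataset_fingerprint.json": "dataset_fingerprint.json",
    "checkpoint_best.pth": "fold_0/checkpoint_best.pth",
    "splits_final.json": "fold_0/splits_final.json",
}


def build_results_tree(local_dir, results_root):
    """Copy the downloaded files into the nnU-Net results layout (hypothetical helper)."""
    trainer_dir = (
        Path(results_root)
        / "Dataset502_MSDLung"
        / "nnUNetTrainer__nnUNetPlans__3d_fullres"
    )
    for src_name, rel_dst in FILE_LAYOUT.items():
        dst = trainer_dir / rel_dst
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(Path(local_dir) / src_name, dst)
    return trainer_dir
```

Calling `build_results_tree(local_dir, "/path/to/nnUNet_results")` should leave you with the tree above; point the `nnUNet_results` environment variable at the same root before running inference.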
### 3. Run inference with nnU-Net
```bash
export nnUNet_results=/path/to/nnUNet_results
nnUNetv2_predict \
-i /path/to/your/input_images \
-o /path/to/output_predictions \
-d 502 \
-c 3d_fullres \
-tr nnUNetTrainer \
-p nnUNetPlans \
-f 0 \
-chk checkpoint_best.pth
```
Input images should be CT volumes named with the nnU-Net channel suffix: `<case_id>_0000.nii.gz`.
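If your CT volumes are not yet named this way, a small staging step can copy them into an input folder with the channel suffix added. This is a sketch under the assumption that each source file is a single-channel `.nii.gz` volume whose file name is the case ID; `nnunet_input_name` and `stage_inputs` are hypothetical helpers, not nnU-Net functions.

```python
import shutil
from pathlib import Path

SUFFIX = ".nii.gz"


def nnunet_input_name(case_id, channel=0):
    """Build the file name for one channel of one case, e.g. <case_id>_0000.nii.gz."""
    return f"{case_id}_{channel:04d}{SUFFIX}"


def stage_inputs(src_dir, dst_dir):
    """Copy every *.nii.gz from src_dir into dst_dir under its channel-0 name."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in sorted(Path(src_dir).glob(f"*{SUFFIX}")):
        case_id = src.name[: -len(SUFFIX)]
        if case_id.endswith("_0000"):
            target = dst / src.name  # already carries the channel suffix
        else:
            target = dst / nnunet_input_name(case_id)
        shutil.copy2(src, target)
        staged.append(target.name)
    return staged
```

The returned list of staged file names is handy for checking that every case you expect made it into the `-i` folder.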
## Training procedure
- **Framework**: nnU-Net v2.7.0 (default trainer)
- **Preprocessing**: CT-specific normalization (HU clipping at the 0.5/99.5 percentiles of foreground voxels, then z-scoring with the foreground mean and standard deviation computed over the training set), resampling to target spacing `[1.245, 0.785, 0.785]` mm
- **Augmentation**: nnU-Net's default 3D augmentation pipeline (rotation, scaling, gamma, mirroring, gaussian noise/blur, low-resolution simulation)
- **Optimization**: SGD + Nesterov momentum (β=0.99), polynomial LR decay (initial LR 0.01)
- **Iterations**: fixed 250 per epoch (nnU-Net default; independent of dataset size)
- **Best-checkpoint mechanism**: nnU-Net automatically tracks an exponential moving average (EMA) of the validation Pseudo Dice and saves `checkpoint_best.pth` at the peak
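To make the schedule concrete, the sketch below evaluates a polynomial decay from 0.01 to 0 over 1000 epochs and tallies the total number of training iterations. The exponent of 0.9 is an assumption (it is understood to be nnU-Net's default but is not stated in this card), and `poly_lr` is a hypothetical helper rather than nnU-Net's own scheduler class.

```python
MAX_EPOCHS = 1000
ITERATIONS_PER_EPOCH = 250
BATCH_SIZE = 2


def poly_lr(epoch, max_epochs=MAX_EPOCHS, initial_lr=0.01, exponent=0.9):
    """Polynomial decay: initial_lr * (1 - epoch / max_epochs) ** exponent.

    exponent=0.9 is an assumed default, not a value taken from this card.
    """
    return initial_lr * (1 - epoch / max_epochs) ** exponent


# Fixed iterations per epoch means total work does not depend on dataset size.
total_iterations = ITERATIONS_PER_EPOCH * MAX_EPOCHS
patches_seen = total_iterations * BATCH_SIZE

if __name__ == "__main__":
    for epoch in (0, 250, 500, 755, 999):
        print(f"epoch {epoch:4d}: lr = {poly_lr(epoch):.5f}")
    print("total iterations:", total_iterations)
    print("patches seen:", patches_seen)
```

Under these assumptions the run performs 250,000 optimizer steps over 500,000 sampled patches, and the learning rate has already fallen well below its initial value by the time the best checkpoint is saved around epoch 755.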
## Evaluation
Two complementary Dice metrics are reported, both computed on the 13 fold-0 validation cases:
| Metric | Value | What it measures |
|--------|-------|------------------|
| **Mean Validation Dice** (per-case, sliding-window) | **0.7161** | Per-case Dice from full-volume `nnUNetv2_predict` inference on each of the 13 val cases, averaged. **Case-weighted**: every scan counts equally regardless of tumor size. *This is the metric most papers report.* |
| **Best EMA Pseudo Dice** (in-training) | 0.8155 | Voxel-pooled Dice across validation patches during training. **Voxel-weighted**: large tumors dominate. Used by nnU-Net to select `checkpoint_best.pth`. |
| Pseudo Dice raw (jagged) range | 0.50–0.85 | Raw per-epoch readings during training, before EMA smoothing. |
| Final-epoch train loss | -0.85 | Mild late-stage overfitting visible in `progress.png`. |
| Final-epoch val loss | -0.75 | `checkpoint_best.pth` predates this. |
The ~0.10 gap between Pseudo Dice (0.8155) and Mean Validation Dice (0.7161) is **smaller than for varied-lesion-size datasets** like NLSTseg or Dataset500 (~0.15 gap there). MSD Task06's tumors are uniformly large (median volume 5.22 cm³), so voxel-pooled and per-case Dice are reasonably close. The smaller a dataset's lesions and the wider the size distribution, the bigger the Pseudo–Mean gap.
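The difference between the two weightings is easiest to see on a toy example. The sketch below compares the per-case mean with the voxel-pooled Dice for two cases, using Dice = 2·TP / (2·TP + FP + FN). The voxel counts are made up for illustration and are not taken from `validation_summary.json`.

```python
def dice(tp, fp, fn):
    """Dice coefficient from voxel counts; NaN when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else float("nan")


# Hypothetical (TP, FP, FN) voxel counts, one tuple per case.
cases = [
    (9000, 500, 500),  # large tumor, well segmented
    (50, 50, 50),      # small tumor, poorly segmented
]

# Case-weighted: compute Dice per case, then average (Mean Validation Dice style).
per_case_mean = sum(dice(*c) for c in cases) / len(cases)

# Voxel-weighted: sum the counts over cases first, then compute one Dice (Pseudo Dice style).
pooled = dice(*(sum(col) for col in zip(*cases)))

if __name__ == "__main__":
    print(f"per-case mean Dice: {per_case_mean:.4f}")
    print(f"voxel-pooled Dice:  {pooled:.4f}")
```

In this toy setting the pooled value is dominated by the large tumor and sits far above the per-case mean, which is the same direction as the gap reported in the table.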
The training plot (`progress.png`) shows a smooth Pseudo Dice climb from 0 → 0.7 in the first ~50 epochs and slow refinement to 0.81 by epoch ~750, then mild overfitting (train loss continues to drop, val loss plateaus). nnU-Net's best-checkpoint mechanism preserves the pre-overfit weights, and those are the weights in this repo.
For comparisons against other methods, **cite the Mean Validation Dice (0.7161)**. Pseudo Dice is useful as an in-training monitoring signal but not for cross-method comparison.
Per-case validation results are available in `validation_summary.json` (Dice, IoU, TP/FP/FN counts per case).
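For a single case, Dice and IoU are tied together by IoU = Dice / (2 - Dice). The sketch below applies that identity to the mean Dice as a rough sanity check. Because the mapping is convex, the average of per-case IoUs can only be equal to or higher than the IoU computed from the mean Dice, so the ~0.59 per-case average reported above is consistent with the roughly 0.558 obtained here. The helper names are hypothetical.

```python
def dice_to_iou(d):
    """Convert a Dice score to the equivalent IoU (Jaccard) for the same case."""
    return d / (2 - d)


def iou_to_dice(j):
    """Inverse mapping: IoU back to Dice."""
    return 2 * j / (1 + j)


if __name__ == "__main__":
    mean_dice = 0.7161  # Mean Validation Dice from this card
    print(f"IoU implied by the mean Dice: {dice_to_iou(mean_dice):.4f}")
```

This is only a consistency check on the summary numbers; the per-case values in `validation_summary.json` remain the reference.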
## Limitations
- **Single fold of 5-fold CV**: not an ensemble. Published-grade numbers require all 5 folds, either averaged or ensembled at inference.
- **Trained on diagnostic CT only**: performance on low-dose screening CT (LDCT) is unknown and likely lower without finetuning.
- **Small training set**: 50 cases. The model showed mild late-stage overfitting consistent with this scale; the best checkpoint is from before that point, but generalization is bounded by data size.
- **MSD Task06 case distribution**: annotations focus on primary lung tumors (median volume ~5.2 cm³). Performance on small nodules (e.g. <5 mm) or non-tumor lung lesions is not characterized.
- **No clinical validation**: this is a research artifact, not a medical device.
## License
**CC-BY-SA 4.0**, inherited from the share-alike clause of the MSD Task06 source dataset license.
## Citation
If you use this model, please cite:
```bibtex
@article{isensee2021nnunet,
title = {nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation},
author = {Isensee, Fabian and Jaeger, Paul F and Kohl, Simon A A and Petersen, Jens and Maier-Hein, Klaus H},
journal = {Nature Methods},
volume = {18},
number = {2},
pages = {203--211},
year = {2021}
}
@article{antonelli2022medical,
title = {The Medical Segmentation Decathlon},
author = {Antonelli, Michela and Reinke, Annika and Bakas, Spyridon and others},
journal = {Nature Communications},
volume = {13},
number = {1},
pages = {4128},
year = {2022}
}
```
## Project context
Part of **CLN-Segmenter** at the Rasool Lab, Moffitt Cancer Center: a two-stage approach for lung lesion segmentation that pretrains on public datasets (this is one component) and finetunes on internal data with domain-specific loss formulations.
- **Code**: https://github.com/lab-rasool/CLN-Segmenter
- **Lab**: https://huggingface.co/Lab-Rasool
Other models in this series:
- `Lab-Rasool/CLN-Segmenter-NLSTseg-fold0`: single-dataset NLSTseg POC (LDCT, 605 expert cases)
- `Lab-Rasool/CLN-Segmenter-Dataset500-fold0`: unified MSD + NLSTseg pretrain (planned)