# Vesuvius Challenge – nnU-Net Surface Detection Model
A 3D segmentation model for detecting papyrus sheet surfaces in micro-CT scans of the Herculaneum scrolls, trained for the Vesuvius Challenge Surface Detection competition on Kaggle.
## Model Description
This model uses nnU-Net v2, the self-configuring deep learning framework for biomedical image segmentation. nnU-Net automatically determines the optimal network architecture, preprocessing, and training strategy from the dataset properties.
The model segments 3D micro-CT volumes into three classes:
- Background (0) – empty space / non-papyrus
- Surface (1) – the outer surface of the papyrus sheet
- Interior (2) – the interior of the papyrus sheet
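For reference, the label IDs above can be captured in code; `surface_mask` is an illustrative helper for working with predictions, not part of the released model:

```python
import numpy as np

# Label convention used by this model (matches dataset.json)
LABELS = {
    "background": 0,  # empty space / non-papyrus
    "surface": 1,     # outer surface of the papyrus sheet
    "interior": 2,    # interior of the papyrus sheet
}

def surface_mask(pred: np.ndarray) -> np.ndarray:
    """Binary mask of the surface class from an integer prediction volume."""
    return pred == LABELS["surface"]
```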
## Architecture
| Component | Value |
|---|---|
| Network | PlainConvUNet (3D) |
| Stages | 6 |
| Features per stage | [32, 64, 128, 256, 320, 320] |
| Conv op | Conv3d (3×3×3 kernels) |
| Normalization | InstanceNorm3d |
| Activation | LeakyReLU (inplace) |
| Deep supervision | Yes |
| Patch size | 128 × 128 × 128 |
| Batch size | 2 |
| Parameters | ~31M |
## Training Details
| Detail | Value |
|---|---|
| Dataset | Dataset011_Vesuvius (786 training volumes) |
| Epochs | 200 |
| Iterations/epoch | 250 (train), 50 (val) |
| Optimizer | SGD (lr=0.01, momentum=0.99, nesterov=True, weight_decay=3e-5) |
| LR schedule | PolyLR (200 epochs) |
| Loss | Dice + Cross-Entropy with Deep Supervision |
| Normalization | CT Normalization (dataset-level statistics) |
| Foreground oversampling | 33% |
| Mixed precision | Yes (AMP with GradScaler) |
| torch.compile | Yes |
| Hardware | NVIDIA A10G (AWS g5.xlarge) |
| Training time | ~11.5 hours |
| Framework | nnU-Net v2, PyTorch 2.10.0+cu128, CUDNN 9.1 |
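The PolyLR schedule in the table decays the learning rate polynomially to zero over the 200 epochs. A minimal sketch, assuming nnU-Net's usual exponent of 0.9 (`poly_lr` is our name for illustration, not an nnU-Net API):

```python
def poly_lr(initial_lr: float, epoch: int, max_epochs: int, exponent: float = 0.9) -> float:
    """Polynomial learning-rate decay: starts at initial_lr, reaches 0 at max_epochs."""
    return initial_lr * (1 - epoch / max_epochs) ** exponent

# With initial_lr=0.01 and 200 epochs, the LR decays monotonically from 0.01 to 0.
```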
## Training Metrics
| Metric | Value |
|---|---|
| Best EMA pseudo Dice | 0.4676 (epoch 194) |
| Best val loss | 0.3518 (epoch 187) |
| Final train loss | 0.4127 |
| Final val loss | 0.4274 |
## Kaggle Leaderboard
| Submission | Post-Processing | Public LB | Private LB |
|---|---|---|---|
| V2 pipeline (best) | 1st-place post-proc | 0.433 | 0.449 |
> **Note:** The Kaggle submission used our V2 pipeline model (a custom UNet3D), not this nnU-Net model directly. This nnU-Net model is intended for future ensemble submissions and as a standalone alternative.
## Dataset
The Vesuvius Challenge Surface Detection dataset consists of 3D micro-CT scans of carbonized papyrus scrolls from Herculaneum, buried by the eruption of Mount Vesuvius in 79 AD.
- 704 training + 82 validation volumes (786 total)
- Volume shape: 320 × 314 × 314 voxels
- Voxel spacing: 7.91 μm isotropic
- File format: 3D TIFF
- Intensity range: 0–255 (CT values)
- Validation split: scroll ID 26002
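The 3D TIFF volumes can be read with a library such as `tifffile`. As a sketch, this round-trips a tiny synthetic array (the file name and dimensions here are illustrative; real volumes are uint8 arrays of shape 320 × 314 × 314):

```python
import numpy as np
import tifffile

# Write and re-read a small synthetic volume to show the expected 3D TIFF layout
vol = np.random.randint(0, 256, size=(8, 16, 16), dtype=np.uint8)
tifffile.imwrite("demo_volume.tif", vol)
loaded = tifffile.imread("demo_volume.tif")
```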
### Intensity Statistics (foreground)
| Stat | Value |
|---|---|
| Mean | 87.5 |
| Median | 81.0 |
| Std | 47.7 |
| Min/Max | 0 / 255 |
| 0.5th/99.5th percentile | 0 / 212 |
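nnU-Net's CT normalization clips intensities to the 0.5th/99.5th foreground percentiles, then z-scores with the foreground mean and std. A sketch plugging in the statistics from the table above (`ct_normalize` is our name for it, not an nnU-Net function):

```python
import numpy as np

# Dataset-fingerprint statistics from the table above
P005, P995 = 0.0, 212.0   # 0.5th / 99.5th foreground percentiles
MEAN, STD = 87.5, 47.7    # foreground mean / std

def ct_normalize(volume: np.ndarray) -> np.ndarray:
    """Clip to the foreground percentile range, then z-score normalize."""
    clipped = np.clip(volume.astype(np.float32), P005, P995)
    return (clipped - MEAN) / STD

out = ct_normalize(np.array([0, 87.5, 255], dtype=np.float32))
# 255 is clipped to 212 before normalization; 87.5 maps to exactly 0.
```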
## How to Use

### Prerequisites

```bash
pip install nnunetv2 torch
```
### Inference with nnU-Net

```python
import os
from huggingface_hub import snapshot_download

# Download the model files from the Hugging Face Hub
model_dir = snapshot_download("bshepp/vesuvius-nnunet-model")

# Point nnU-Net at the downloaded results folder
os.environ["nnUNet_results"] = model_dir
```

Then run prediction with the nnU-Net CLI:

```bash
nnUNetv2_predict -i INPUT_FOLDER -o OUTPUT_FOLDER -d 011 -c 3d_fullres -tr nnUNetTrainer_200epochs -f all
```
### Loading the checkpoint directly

```python
import torch

checkpoint = torch.load("checkpoint_best.pth", map_location="cpu")
# The checkpoint dict contains: network_weights, optimizer_state, epoch, etc.
```
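Because training used `torch.compile`, the saved weight keys may carry the `_orig_mod.` prefix that compilation adds to parameter names. A small helper (ours, not nnU-Net's) strips it so the state dict loads into an uncompiled network:

```python
def strip_compile_prefix(state_dict: dict) -> dict:
    """Remove the '_orig_mod.' prefix torch.compile adds to parameter names."""
    return {k.removeprefix("_orig_mod."): v for k, v in state_dict.items()}

# Usage (paths illustrative; the network must be built from plans.json first):
# ckpt = torch.load("fold_all/checkpoint_best.pth", map_location="cpu")
# network.load_state_dict(strip_compile_prefix(ckpt["network_weights"]))
```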
## Files

| File | Description | Size |
|---|---|---|
| `fold_all/checkpoint_best.pth` | Best model weights (by EMA pseudo Dice) | 238 MB |
| `fold_all/checkpoint_final.pth` | Final-epoch (200) weights | 238 MB |
| `plans.json` | nnU-Net experiment plan (architecture, preprocessing) | 20 KB |
| `dataset.json` | Dataset configuration (channels, labels) | <1 KB |
| `dataset_fingerprint.json` | Dataset statistics | 110 KB |
| `fold_all/training_log_*.txt` | Full training log (200 epochs) | 80 KB |
| `fold_all/progress.png` | Training curves plot | 880 KB |
| `fold_all/debug.json` | Debug/config snapshot | – |
## Post-Processing
For best results, apply the 1st-place post-processing pipeline after inference:
- Binary closing with spherical footprint (radius 2)
- Height-map patching with bilinear interpolation
- LUT-based 1-voxel hole plugging (6-connectivity)
- Global `binary_fill_holes`

See the 1st-place writeup for details.
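A partial sketch of the first and last steps (binary closing with a radius-2 spherical footprint, then a global hole fill) using SciPy; height-map patching and the LUT-based 1-voxel hole plugging are omitted, and `postprocess` is our illustrative name:

```python
import numpy as np
from scipy import ndimage

def spherical_footprint(radius: int) -> np.ndarray:
    """Boolean ball-shaped structuring element of the given radius."""
    z, y, x = np.mgrid[-radius:radius + 1, -radius:radius + 1, -radius:radius + 1]
    return z**2 + y**2 + x**2 <= radius**2

def postprocess(mask: np.ndarray, radius: int = 2) -> np.ndarray:
    """Binary closing with a spherical footprint, then global hole filling."""
    closed = ndimage.binary_closing(mask.astype(bool), structure=spherical_footprint(radius))
    return ndimage.binary_fill_holes(closed)
```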
## Citation
If you use this model, please cite nnU-Net:
```bibtex
@article{isensee2021nnu,
  title={nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation},
  author={Isensee, Fabian and Jaeger, Paul F and Kohl, Simon AA and Petersen, Jens and Maier-Hein, Klaus H},
  journal={Nature Methods},
  volume={18},
  number={2},
  pages={203--211},
  year={2021},
  publisher={Nature Publishing Group}
}
```
## License
MIT
## Author
Brian Sheppard (@bshepp)