OpenCrack nnU-Net (pavement crack segmentation)

A supervised nnU-Net v2 (2D) model for pixel-level pavement-crack segmentation, trained on the OpenCrack consolidated benchmark. On the held-out OpenCrack test cohort it reaches 0.539 IoU and 0.641 clIoU(τ=4), ahead of three released crack-segmentation baselines (CrackSAM-adapter, OmniCrack30k, Hybrid-Segmentor) with non-overlapping 95% confidence intervals.

Benchmark and full results: https://github.com/fadeevla/OpenCrack

Intended use

In-domain: pixel segmentation of cracks in close-up pavement, concrete, masonry, facade, and bridge-deck imagery (the OpenCrack regime).
Out of scope: wide-angle road scenes with vegetation, sky, and vehicles in frame. The model does not transfer to that regime (see Limitations); use it on close-up surface imagery.
Target application is offline periodic infrastructure inspection, where there is no tight latency budget.

How to use

This is a standard nnU-Net v2 model. The released checkpoint is the 500-epoch one (checkpoint_ep0500.pth); the included nnUNetPlans.json and dataset.json are the planner-generated configuration. Install nnunetv2, place the checkpoint and the configuration in an nnU-Net results folder, and run the predictor:

pip install nnunetv2

# nnU-Net locates trained models through the nnUNet_results environment variable
export nnUNet_results=/path/to/nnUNet_results

# arrange files as $nnUNet_results/Dataset501_OpenCrack/nnUNetTrainer__nnUNetPlans__2d/fold_0/checkpoint_ep0500.pth
# plus dataset.json and nnUNetPlans.json (included with the weights)

nnUNetv2_predict \
  -i /path/to/input_images \
  -o /path/to/predictions \
  -d 501 -c 2d -f 0 \
  -chk checkpoint_ep0500.pth
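
The folder layout described in the comments above can also be prepared from Python. The following is a minimal sketch and not part of the release: the function name `arrange_model_folder` and its arguments are hypothetical. It also assumes that the predictor reads the plans from a file named `plans.json` in the trainer folder, which is how nnU-Net v2 names the copy it writes during training, so the included `nnUNetPlans.json` is copied under both names. Check this against your installed nnunetv2 version.

```python
import shutil
from pathlib import Path


def arrange_model_folder(weights_dir, results_root):
    """Copy the released files into the layout used by -d 501 -c 2d -f 0.

    weights_dir  : folder holding the downloaded checkpoint and JSON files
    results_root : folder that the nnUNet_results variable will point to
    (both argument names are hypothetical, chosen for this sketch)
    """
    weights_dir = Path(weights_dir)
    trainer_dir = (Path(results_root) / "Dataset501_OpenCrack"
                   / "nnUNetTrainer__nnUNetPlans__2d")
    fold_dir = trainer_dir / "fold_0"
    fold_dir.mkdir(parents=True, exist_ok=True)

    # The checkpoint lives inside the fold folder.
    shutil.copy2(weights_dir / "checkpoint_ep0500.pth",
                 fold_dir / "checkpoint_ep0500.pth")

    # The configuration files live one level up, next to fold_0.
    shutil.copy2(weights_dir / "dataset.json", trainer_dir / "dataset.json")
    shutil.copy2(weights_dir / "nnUNetPlans.json",
                 trainer_dir / "nnUNetPlans.json")

    # Assumption: the predictor looks for "plans.json" in the trainer folder,
    # so a second copy is stored under that name.
    shutil.copy2(weights_dir / "nnUNetPlans.json", trainer_dir / "plans.json")
    return trainer_dir


# Example call (paths are placeholders):
# arrange_model_folder("/path/to/downloaded_weights", "/path/to/nnUNet_results")
```

Point the nnUNet_results environment variable at the same `results_root` before calling nnUNetv2_predict.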

Inputs are RGB surface images; outputs are binary crack masks at input resolution. FP16 inference is verified IoU-identical to FP32. The model runs on an 8 GB consumer GPU at about 1.45 images/sec at 2048×2048, or about 21 images/sec on 600-pixel tiles.
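
To compare the two throughput figures on a common scale, they can be converted to megapixels per second. This is a back-of-the-envelope sketch: it assumes the "600-pixel tiles" are square 600×600 crops, and the survey size of 10,000 images is a hypothetical example, not a figure from this card.

```python
def megapixels_per_second(images_per_second, side_px):
    """Pixel throughput for square images with the given side length."""
    return images_per_second * side_px * side_px / 1e6


def hours_for(n_images, images_per_second):
    """Wall-clock hours needed to process n_images at a steady rate."""
    return n_images / images_per_second / 3600.0


# Figures quoted in the card.
full_frame_mp_s = megapixels_per_second(1.45, 2048)  # 2048x2048 frames
tile_mp_s = megapixels_per_second(21.0, 600)         # assumed 600x600 tiles

# Hypothetical survey of 10,000 full-resolution frames.
survey_hours = hours_for(10_000, 1.45)

print(f"full frames: {full_frame_mp_s:.2f} MP/s")
print(f"tiles:       {tile_mp_s:.2f} MP/s")
print(f"10,000 frames at 1.45 images/sec: {survey_hours:.1f} h")
```

Under these assumptions both modes process pixels at a similar rate (roughly 6 to 8 MP/s), and a survey of that size takes about two hours, which fits the offline-inspection use case.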

Training data

Trained on the class-1 (crack) training split of OpenCrack, a consolidation of 32 public crack datasets (28 COCO dataset_source labels) into one COCO benchmark with a four-class taxonomy, DINOv2 cross-split deduplication (cosine τ = 0.979), and a stratified 70/15/15 split on source, crack presence, and crack-width band. The deduplication removes the train/test image leakage that inflates scores on the large composite crack datasets. See the benchmark page for the full source roster and citations.
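
The cross-split deduplication step can be pictured as a thresholded cosine-similarity search over image embeddings. The sketch below is illustrative only and is not the benchmark's code: the function name is hypothetical, the embeddings are toy 3-D vectors rather than DINOv2 features, and the card does not say which side of a duplicate pair is dropped. Only the threshold τ = 0.979 comes from the text above.

```python
import numpy as np


def cross_split_duplicates(train_emb, test_emb, tau=0.979):
    """Return (train_idx, test_idx) pairs with cosine similarity >= tau."""
    # L2-normalise rows so that a dot product equals cosine similarity.
    train = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    test = test_emb / np.linalg.norm(test_emb, axis=1, keepdims=True)
    sim = train @ test.T  # shape: (n_train, n_test)
    rows, cols = np.nonzero(sim >= tau)
    return list(zip(rows.tolist(), cols.tolist()))


# Toy embeddings standing in for DINOv2 features (assumption).
train_emb = np.array([[1.0, 0.0, 0.0],
                      [0.0, 1.0, 0.0]])
test_emb = np.array([[0.99, 0.05, 0.0],   # near-duplicate of train image 0
                     [0.0, 0.0, 1.0]])    # unrelated image

pairs = cross_split_duplicates(train_emb, test_emb)
print(pairs)  # [(0, 0)]
```

Any pair returned this way is a candidate train/test leak; the benchmark page is the reference for how such pairs are actually resolved.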

Training procedure

Self-configuring nnU-Net v2, trained from random initialisation; the configuration is the one the nnU-Net planner derives from the dataset fingerprint:

Architecture: plain convolutional U-Net (PlainConvUNet), seven stages, feature widths [32, 64, 128, 256, 512, 512, 512], InstanceNorm + LeakyReLU. ≈92.5 M parameters (encoder ≈28.3 M, decoder ≈64.2 M; counted from the released checkpoint).
Input: 256×256 patches, per-image z-normalisation, nnU-Net default augmentation.
Optimisation: SGD with nnU-Net's default schedule; nnUNet_compile=f.
Budget: a fixed training budget of about 500 epochs, single fold (fold 0), no early stopping and no multi-fold ensembling. By contrast, the released OmniCrack30k uses nnU-Net's default schedule (a five-fold ensemble at the default 1,000 epochs), so this model has the smaller training budget of the two. The released checkpoint is the one that produces the numbers below on the OpenCrack test set.
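
The per-image z-normalisation named in the Input item can be sketched as follows. This is an illustration of the idea, not nnU-Net's own preprocessing code: the function name, the (C, H, W) array layout, and the `eps` guard value are assumptions made for this example.

```python
import numpy as np


def z_normalise_per_image(image, eps=1e-8):
    """Scale each channel of one (C, H, W) image to zero mean, unit variance.

    Statistics are computed from this image alone, not from the dataset.
    """
    image = image.astype(np.float32)
    mean = image.mean(axis=(1, 2), keepdims=True)
    std = image.std(axis=(1, 2), keepdims=True)
    # eps avoids division by zero on perfectly flat channels.
    return (image - mean) / np.maximum(std, eps)


# One random RGB patch at the training patch size.
rng = np.random.default_rng(0)
patch = rng.integers(0, 256, size=(3, 256, 256)).astype(np.uint8)
normalised = z_normalise_per_image(patch)
print(normalised.mean(axis=(1, 2)), normalised.std(axis=(1, 2)))
```

Because the statistics come from each image, exposure differences between source datasets are partly removed before the network sees the patch.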

Evaluation

All models are scored on the same OpenCrack held-out stratified test cohort (6,910 crack-pixel positives) with one harness and 1,000-resample bootstrap 95% CIs. Image-weighted (micro) means:

Model	IoU	Dice	Boundary F1	NSD	clIoU(τ=4)
OpenCrack nnU-Net (this model)	0.539	0.664	0.608	0.700	0.641
CrackSAM-adapter	0.470	0.586	0.523	0.631	0.606
OmniCrack30k	0.426	0.532	0.504	0.587	0.557
Hybrid-Segmentor	0.402	0.493	0.463	0.537	0.468
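
For readers who want to reproduce this style of scoring on their own masks, here is a minimal sketch of per-image IoU and Dice plus a percentile bootstrap over images. It is not the OpenCrack harness. The function names are hypothetical, and three choices are assumptions: a score of 1.0 when both masks are empty, aggregation as the mean of per-image scores, and the percentile method for the interval. Only the 1,000 resamples and the 95% level come from the text above.

```python
import numpy as np


def iou_and_dice(pred, target):
    """IoU and Dice for one pair of binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    # Assumption: two empty masks count as a perfect match.
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return float(iou), float(dice)


def bootstrap_ci(per_image_scores, n_resamples=1000, level=0.95, seed=0):
    """Mean score with a percentile-bootstrap confidence interval."""
    scores = np.asarray(per_image_scores, dtype=float)
    rng = np.random.default_rng(seed)
    # Resample images with replacement, one row per bootstrap replicate.
    idx = rng.integers(0, len(scores), size=(n_resamples, len(scores)))
    means = scores[idx].mean(axis=1)
    tail = 100.0 * (1.0 - level) / 2.0
    low, high = np.percentile(means, [tail, 100.0 - tail])
    return float(scores.mean()), float(low), float(high)


# Toy example: prediction covers the crack pixel plus one extra pixel.
print(iou_and_dice([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
print(bootstrap_ci([0.2, 0.4, 0.6, 0.8] * 25))
```

Two models are then separated in the sense used by this card when their intervals do not overlap.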

The ranking is identical across all five metrics. Against OmniCrack30k (the same nnU-Net v2 self-configuring pipeline trained on a prior composite, as a five-fold ensemble at nnU-Net's default 1,000-epoch schedule), our single-fold 500-epoch model still leads by +0.11 IoU, so the gap reflects the value of the OpenCrack training data within the same automated method, not architecture or training budget.
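
The two numerical claims in this paragraph can be checked directly from the table. The snippet below simply re-enters the table values; the variable names are chosen for this example.

```python
metrics = ("IoU", "Dice", "Boundary F1", "NSD", "clIoU(tau=4)")

# Values copied from the evaluation table above.
results = {
    "OpenCrack nnU-Net": (0.539, 0.664, 0.608, 0.700, 0.641),
    "CrackSAM-adapter":  (0.470, 0.586, 0.523, 0.631, 0.606),
    "OmniCrack30k":      (0.426, 0.532, 0.504, 0.587, 0.557),
    "Hybrid-Segmentor":  (0.402, 0.493, 0.463, 0.537, 0.468),
}

# Rank the models separately on each metric, best first.
rankings = [
    sorted(results, key=lambda model: results[model][i], reverse=True)
    for i in range(len(metrics))
]
same_ranking = all(r == rankings[0] for r in rankings)

# IoU lead of this model over OmniCrack30k.
iou_gap = results["OpenCrack nnU-Net"][0] - results["OmniCrack30k"][0]

print(rankings[0])
print("same ranking on all metrics:", same_ranking)
print(f"IoU gap vs OmniCrack30k: {iou_gap:+.3f}")
```

The ordering is the same on every metric, and the IoU gap of 0.113 rounds to the +0.11 quoted above.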

Limitations

Regime-dependent, not universal. On the out-of-domain PaveSafe wide-scene test the model falls to 0.142 IoU; every crack-trained model collapses into a 0.09–0.14 band there. Accuracy is a property of the imaging regime, not a fixed ranking. Do not deploy on wide road scenes without retraining on that regime.
Thin-crack scoring. IoU is unfair to one-pixel-wide cracks (a perfect prediction can score ≈0.48). clIoU(τ=4) is reported alongside for that reason.
False positives on crack-like structure. The model over-fires on crack-like texture (coverage-fixable with hard-negative enrichment) and on a narrow set of crack-like joints (masonry bonds, mortar lines) that resists enrichment. False positives on plain construction surfaces (e.g. SDNET concrete) stay low.
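
The thin-crack scoring point can be illustrated with a toy example. This sketch does not implement clIoU and does not reproduce the ≈0.48 figure; it only shows how harshly plain IoU treats small geometric differences on a one-pixel-wide structure. The 32×32 grid and the synthetic crack are assumptions made for this example.

```python
import numpy as np


def iou(pred, target):
    """Plain IoU between two boolean masks with a non-empty union."""
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(inter / union)


# Ground truth: a one-pixel-wide vertical crack.
target = np.zeros((32, 32), dtype=bool)
target[:, 16] = True

# Prediction A: the same crack, shifted one pixel to the right.
shifted = np.zeros_like(target)
shifted[:, 17] = True

# Prediction B: correct location, but two pixels wide.
thick = np.zeros_like(target)
thick[:, 16:18] = True

iou_shifted = iou(shifted, target)  # no overlapping pixels at all
iou_thick = iou(thick, target)      # overlap is half of the union

print(iou_shifted, iou_thick)  # 0.0 0.5
```

Both predictions would look nearly perfect to an inspector, yet they score 0.0 and 0.5, which is why a tolerance-aware metric is reported alongside IoU.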

Citation

@misc{fadeev2026opencrack,
  author       = {Fadeev, V. A.},
  title        = {OpenCrack: A Consolidated, Leakage-Controlled Benchmark for Pavement Crack Segmentation},
  year         = {2026},
  howpublished = {\url{https://github.com/fadeevla/OpenCrack}}
}

Built on nnU-Net (Isensee et al., Nature Methods, 2021, DOI 10.1038/s41592-020-01008-z). If you use this model, please also credit the OpenCrack source datasets you rely on (listed on the benchmark page).

License

Model weights and this card are released under CC-BY-4.0. This does not relicense the source images or annotations used to build OpenCrack; each source dataset remains under its own license.
