OpenCrack nnU-Net (pavement crack segmentation)
A supervised nnU-Net v2 (2D) model for pixel-level pavement-crack segmentation, trained on the OpenCrack consolidated benchmark. On the held-out OpenCrack test cohort it reaches 0.539 IoU and 0.641 clIoU(τ=4), leading three released crack-segmentation baselines (CrackSAM-adapter, OmniCrack30k, Hybrid-Segmentor) by confidence-interval-disjoint margins.
Benchmark and full results: https://github.com/fadeevla/OpenCrack
Intended use
- In-domain: pixel segmentation of cracks in close-up pavement, concrete, masonry, facade, and bridge-deck imagery (the OpenCrack regime).
- Out of scope: wide-angle road scenes with vegetation, sky, and vehicles in frame. The model does not transfer to that regime (see Limitations); use it on close-up surface imagery.
- Target application is offline periodic infrastructure inspection, where there is no tight latency budget.
How to use
This is a standard nnU-Net v2 model. The released checkpoint is the 500-epoch one (checkpoint_ep0500.pth); the included
nnUNetPlans.json and dataset.json are the planner-generated configuration. Install nnunetv2, place the checkpoint and the
configuration in an nnU-Net results folder, and run the predictor:
pip install nnunetv2
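# nnUNetv2_predict looks up -d/-c/-f inside the folder named by the nnUNet_results environment variable
export nnUNet_results=/path/to/nnUNet_results
# arrange files as $nnUNet_results/Dataset501_OpenCrack/nnUNetTrainer__nnUNetPlans__2d/fold_0/checkpoint_ep0500.pth
# plus dataset.json and nnUNetPlans.json (included with the weights)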
nnUNetv2_predict \
-i /path/to/input_images \
-o /path/to/predictions \
-d 501 -c 2d -f 0 \
-chk checkpoint_ep0500.pth
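The file arrangement described in the comments above can be scripted. The following is a minimal sketch, not part of the release: `arrange_release` is a hypothetical helper, and placing the two JSON files in the trainer folder next to `fold_0` is an assumption about where the predictor expects them.

```python
import shutil
from pathlib import Path


def arrange_release(release_dir, results_root):
    """Copy the released files into the folder layout quoted in the usage comments.

    release_dir: folder holding the downloaded checkpoint and JSON files.
    results_root: the nnU-Net results folder (the value of nnUNet_results).
    """
    release_dir, results_root = Path(release_dir), Path(results_root)
    trainer_dir = results_root / "Dataset501_OpenCrack" / "nnUNetTrainer__nnUNetPlans__2d"
    fold_dir = trainer_dir / "fold_0"
    fold_dir.mkdir(parents=True, exist_ok=True)

    # The checkpoint goes into the fold folder, as in the documented path.
    shutil.copy2(release_dir / "checkpoint_ep0500.pth", fold_dir / "checkpoint_ep0500.pth")

    # Assumption: the configuration files sit one level up, beside fold_0.
    for name in ("dataset.json", "nnUNetPlans.json"):
        shutil.copy2(release_dir / name, trainer_dir / name)
    return trainer_dir
```

Calling `arrange_release("/path/to/download", "/path/to/nnUNet_results")` would produce the tree that `-d 501 -c 2d -f 0` refers to. Some nnU-Net v2 versions may look for the plans file under the name `plans.json`; if the predictor reports a missing plans file, copying `nnUNetPlans.json` to that name in the same folder is worth trying. Treat this as something to check against the installed version.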
Inputs are RGB surface images; outputs are binary crack masks at input resolution. FP16 inference is verified IoU-identical to FP32. The model runs on an 8 GB consumer GPU at about 1.45 images/sec at 2048×2048, or about 21 images/sec on 600-pixel tiles.
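For planning an offline inspection run, the two throughput figures above can be converted to a common unit. This is a rough sketch that uses only the quoted rates; the 10,000-image batch is a hypothetical workload, and the tiles are assumed to be 600×600.

```python
def megapixels_per_second(images_per_second, width, height):
    """Convert an images/sec rate at a given resolution to megapixels/sec."""
    return images_per_second * width * height / 1e6


def batch_minutes(n_images, images_per_second):
    """Wall-clock minutes to process n_images at a steady rate."""
    return n_images / images_per_second / 60.0


full_frame = megapixels_per_second(1.45, 2048, 2048)  # quoted full-frame rate
tiles = megapixels_per_second(21.0, 600, 600)         # quoted tile rate, square tiles assumed

print(f"{full_frame:.2f} MP/s full frame, {tiles:.2f} MP/s tiled")
# 6.08 MP/s full frame, 7.56 MP/s tiled
print(f"{batch_minutes(10_000, 1.45):.0f} min for 10,000 full frames")
# 115 min for 10,000 full frames
```

On these numbers the two modes process a similar pixel rate, so the choice between them is mostly a question of how the source imagery is stored.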
Training data
Trained on the class-1 (crack) training split of OpenCrack, a consolidation of 32 public crack
datasets (28 COCO dataset_source labels) into one COCO benchmark with a four-class taxonomy,
DINOv2 cross-split deduplication (cosine τ = 0.979), and a stratified 70/15/15 split on source,
crack presence, and crack-width band. The deduplication removes the train/test image leakage that
inflates scores on the large composite crack datasets. See the benchmark page for the full source
roster and citations.
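The cross-split deduplication can be pictured as a threshold on embedding similarity. The sketch below is an illustration under stated assumptions, not the benchmark's code: embeddings are plain lists of floats (standing in for DINOv2 features), a pair counts as a duplicate when its cosine similarity is at least τ, and the function only flags test-side items without deciding which copy to drop.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cross_split_duplicates(train_emb, test_emb, tau=0.979):
    """Indices of test embeddings whose similarity to any train embedding is >= tau."""
    flagged = []
    for j, t in enumerate(test_emb):
        if any(cosine(t, r) >= tau for r in train_emb):
            flagged.append(j)
    return flagged


# Toy 3-D embeddings: the first test vector is nearly parallel to a train vector.
train = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
test = [[0.99, 0.05, 0.0], [0.5, 0.5, 0.7]]
print(cross_split_duplicates(train, test))  # [0]
```

A real run over many images would use a vectorised or approximate nearest-neighbour search; the nested loop here is only meant to make the criterion explicit.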
Training procedure
Self-configuring nnU-Net v2, trained from random initialisation; the configuration is the one the nnU-Net planner derives from the dataset fingerprint:
- Architecture: plain convolutional U-Net (PlainConvUNet), seven stages, feature widths [32, 64, 128, 256, 512, 512, 512], InstanceNorm + LeakyReLU. ≈92.5 M parameters (encoder ≈28.3 M, decoder ≈64.2 M; counted from the released checkpoint).
- Input: 256×256 patches, per-image z-normalisation, nnU-Net default augmentation.
- Optimisation: SGD with nnU-Net's default schedule; nnUNet_compile=f.
- Budget: a fixed training budget of about 500 epochs, single fold (fold 0), no early stopping and no multi-fold ensembling. By contrast, the released OmniCrack30k uses nnU-Net's default schedule (a five-fold ensemble at the default 1,000 epochs), so this model has the smaller training budget. The released checkpoint is the one that scores the numbers below on the OpenCrack test set.
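The per-image z-normalisation named in the input settings subtracts each image's own mean and divides by its own standard deviation. A minimal sketch follows, assuming the statistics are computed per channel and that a small floor on the standard deviation guards against constant channels; both are assumptions, and nnU-Net's own preprocessing may differ in detail.

```python
import math


def z_normalise(channel, eps=1e-8):
    """Scale one channel (a flat list of pixel values) to zero mean, unit variance."""
    n = len(channel)
    mean = sum(channel) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in channel) / n)  # population std
    return [(x - mean) / max(std, eps) for x in channel]


def z_normalise_image(image):
    """Normalise each channel of one image independently (statistics are per image)."""
    return [z_normalise(channel) for channel in image]


# Toy 2x2 RGB image, one flat list per channel; the blue channel is constant.
image = [[0.0, 128.0, 255.0, 128.0], [10.0, 20.0, 30.0, 40.0], [7.0, 7.0, 7.0, 7.0]]
normalised = z_normalise_image(image)
print([round(v, 3) for v in normalised[1]])
```

Because the statistics come from the image itself, no dataset-wide mean or standard deviation has to be shipped with the weights.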
Evaluation
All models scored on the same OpenCrack held-out stratified test cohort (6,910 crack-pixel positives), one harness, 1,000-resample bootstrap 95% CIs. Image-weighted (micro) means:
| Model | IoU | Dice | Boundary F1 | NSD | clIoU(τ=4) |
|---|---|---|---|---|---|
| OpenCrack nnU-Net (this model) | 0.539 | 0.664 | 0.608 | 0.700 | 0.641 |
| CrackSAM-adapter | 0.470 | 0.586 | 0.523 | 0.631 | 0.606 |
| OmniCrack30k | 0.426 | 0.532 | 0.504 | 0.587 | 0.557 |
| Hybrid-Segmentor | 0.402 | 0.493 | 0.463 | 0.537 | 0.468 |
The ranking is identical across all five metrics. Against OmniCrack30k (the same nnU-Net v2 self-configuring pipeline trained on a prior composite, as a five-fold ensemble at nnU-Net's default 1,000-epoch schedule), our single-fold 500-epoch model still leads by +0.11 IoU, so the gap reflects the value of the OpenCrack training data within the same automated method, not architecture or training budget.
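The 1,000-resample bootstrap intervals and the "disjoint" comparison can be sketched as follows. This is an illustration, not the benchmark harness: it assumes a percentile bootstrap over per-image scores with each image weighted equally, and the score lists are synthetic placeholders rather than results.

```python
import random


def bootstrap_ci(per_image_scores, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-image scores."""
    rng = random.Random(seed)
    n = len(per_image_scores)
    means = []
    for _ in range(n_resamples):
        # Resample images with replacement and record the resampled mean.
        sample = [per_image_scores[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi


def disjoint(ci_a, ci_b):
    """True when two (lo, hi) intervals do not overlap."""
    return ci_a[1] < ci_b[0] or ci_b[1] < ci_a[0]


# Synthetic per-image IoU values for two hypothetical models.
model_a = [0.50 + 0.001 * i for i in range(100)]
model_b = [0.30 + 0.001 * i for i in range(100)]
ci_a, ci_b = bootstrap_ci(model_a), bootstrap_ci(model_b)
print(ci_a, ci_b, disjoint(ci_a, ci_b))
```

Fixing the seed keeps the intervals reproducible between runs, which matters when several models are compared against each other.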
Limitations
- Regime-dependent, not universal. On the out-of-domain PaveSafe wide-scene test the model falls to 0.142 IoU; every crack-trained model collapses into a 0.09–0.14 band there. Accuracy is a property of the imaging regime, not a fixed ranking. Do not deploy on wide road scenes without retraining on that regime.
- Thin-crack scoring. IoU is unfair to one-pixel-wide cracks (a perfect prediction can score ≈0.48). clIoU(τ=4) is reported alongside for that reason.
- False positives on crack-like structure. The model over-fires on crack-like texture (coverage-fixable with hard-negative enrichment) and on a narrow set of crack-like joints (masonry bonds, mortar lines) that resists enrichment. Plain construction surfaces (e.g. SDNET concrete) stay low.
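The thin-crack point in the list above can be made concrete with a toy example. The sketch below compares strict pixel IoU with a tolerance-based score on a one-pixel-wide line. Note that `tolerant_iou` is a hypothetical stand-in written for this illustration (a pixel counts as matched if the other mask has a pixel within τ in both axes); it is not the benchmark's clIoU definition.

```python
def iou(pred, gt):
    """Strict IoU between two sets of (x, y) pixel coordinates."""
    union = pred | gt
    return len(pred & gt) / len(union) if union else 1.0


def tolerant_iou(pred, gt, tau=4):
    """Illustrative tolerance-based score; NOT the benchmark's clIoU."""
    def near(p, others):
        return any(abs(p[0] - q[0]) <= tau and abs(p[1] - q[1]) <= tau for q in others)

    tp_pred = sum(near(p, gt) for p in pred)  # predicted pixels close to the annotation
    tp_gt = sum(near(g, pred) for g in gt)    # annotated pixels close to the prediction
    fp = len(pred) - tp_pred
    fn = len(gt) - tp_gt
    denom = tp_gt + fp + fn
    return tp_gt / denom if denom else 1.0


gt = {(x, 10) for x in range(50)}       # one-pixel-wide annotated crack
shifted = {(x, 11) for x in range(50)}  # same crack predicted one pixel lower
thick = gt | shifted                    # same crack predicted two pixels wide

print(iou(shifted, gt))           # 0.0
print(iou(thick, gt))             # 0.5
print(tolerant_iou(shifted, gt))  # 1.0
print(tolerant_iou(thick, gt))    # 1.0
```

A one-pixel offset or one extra pixel of width is visually the same crack, yet strict IoU drops to 0.0 or 0.5, while a tolerance-based score treats both predictions as correct.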
Citation
@misc{fadeev2026opencrack,
author = {Fadeev, V. A.},
title = {OpenCrack: A Consolidated, Leakage-Controlled Benchmark for Pavement Crack Segmentation},
year = {2026},
howpublished = {\url{https://github.com/fadeevla/OpenCrack}}
}
Built on nnU-Net (Isensee et al., Nature Methods, 2021, DOI 10.1038/s41592-020-01008-z). If you use this model, please also credit the OpenCrack source datasets you rely on (listed on the benchmark page).
License
Model weights and this card are released under CC-BY-4.0. This does not relicense the source images or annotations used to build OpenCrack; each source dataset remains under its own license.