- InspectNet-CX
- Native InspectNet-CX Detector (Reverse Distillation)
- Headline: PaDiM to PatchCore Ablation
- Full Ablation Table
- Cable Coreset Sensitivity
- Seed Labeling Note
- Latency
- Accuracy/Cost Tradeoff
- OpenVINO Parity (Measured This Session)
- License
- Checkpoints
- Backbone and Hyperparameters
- Verification
- Caveats
- Reproduction
InspectNet-CX
InspectNet-CX is a reproducible MVTec AD anomaly-inspection harness evaluated across 4
categories (bottle, cable, capsule, leather). It ships its own native reverse-distillation
detector (the project's from-scratch model) alongside two verified reference baselines,
PaDiM and PatchCore, plus latency and ONNX/OpenVINO export-parity studies. Every number on
this card is traceable to a JSON report in reports/eval_harness/. PatchCore Lightning
checkpoints exist for all 4 categories (bottle and leather across 3 seeds each); the native
reverse-distillation detector ships as source plus eval evidence, with no trained native
checkpoint bundled. See PROVENANCE.md for the per-metric source-of-truth map.
Native InspectNet-CX Detector (Reverse Distillation)
InspectNet-CX's own from-scratch detector is a reverse-distillation model: a frozen
wide_resnet50_2 teacher, with a bottleneck plus decoder trained to reconstruct the
teacher's multi-scale features on normal images, where reconstruction failure is the anomaly
score. It is trained here, not borrowed. No trained checkpoint is bundled; the detector ships
as source plus eval evidence (reports/eval_harness/inspectnet_rd_{category}.json in the
upstream repo).
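As a rough illustration of the scoring idea, the sketch below computes a per-location distance between teacher features and the decoder's reconstruction at a single scale, then takes the largest distance as the image score. The cosine distance, the max reduction, and the function names are assumptions made for illustration; the model in the upstream repo operates on multi-scale tensors and may combine scales differently.

```python
import math

def cosine_distance(u, v):
    """1 - cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v + 1e-12)

def anomaly_map(teacher_feats, student_feats):
    """Per-location distance between teacher features and their reconstruction.

    Both inputs are H x W grids (nested lists) of channel vectors at one scale.
    """
    return [
        [cosine_distance(t, s) for t, s in zip(t_row, s_row)]
        for t_row, s_row in zip(teacher_feats, student_feats)
    ]

def image_score(amap):
    """Image-level score: the largest per-location distance."""
    return max(max(row) for row in amap)

# Toy 2x2 grid with 3-channel features (illustrative values, not real activations).
teacher = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
           [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0]]]
# The reconstruction matches everywhere except the bottom-right location.
student = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
           [[0.0, 0.0, 1.0], [1.0, -1.0, 0.0]]]

amap = anomaly_map(teacher, student)
score = image_score(amap)
```

On normal regions the reconstruction tracks the teacher and the distance stays near zero; the one mismatched location dominates the image score.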
Image-level AUROC, matched train/test, the native detector against the two reference baselines:
| category | PaDiM (ResNet-18) | PatchCore | InspectNet-CX (reverse distillation, ours) |
|---|---|---|---|
| bottle | 0.998 | 1.000 | 1.000 |
| cable | 0.872 | 0.991 | 0.885 |
| capsule | 0.881 | 0.994 | 0.901 |
| leather | 0.993 | 1.000 | 1.000 |
Honest standing: the native detector ties PatchCore on bottle and leather (both
1.000) and beats PaDiM on all four categories, but it does not beat PatchCore
overall: PatchCore still leads on cable (0.991 vs 0.885) and capsule (0.994 vs 0.901).
PaDiM and PatchCore are references, not the author's results. The native numbers are read
from reports/eval_harness/inspectnet_rd_*.json. The path to this result (a vanilla
student-teacher trails badly, a wider student-teacher backbone collapses to near-chance, and
reverse distillation closes most of the gap) is documented in
docs/native_detector_ablations.md. Fully matching PatchCore on cable and capsule is
open work.
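For readers unfamiliar with the metric, image-level AUROC can be read as the probability that a randomly chosen anomalous image receives a higher score than a randomly chosen normal one. The sketch below is a minimal pairwise implementation of that definition, not the harness's own metric code; the labels and scores are made-up toy values.

```python
def image_auroc(labels, scores):
    """Fraction of (anomalous, normal) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one normal and one anomalous image")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Toy example: 3 normal (label 0) and 3 anomalous (label 1) images.
labels = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.20, 0.60, 0.50, 0.80, 0.90]
auroc = image_auroc(labels, scores)  # 8 of 9 pairs are ordered correctly
```

An AUROC of 1.000, as reported for bottle and leather, means every anomalous test image scored above every normal one.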
Headline: PaDiM to PatchCore Ablation
PatchCore replaces the PaDiM head from the prior baseline. The decisive wins are on the categories where PaDiM had headroom:
- Cable: image AUROC 0.8720 (PaDiM) -> 0.9910 (PatchCore, coreset 0.01). Delta +0.1190.
- Capsule: image AUROC 0.8807 (PaDiM) -> 0.9944 (PatchCore, coreset 0.01). Delta +0.1137.
Both margins are large enough that single-seed measurement is sufficient at this magnitude (the gap is two orders of magnitude larger than typical PatchCore seed noise). Bottle and leather are wins at the image-AUROC ceiling: 1.0000 across 3 seeds with zero seed variance.
Full Ablation Table
| category | method | coreset | image AUROC | pixel AUROC | AUPRO@0.3 | image delta | pixel delta | AUPRO delta |
|---|---|---|---|---|---|---|---|---|
| bottle | PaDiM | n/a | 0.9976 | 0.9816 | 0.9406 | | | |
| bottle | PatchCore | 0.01 | 1.0000 (mean, n=3, std=0) | 0.9852 +/- 0.0001 (n=3) | 0.9406 +/- 0.0005 (n=3) | +0.0024 | +0.0036 | +0.0000 |
| cable | PaDiM | n/a | 0.8720 | 0.9551 | 0.8519 | | | |
| cable | PatchCore | 0.01 | 0.9910 (single seed) | 0.9834 (single seed) | 0.9281 (single seed) | +0.1190 | +0.0283 | +0.0761 |
| capsule | PaDiM | n/a | 0.8807 | 0.9849 | 0.9149 | | | |
| capsule | PatchCore | 0.01 | 0.9944 (single seed) | 0.9902 (single seed) | 0.9382 (single seed) | +0.1137 | +0.0053 | +0.0233 |
| leather | PaDiM | n/a | 0.9925 | 0.9882 | 0.9682 | | | |
| leather | PatchCore | 0.01 | 1.0000 (mean, n=3, std=0) | 0.9922 +/- 0.0001 (n=3) | 0.9752 +/- 0.0006 (n=3) | +0.0075 | +0.0040 | +0.0070 |
Cable and capsule PatchCore rows are single-seed (legacy seed-0, see Seed Labeling Note below); the 0.119 and 0.114 image-AUROC margins over PaDiM are far above plausible seed noise so the verdict is robust. Bottle and leather PatchCore rows are mean +/- pstdev across 3 seeds (seed 0 legacy unseeded plus explicit seeds 1 and 2).
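The arithmetic behind the delta columns and the mean +/- pstdev entries can be reproduced with the standard library. The sketch below uses only values stated on this card (the cable and capsule image AUROCs, and the bottle image AUROC of 1.0000 on each of the three seeds); the variable names are illustrative and the real numbers live in the JSON reports.

```python
from statistics import mean, pstdev

# Image AUROC values from the ablation table.
padim_image_auroc = {"cable": 0.8720, "capsule": 0.8807}
patchcore_image_auroc = {"cable": 0.9910, "capsule": 0.9944}

# Delta = PatchCore minus PaDiM, rounded to the table's 4 decimals.
deltas = {
    category: round(patchcore_image_auroc[category] - padim_image_auroc[category], 4)
    for category in padim_image_auroc
}

# Bottle image AUROC is exactly 1.0000 on all three seeds.
bottle_image_auroc_by_seed = [1.0000, 1.0000, 1.0000]
bottle_mean = mean(bottle_image_auroc_by_seed)
# pstdev is the population standard deviation (divides by n, not n - 1).
bottle_pstdev = pstdev(bottle_image_auroc_by_seed)
```

With n=3, the population form gives a slightly smaller spread than the sample form (`statistics.stdev`), so the two should not be mixed when comparing against other reports.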
Cable Coreset Sensitivity
| coreset | image AUROC | pixel AUROC | AUPRO@0.3 |
|---|---|---|---|
| 0.01 | 0.9910 | 0.9834 | 0.9281 |
| 0.10 | 0.9856 | 0.9848 | 0.9304 |
| 0.25 | 0.9893 | 0.9844 | 0.9280 |
A 1% coreset matches 10% and 25% on cable to within 0.006 image AUROC (and within 0.003 on both pixel metrics) in these single-seed runs, so the paper-default 1% sampling ratio is sufficient for this category.
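To make the coreset ratio concrete, the sketch below shows a generic greedy k-center (farthest-point) subsampling of a memory bank, which is the idea behind PatchCore's coreset step. It is a plain-Python illustration on random 2-D points, not Anomalib's implementation, which runs on feature tensors and may differ in details such as projecting features before selection. The function name and the toy bank are hypothetical.

```python
import math
import random

def greedy_coreset(points, ratio, seed=0):
    """Select round(ratio * N) indices by farthest-point (k-center greedy) sampling."""
    n = len(points)
    k = max(1, round(ratio * n))
    first = random.Random(seed).randrange(n)
    selected = [first]
    # Distance from every point to its nearest already-selected point.
    min_dist = [math.dist(p, points[first]) for p in points]
    while len(selected) < k:
        # Next pick: the point currently farthest from the selected set.
        nxt = max(range(n), key=min_dist.__getitem__)
        selected.append(nxt)
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected

# Toy memory bank of 1000 random 2-D "patch features".
rng = random.Random(42)
bank = [(rng.random(), rng.random()) for _ in range(1000)]
coreset_1pct = greedy_coreset(bank, 0.01)
coreset_10pct = greedy_coreset(bank, 0.10)
```

Because the selection is greedy and deterministic for a fixed seed, the 1% coreset here is a prefix of the 10% one: a larger ratio only adds points, each covering a smaller remaining gap.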
Seed Labeling Note
"Seed 0" refers to the legacy unseeded baseline run from Phase B (it predates the
--seed flag added in Phase Bx, so its RNG state is not pinned). Seeds 1 and 2
are pinned explicitly via the --seed flag in scripts/train_patchcore.py. All
three runs are reported as-is; we do not pretend they were drawn identically.
For bottle and leather, image AUROC is exactly 1.0000 across all three seeds, so the seed-0 ambiguity is moot at the image-classification level. Pixel AUROC and AUPRO show non-zero seed variance and are reported as mean +/- pstdev (n=3).
Latency
Per-category, per-device latency, measured on the same hardware in this session. All values in milliseconds, batch size 1, image size 256x256, 50 timed images, 10 warmup images (capsule CPU used 30 warmup images, see note).
CPU (AMD Ryzen 9 9900X 12-Core)
| category | min | median | p95 | mean | std | warmup |
|---|---|---|---|---|---|---|
| bottle | 28.318 | 30.155 | 31.569 | 30.178 | 1.012 | 10 |
| cable | 30.415 | 31.297 | 32.908 | 31.463 | 0.807 | 10 |
| capsule | 28.789 | 29.749 | 32.502 | 30.173 | 1.162 | 30 |
| leather | 30.974 | 32.812 | 35.191 | 32.838 | 1.178 | 10 |
The capsule CPU row is taken from patchcore_capsule_latency_rerun2.json (30 warmup
images, std 1.162 ms). The original capsule CPU run had an unstable warm-up tail
that inflated std; the rerun is the clean number to cite.
CUDA (NVIDIA GeForce RTX 5070, driver 570.211.01, 12227 MiB)
| category | min | median | p95 | mean | std | warmup |
|---|---|---|---|---|---|---|
| bottle | 5.144 | 5.202 | 6.415 | 5.508 | 0.473 | 10 |
| cable | 5.130 | 5.290 | 6.558 | 5.513 | 0.465 | 10 |
| capsule | 5.109 | 5.354 | 6.179 | 5.474 | 0.376 | 10 |
| leather | 5.171 | 5.350 | 6.365 | 5.613 | 0.468 | 10 |
Platform: Linux-6.8.0-117-generic-x86_64-with-glibc2.35, Python 3.10.12, Torch 2.11.0+cu128, Anomalib 2.4.1.
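The measurement protocol (untimed warmup, then per-image timing reduced to min, median, p95, mean, and std) can be sketched as follows. This is not the code in scripts/bench_latency.py: the nearest-rank p95, the population standard deviation, and the function names are assumptions, and the workload is a stand-in for a detector call.

```python
import statistics
import time

def summarize(samples_ms):
    """Reduce per-image latencies (ms) to the columns used in the tables above."""
    ordered = sorted(samples_ms)
    n = len(ordered)
    # Nearest-rank p95 with integer arithmetic: rank = ceil(0.95 * n).
    p95_rank = (95 * n + 99) // 100
    return {
        "min": ordered[0],
        "median": statistics.median(ordered),
        "p95": ordered[p95_rank - 1],
        "mean": statistics.fmean(ordered),
        "std": statistics.pstdev(ordered),
    }

def benchmark(fn, inputs, warmup=10):
    """Run `warmup` untimed calls, then time fn once per remaining input."""
    for x in inputs[:warmup]:
        fn(x)
    samples_ms = []
    for x in inputs[warmup:]:
        start = time.perf_counter()
        fn(x)
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return summarize(samples_ms)

# Stand-in workload; a real run would call the detector on a 256x256 image.
# 60 inputs with warmup=10 leaves 50 timed calls, matching the protocol above.
stats = benchmark(lambda x: sum(i * i for i in range(10_000)), list(range(60)), warmup=10)
```

On a GPU the device must be synchronized before each clock read, otherwise the timer only captures kernel launch overhead rather than the completed inference.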
Accuracy/Cost Tradeoff
PatchCore is more accurate than PaDiM on all 4 categories (cable +0.119 image AUROC, capsule +0.114, leather +0.0075, bottle +0.0024), but at higher inference cost than PaDiM's lighter ResNet-18 pipeline: CPU median ~30-33 ms/image and CUDA median ~5.2-5.4 ms/image. The CPU cost is dominated by the wide_resnet50_2 backbone and the memory-bank nearest-neighbor lookup. On CUDA the model is fast enough for real-time-class inspection workloads; on CPU it sits in the tens of milliseconds.
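The median latencies translate into sequential throughput as sketched below, using the per-category medians from the latency tables. This is a back-of-envelope estimate that assumes one image at a time (batch size 1) with no pipelining or I/O overhead.

```python
# Median latencies in milliseconds, copied from the latency tables.
cpu_median_ms = {"bottle": 30.155, "cable": 31.297, "capsule": 29.749, "leather": 32.812}
cuda_median_ms = {"bottle": 5.202, "cable": 5.290, "capsule": 5.354, "leather": 5.350}

def images_per_second(latency_ms):
    """Sequential throughput implied by a per-image latency at batch size 1."""
    return 1000.0 / latency_ms

for category in cpu_median_ms:
    cpu_fps = images_per_second(cpu_median_ms[category])
    cuda_fps = images_per_second(cuda_median_ms[category])
    speedup = cpu_median_ms[category] / cuda_median_ms[category]
    print(f"{category}: CPU {cpu_fps:.1f} img/s, CUDA {cuda_fps:.1f} img/s, {speedup:.1f}x")
```

Across the four categories this works out to roughly 30 to 34 images per second on CPU and roughly 187 to 192 on CUDA.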
OpenVINO Parity (Measured This Session)
Fresh PatchCore ONNX and OpenVINO exports were produced via Anomalib's
Engine.export(export_type=ExportType.ONNX|OPENVINO, ...) from each trained
Lightning checkpoint. Outputs were compared on N=20 real MVTec AD test images per
category (mix of normal + anomalous) under ONNX Runtime CPU (FP32) and OpenVINO
CPU with INFERENCE_PRECISION_HINT=f32. Inference precision hint matters: leaving
it at the CPU plugin default can silently engage bf16 on AVX-512-BF16 hosts and
break parity, which is why the f32 hint is explicit.
| category | max abs error (anomaly map) | max abs error (pred score) | pred_label flips (N=20) | pred_mask pixel flips (out of 1,310,720) | source JSON |
|---|---|---|---|---|---|
| bottle | 2.181e-05 | 6.020e-06 | 0/20 | 0 | reports/eval_harness/openvino_parity_patchcore_bottle.json |
| cable | 4.768e-06 | 3.278e-06 | 0/20 | 0 | reports/eval_harness/openvino_parity_patchcore_cable.json |
| capsule | 7.719e-06 | 4.053e-06 | 0/20 | 0 | reports/eval_harness/openvino_parity_patchcore_capsule.json |
| leather | 4.321e-06 | 7.299e-05 | 0/20 | 0 | reports/eval_harness/openvino_parity_patchcore_leather.json |
All 4 categories have status parity_clean per the JSON definition (max_abs_error <=
1e-3, zero label flips, zero mask pixel flips). ONNX Runtime 1.23.2, OpenVINO
2026.1.0-21367-63e31528c62-releases/2026/1.
This is a fresh PatchCore parity measurement. The earlier commit c3594fc covered
PaDiM only and does not transfer to PatchCore by extrapolation; this measurement
replaces that.
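The parity criteria can be expressed as a small check over paired outputs from the two runtimes. The sketch below is an illustration, not the logic of scripts/validate_patchcore_export.py: the report keys, the separate image and pixel thresholds, and applying the 1e-3 bound to both the anomaly map and the score are assumptions, and the inputs are toy values.

```python
PARITY_TOLERANCE = 1e-3  # max_abs_error bound from the JSON definition

def parity_report(ref_maps, test_maps, ref_scores, test_scores,
                  image_threshold, pixel_threshold):
    """Compare one runtime's outputs against a reference runtime on the same images.

    *_maps: one flat list of anomaly-map values per image.
    *_scores: one image-level score per image.
    """
    pixel_pairs = [(a, b) for r, t in zip(ref_maps, test_maps) for a, b in zip(r, t)]
    score_pairs = list(zip(ref_scores, test_scores))
    report = {
        "max_abs_error_map": max(abs(a - b) for a, b in pixel_pairs),
        "max_abs_error_score": max(abs(a - b) for a, b in score_pairs),
        # A flip is a thresholded decision that differs between the two runtimes.
        "label_flips": sum((a > image_threshold) != (b > image_threshold)
                           for a, b in score_pairs),
        "mask_pixel_flips": sum((a > pixel_threshold) != (b > pixel_threshold)
                                for a, b in pixel_pairs),
    }
    report["parity_clean"] = (
        max(report["max_abs_error_map"], report["max_abs_error_score"]) <= PARITY_TOLERANCE
        and report["label_flips"] == 0
        and report["mask_pixel_flips"] == 0
    )
    return report

# 20 images of 256x256 pixels give the mask-pixel denominator in the table.
TOTAL_MASK_PIXELS = 20 * 256 * 256

ref_maps = [[0.10, 0.20, 0.70, 0.90], [0.05, 0.10, 0.20, 0.30]]
test_maps = [[v + 1e-5 for v in row] for row in ref_maps]
clean = parity_report(ref_maps, test_maps, [0.90, 0.30], [0.90001, 0.29999], 0.5, 0.5)
# A larger drift (for example from reduced precision) pushes a score across the threshold.
drifted = parity_report(ref_maps, test_maps, [0.90, 0.49], [0.90001, 0.51], 0.5, 0.5)
```

Counting flips alongside the numeric error matters because a small drift near a threshold can change a decision even when the absolute error looks harmless.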
License
Package and Card
The code in the InspectNet-CX package, this model card, the per-category result JSON files, and the parity reports are licensed under Apache-2.0.
MVTec AD Dataset Restriction (Important)
The trained PatchCore checkpoints were fit on the MVTec AD dataset, which is distributed under CC BY-NC-SA 4.0. That license is non-commercial.
This restriction propagates to the trained checkpoints. Even though the package code is Apache-2.0, downstream commercial use of the trained PatchCore checkpoints (or any derivative model that was fit on MVTec AD images) is not permitted under MVTec AD's terms. The dataset license overrides the package license for any artifact whose weights or memory bank were built from MVTec AD pixels.
For commercial use, retrain the per-category PatchCore detector on your own commercially licensed data using the package code; the non-commercial restriction does not apply to the resulting weights.
Checkpoints
PatchCore Lightning checkpoints used for the numbers in this card:
| category | seed | coreset | SHA256 | size MB |
|---|---|---|---|---|
| bottle | 0 (legacy unseeded) | 0.01 | b0eb8834ae8d2bece3270cd1ef003427f16e0a109cc6bbb3a85eea49e50461df | 107.6 |
| bottle | 1 | 0.01 | d89e12adc18b806e3da552d261c33b113422bf3e49068c73b2e1223816cabd12 | 107.6 |
| bottle | 2 | 0.01 | b4afc04f0af2dd70de8393754ed47f276cc2778a6ec9e2d87431e894dcedb725 | 107.6 |
| cable | 0 (legacy unseeded) | 0.01 | 29d451c6a03707c155adaf1e5bf33313531c9d8204d8722cbe8f9516aac930c2 | 108.5 |
| capsule | 0 (legacy unseeded) | 0.01 | 25454995713926187e9816613d0e76e8e9531d6ca99becdf2565e8e8ebda8feb | 108.2 |
| leather | 0 (legacy unseeded) | 0.01 | 5cf7c7a793ad441a9c6cd92ee27517c674df35720c1873e61ef7aab5ebc2bd29 | 109.8 |
| leather | 1 | 0.01 | 5af3f908dae9df60fe472718b588deeaaeb93ce7b7b8d286c9077df098375d65 | 109.8 |
| leather | 2 | 0.01 | 268b1d0819ef50353a0ed874dc84ce2f38d5fe1686978ed284f881c5532fbc0e | 109.8 |
Checkpoints are not bundled in this HF repo. They live in the upstream training
tree under artifacts/patchcore_{cat}[_seed{N}]/Patchcore/MVTecAD/{cat}/v0/weights/lightning/model.ckpt
and are reproducible from the documented training commands.
Backbone and Hyperparameters
- Backbone: wide_resnet50_2 (timm).
- Feature layers: layer2, layer3.
- Coreset sampling ratio: 0.01 (main runs), with 0.10 and 0.25 sweep on cable.
- Image size: 256x256, RGB, BILINEAR resize, divide-by-255 normalization.
- Train/test split: MVTec AD default per-category split.
Verification
See CHECKSUMS.sha256 for SHA256 of every non-README file shipped in this repo.
Verify with:
sha256sum -c CHECKSUMS.sha256
See PROVENANCE.md for the metric-to-JSON map. Every number in the YAML
model-index block and in the ablation, coreset, latency, and parity tables
points to a specific field in a specific JSON under reports/eval_harness/.
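Where sha256sum is not available, the same check can be done in Python. The sketch below assumes CHECKSUMS.sha256 uses the standard sha256sum line format (hex digest, whitespace, file name) and that names are relative to the manifest's directory; the helper names and the demo file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of_file(path, chunk_size=1 << 20):
    """Stream a file through SHA256 so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_manifest(manifest_path):
    """Return {file name: matches?} for each 'DIGEST  NAME' line of the manifest."""
    manifest_path = Path(manifest_path)
    results = {}
    for line in manifest_path.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*")  # sha256sum prefixes binary-mode names with '*'
        results[name] = sha256_of_file(manifest_path.parent / name) == expected.lower()
    return results

# Self-contained demo on a throwaway directory (not the real repo contents).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "example.json").write_bytes(b'{"image_auroc": 0.991}\n')
    good = hashlib.sha256((root / "example.json").read_bytes()).hexdigest()
    (root / "CHECKSUMS.sha256").write_text(f"{good}  example.json\n")
    results = verify_manifest(root / "CHECKSUMS.sha256")
```

The same `sha256_of_file` helper can be pointed at a locally reproduced checkpoint to compare against a published digest.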
Caveats
- Cable and capsule PatchCore rows are single-seed; the +0.119 / +0.114 image AUROC margins over PaDiM are large enough that this is acceptable, but the caveat is real.
- Seed 0 across categories is the legacy unseeded baseline run; only seeds 1 and 2 have pinned RNG state.
- MVTec AD non-commercial license (CC BY-NC-SA 4.0) propagates to the checkpoints and overrides the package Apache-2.0 for downstream commercial use.
- No Jetson, TensorRT, or edge-hardware validation has been performed. CPU latency is on an AMD Ryzen 9 9900X workstation, not on target inspection hardware.
- Pixel-level evaluation uses the standard MVTec AD pixel AUROC and AUPRO@FPR=0.3 with no additional production thresholding.
Reproduction
PaDiM and PatchCore evaluation harness, latency benchmark, and parity script are in the upstream repo:
PYTHONPATH=src python3 scripts/eval_harness.py --method patchcore --dataset mvtec_ad --category cable --coreset 0.01 --output reports/eval_harness/patchcore_cable.json
PYTHONPATH=src python3 scripts/train_patchcore.py --category leather --seed 1 --output artifacts/patchcore_leather_seed1
PYTHONPATH=src python3 scripts/bench_latency.py --checkpoint artifacts/patchcore_bottle/Patchcore/MVTecAD/bottle/v0/weights/lightning/model.ckpt --category bottle --output reports/eval_harness/patchcore_bottle_latency.json
PYTHONPATH=src python3 scripts/validate_patchcore_export.py --category bottle --checkpoint artifacts/patchcore_bottle/Patchcore/MVTecAD/bottle/v0/weights/lightning/model.ckpt --output reports/eval_harness/openvino_parity_patchcore_bottle.json
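To run the evaluation across all four categories, the command line can be generated rather than typed by hand. The sketch below only builds the command strings; it reuses the flags shown in the cable example and assumes the other categories follow the same output naming pattern, which is not confirmed here.

```python
import shlex

CATEGORIES = ["bottle", "cable", "capsule", "leather"]

def eval_command(category, coreset=0.01):
    """Argument list for the PatchCore evaluation of one category."""
    return [
        "python3", "scripts/eval_harness.py",
        "--method", "patchcore",
        "--dataset", "mvtec_ad",
        "--category", category,
        "--coreset", str(coreset),
        # Assumed naming pattern, extrapolated from the cable example.
        "--output", f"reports/eval_harness/patchcore_{category}.json",
    ]

commands = [shlex.join(eval_command(category)) for category in CATEGORIES]
for command in commands:
    print(command)

# To execute instead of print, pass the list form to subprocess.run with
# PYTHONPATH=src added to the environment, for example:
#   subprocess.run(eval_command("cable"), env={**os.environ, "PYTHONPATH": "src"}, check=True)
```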
Evaluation results
- Image AUROC (mean over 3 seeds) on MVTec AD (bottle): 1.000 (self-reported)
- Pixel AUROC (mean over 3 seeds) on MVTec AD (bottle): 0.985 (self-reported)
- AUPRO@FPR=0.3 (mean over 3 seeds) on MVTec AD (bottle): 0.941 (self-reported)
- Image AUROC (single seed, coreset 0.01) on MVTec AD (cable): 0.991 (self-reported)
- Pixel AUROC (single seed, coreset 0.01) on MVTec AD (cable): 0.983 (self-reported)
- AUPRO@FPR=0.3 (single seed, coreset 0.01) on MVTec AD (cable): 0.928 (self-reported)
- Image AUROC (single seed, coreset 0.01) on MVTec AD (capsule): 0.994 (self-reported)
- Pixel AUROC (single seed, coreset 0.01) on MVTec AD (capsule): 0.990 (self-reported)