# nms-verified
FCOS-ResNet50 trained so that the one-peak invariant proved in
CharlesCNorton/nms-verified
holds on its dense output. Under that invariant, greedy NMS is
provably equivalent to a sort + threshold filter on the score axis.
The deployment graph can drop NMS at zero AP cost.
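The equivalence can be illustrated with a toy sketch in plain Python (this is an illustration, not the Rocq development; the `iou`, `greedy_nms`, and `threshold_filter` helpers are hypothetical names): when every above-threshold detection overlaps no other above-threshold detection past the IoU threshold — one peak per object — greedy NMS keeps exactly the boxes a plain score-threshold filter keeps.

```python
def iou(a, b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if inter > 0 else 0.0

def greedy_nms(dets, iou_thresh):
    # dets: list of (score, box); keep a box iff it clears all kept boxes.
    kept = []
    for score, box in sorted(dets, key=lambda d: -d[0]):
        if all(iou(box, kb) <= iou_thresh for _, kb in kept):
            kept.append((score, box))
    return kept

def threshold_filter(dets, score_thresh):
    # Sort + threshold: no pairwise suppression at all.
    return sorted((d for d in dets if d[0] > score_thresh), key=lambda d: -d[0])

# One-peak example: two objects, one box above theta each; the duplicate
# boxes that NMS would suppress already sit below theta.
dets = [
    (0.9, (0, 0, 10, 10)),
    (0.2, (1, 1, 11, 11)),    # duplicate of object 1, below theta
    (0.8, (50, 50, 60, 60)),
    (0.1, (51, 51, 61, 61)),  # duplicate of object 2, below theta
]
theta, tau = 0.3, 0.5
nms_out = [d for d in greedy_nms(dets, tau) if d[0] > theta]
thr_out = threshold_filter(dets, theta)
assert nms_out == thr_out
```

On inputs violating one-peak (two overlapping boxes both above theta), the two paths diverge; the invariant is exactly what rules that case out.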
## Theorems
The companion Rocq library is structured into four modules
(`Core.v`, `Bridge.v`, `Probability.v`, `Pipeline.v`). The keystone
collapse theorem in `Core.v` is

```coq
Theorem nms_collapse_onepeak :
  forall D,
    NoDup D -> sorted_desc D -> one_peak D -> no_tie_clash D ->
    filter_above (nms_sorted D) = filter_above D.
```

instantiated for box-NMS, soft-NMS, mask-NMS, heatmap local-NMS, and DETR-style filtering.
The bridge from a Lipschitz score head to `Separated` is in `Bridge.v`
(`lipschitz_bridge_substantive`, `SepRespectingHead`). Architectural
discharges of the bridge precondition appear in `Pipeline.v`:

```coq
Theorem detr_equilibrium_yields_separated    (* DETR / set-prediction *)
Theorem anchor_stride_yields_separated       (* FCOS / RetinaNet / ATSS *)
```
The training-side chain from SGD to the `Separated` locus is closed end
to end in `Pipeline.v`:

```coq
Theorem squared_hinge_sgd_pl_convergence
Theorem L_separated_sq_zero_iff_separated_general
Theorem real_training_to_collapse
```

PAC generalization from a finite training sample is in `Probability.v`
via Hoeffding's lemma (`hoeffding_lemma_symmetric`,
`hoeffding_iid_finite_class_one_sided`).
All proofs are closed in Rocq 9, with no admits and no axioms beyond
Stdlib's standard `R` foundations.
## COCO val2017 mAP
| model | standard AP | NMS-free AP |
|---|---|---|
| pretrained FCOS_ResNet50_FPN_Weights.COCO_V1 | 0.391 | 0.174 |
| this checkpoint | 0.265 | 0.264 |
Without the invariant, NMS carries ~22 AP points (0.391 vs 0.174). With it, the deployment graph can drop NMS at a 0.001 AP cost.
## Recipe
Self-distillation from the pretrained teacher on COCO train2017: frozen backbone, image size 640, no spectral-norm parametrisation, and a loss combining IoU selectivity, a ratio penalty, and a dense-output mimic term.
```shell
python experiments/train_with_spectral_norm.py \
  --iters 120000 \
  --batch-size 1 --image-size 640 --lr 1e-4 \
  --peak-lambda 0.0 \
  --iou-selectivity-lambda 200.0 \
  --iou-selectivity-tau 0.5 --iou-selectivity-theta 0.3 \
  --iou-selectivity-topk 2000 \
  --iou-ratio-lambda 200.0 --iou-ratio-target 0.3 \
  --center-sampling-radius 0.5 \
  --no-spectral-norm \
  --self-distill --teacher-theta 0.5 \
  --mimic-lambda 1.0 \
  --coco-image-dir /path/to/coco/train2017 --coco-image-limit 0
```
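As a rough illustration of what a dense-output mimic term can look like (an assumption-laden sketch, not the repository's actual loss; the real terms live in `experiments/train_with_spectral_norm.py`, and the `mimic_loss` name and gating scheme here are hypothetical):

```python
import torch

def mimic_loss(student_logits, teacher_logits, teacher_theta=0.5):
    # Match student dense scores to the teacher's wherever the teacher
    # is confident; teacher_theta gates which locations contribute.
    t = teacher_logits.sigmoid().detach()  # teacher is frozen
    s = student_logits.sigmoid()
    mask = (t > teacher_theta).float()
    denom = mask.sum().clamp(min=1.0)      # avoid division by zero
    return ((s - t) ** 2 * mask).sum() / denom
```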
## Load and run
```python
import types

import torch
from safetensors.torch import load_file
from torchvision.models.detection import (
    fcos_resnet50_fpn,
    FCOS_ResNet50_FPN_Weights,
)
from torchvision.ops import boxes as box_ops

model = fcos_resnet50_fpn(weights=FCOS_ResNet50_FPN_Weights.COCO_V1)
model.load_state_dict(load_file("fcos_certified.safetensors"))
model.eval().to("cuda")

# Standard inference: torchvision's stock postprocess includes NMS.
detections = model([image_tensor])


# NMS-free inference: replace postprocess_detections with the same code
# minus the batched_nms call. Output equals the standard path under the
# one-peak invariant (Theorem nms_collapse_onepeak in the Rocq library).
def nms_free_postproc(self, head_outputs, anchors, image_shapes):
    # Per-level dense outputs: cls_logits [N, HWA, C],
    # bbox_regression [N, HWA, 4], bbox_ctrness [N, HWA, 1].
    cl = head_outputs["cls_logits"]
    br = head_outputs["bbox_regression"]
    bc = head_outputs["bbox_ctrness"]
    out = []
    for i, shape in enumerate(image_shapes):
        boxes_i, scores_i, labels_i = [], [], []
        # anchors[i] is the per-level anchor list for image i.
        for br_l, cl_l, bc_l, anc_l in zip(br, cl, bc, anchors[i]):
            num_classes = cl_l.shape[-1]
            # Geometric mean of classification and centerness scores.
            scores = torch.sqrt(cl_l[i].sigmoid() * bc_l[i].sigmoid()).flatten()
            keep = scores > self.score_thresh
            scores = scores[keep]
            topk = torch.where(keep)[0]
            k = min(topk.numel(), self.topk_candidates)
            scores, order = scores.topk(k)
            topk = topk[order]
            # Recover anchor index and class label from the flat index.
            anc_idx = torch.div(topk, num_classes, rounding_mode="floor")
            labels = topk % num_classes
            boxes = self.box_coder.decode(br_l[i][anc_idx], anc_l[anc_idx])
            boxes_i.append(box_ops.clip_boxes_to_image(boxes, shape))
            scores_i.append(scores)
            labels_i.append(labels)
        boxes_i = torch.cat(boxes_i)
        scores_i = torch.cat(scores_i)
        labels_i = torch.cat(labels_i)
        # Sort + truncate replaces batched_nms under the one-peak invariant.
        order = scores_i.argsort(descending=True)[: self.detections_per_img]
        out.append(
            {"boxes": boxes_i[order], "scores": scores_i[order], "labels": labels_i[order]}
        )
    return out


model.postprocess_detections = types.MethodType(nms_free_postproc, model)
detections_nms_free = model([image_tensor])
```
## Reproducing the eval
```shell
python experiments/coco_eval.py \
  --ckpt fcos_certified.safetensors \
  --theta 0.05 --top-k 100
```
Reports both standard and NMS-free AP on COCO val2017.
## Certification
`nms_cert.ml` is the OCaml extraction of the decidable certifier
(`Separated_check`, `Separated_dec`, `sep_certify_finite`) from the
Rocq library. `nms_cert_main.ml` is a small CLI driver: detections in,
`CERTIFIED <slack>` or `REJECTED` out. Build:

```shell
ocamlfind ocamlopt nms_cert.ml nms_cert_main.ml -o nms_cert
```

Usage:

```shell
./nms_cert TAU THETA SLACK < detections.txt
```
Each input line is whitespace-separated `score x1 y1 x2 y2` with
integer scores and integer bounding-box pixels. `CERTIFIED` means
the corresponding `Separated` predicate holds on the input list at
that slack; by `nms_collapse_onepeak`, NMS is then equivalent to a
threshold filter on the same list.
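A minimal way to produce `detections.txt` in the expected line format from Python (the scores and boxes below are made up, and the TAU/THETA/SLACK values in the commented invocation are placeholders, not recommended settings):

```python
import subprocess

# Hypothetical detections: integer score then integer x1 y1 x2 y2 per line.
dets = [
    (900, 0, 0, 10, 10),
    (800, 50, 50, 60, 60),
]
with open("detections.txt", "w") as f:
    for d in dets:
        f.write(" ".join(str(v) for v in d) + "\n")

# Feed the file to the extracted certifier (placeholder arguments):
# result = subprocess.run(
#     ["./nms_cert", "500", "300", "10"],
#     stdin=open("detections.txt"), capture_output=True, text=True,
# )
# result.stdout is "CERTIFIED <slack>" or "REJECTED"
```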
## Files
- `fcos_certified.safetensors`: model weights (FCOS-ResNet50-FPN, 319 tensors).
- `nms_cert.ml`: extracted certifier (`Separated_check`, `Separated_dec`, `sep_certify_finite`).
- `nms_cert_main.ml`: CLI driver around the extracted certifier.
- `README.md`: this card.
## Citation
```bibtex
@misc{nms_verified,
  author       = {Norton, Charles C.},
  title        = {nms-verified: a Rocq formalization of NMS collapse for dense detection},
  howpublished = {\url{https://github.com/CharlesCNorton/nms-verified}},
  note         = {Model checkpoint at \url{https://huggingface.co/phanerozoic/nms-verified}},
}
```
## License
MIT.
## Evaluation results

- AP (standard NMS) on the COCO val2017 validation set (self-reported): 0.265
- AP (NMS-free deployment) on the COCO val2017 validation set (self-reported): 0.264