DEIMv2-X Pill + Needle Detector

A fine-tuned DEIMv2-X (DINOv3 ViT-S+ backbone) model for counting pills and insulin pen needle caps in images. Designed for iOS deployment.

Model Details

Property      Value
Architecture  DEIMv2-X (50.3M params)
Backbone      DINOv3 ViT-S+ (native, not distilled)
Input size    640 x 640
Classes       pill (0), needle (1)
Framework     DEIMv2

Performance (dataset_406 validation set, 87 images)

Metric                Value
Exact match (count)   100%
Within +/-2           100%
Inference speed       24 ms/img (RTX 5090)
Confidence threshold  0.5
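
The two count metrics above can be read as the fraction of validation images whose predicted count equals the true count (exact match) or lies within 2 of it. The evaluation script is not part of this card, so the following is only a minimal sketch under that reading. The function name `count_accuracy` and the sample counts are hypothetical, and whether counts were compared per class or per image in the original evaluation is not stated.

```python
def count_accuracy(predicted, actual, tolerance=0):
    """Fraction of images whose predicted count is within `tolerance` of the true count."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one image")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


# Hypothetical per-image counts, for illustration only (not the validation data).
predicted = [12, 7, 30, 4]
actual = [12, 8, 30, 7]

exact = count_accuracy(predicted, actual)          # tolerance 0: exact match
within_two = count_accuracy(predicted, actual, 2)  # tolerance 2: within +/-2
print(exact, within_two)
```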

Files

File                           Size    Description
best_stg2.pth                  781 MB  PyTorch checkpoint (for training / fine-tuning)
deimv2_x_pill.mlpackage/       98 MB   CoreML fp16 (iOS deployment)
deimv2_x_pill_int8.mlpackage/  50 MB   CoreML int8 (iOS lightweight)
metadata.json                  -       iOS integration metadata

CoreML Output Format

  • labels (1, 300) int32: class IDs (0=pill, 1=needle)
  • boxes (1, 300, 4) float16: bounding boxes in xyxy pixel coordinates
  • scores (1, 300) float16: confidence scores
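
To draw the boxes on the original photo, the xyxy coordinates need to be mapped back from the model input. The sketch below assumes the pixel coordinates refer to the 640 x 640 input and that the image was stretched to that size without letterboxing or padding. Neither point is confirmed by this card, so check metadata.json before relying on it. `scale_boxes` is a hypothetical helper name.

```python
def scale_boxes(boxes, image_width, image_height, input_size=640):
    """Map (x1, y1, x2, y2) boxes from model-input pixels to original-image pixels.

    Assumes a plain resize to input_size x input_size (no letterbox padding).
    `boxes` is the list of boxes for one image, i.e. the model output
    with the batch dimension already indexed away.
    """
    sx = image_width / input_size
    sy = image_height / input_size
    return [(x1 * sx, y1 * sy, x2 * sx, y2 * sy) for x1, y1, x2, y2 in boxes]


# One hypothetical box covering the left half of the model input,
# mapped onto a 1280 x 960 photo.
scaled = scale_boxes([(0, 0, 320, 640)], image_width=1280, image_height=960)
print(scaled)
```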

iOS (CoreML)

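See metadata.json for input/output specifications. Apply the 0.5 confidence threshold to the scores output, then filter the surviving detections by class ID (0 = pill, 1 = needle) to get per-class counts.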
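
The thresholding and class filtering can be sketched in a few lines. The example is in Python for brevity and the same logic ports directly to Swift. It assumes the labels and scores outputs have already been converted to nested sequences of shape (1, 300), and that a score equal to the threshold is kept (the card does not say whether the comparison is inclusive). `count_detections` and `CLASS_NAMES` are hypothetical names.

```python
CLASS_NAMES = {0: "pill", 1: "needle"}


def count_detections(labels, scores, threshold=0.5):
    """Count detections per class from (1, N) label and score outputs."""
    counts = {name: 0 for name in CLASS_NAMES.values()}
    # Index 0 selects the single image in the batch.
    for label, score in zip(labels[0], scores[0]):
        if score < threshold:
            continue  # drop low-confidence queries
        name = CLASS_NAMES.get(int(label))
        if name is not None:  # ignore any unexpected class ID
            counts[name] += 1
    return counts


# Hypothetical outputs truncated to 5 of the 300 queries.
labels = [[0, 0, 1, 1, 0]]
scores = [[0.9, 0.4, 0.8, 0.5, 0.51]]
counts = count_detections(labels, scores)
print(counts)
```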

Training

  • Pretrained from COCO checkpoint (DEIMv2-X DINOv3)
  • Fine-tuned 42 epochs (30 + 12 EMA stage) on 383 training images
  • 2-class: pill (image indices < 343) + needle (image indices >= 343)
  • 50 hard negative background crops, 4x needle oversample
  • Batch size 16
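
One common way to realize a 4x oversample is to repeat each needle image four times in the per-epoch sampling list. The training code is not included here, so this is only an illustration of that idea and not necessarily how this model was trained. `build_sampling_list` is a hypothetical helper, and the rule that index 343 and above means a needle image is taken from the class split listed above.

```python
def build_sampling_list(image_ids, is_oversampled, factor=4):
    """Repeat each image selected by `is_oversampled` `factor` times; keep others once."""
    sampling_list = []
    for image_id in image_ids:
        repeats = factor if is_oversampled(image_id) else 1
        sampling_list.extend([image_id] * repeats)
    return sampling_list


# Four hypothetical image indices straddling the pill/needle boundary.
ids = [341, 342, 343, 344]
sampled = build_sampling_list(ids, lambda i: i >= 343)
print(sampled)
```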

License

Apache 2.0
