DEIMv2-X Pill + Needle Detector
A fine-tuned DEIMv2-X (DINOv3 ViT-S+ backbone) model for counting pills and insulin pen needle caps in images. Designed for iOS deployment.
Model Details
| Property | Value |
|---|---|
| Architecture | DEIMv2-X (50.3M params) |
| Backbone | DINOv3 ViT-S+ (native, not distilled) |
| Input size | 640 x 640 |
| Classes | pill (0), needle (1) |
| Framework | DEIMv2 |
Performance (dataset_406 validation set, 87 images)
| Metric | Value |
|---|---|
| Exact match (count) | 100% |
| Within +/-2 | 100% |
| Inference speed | 24 ms/img (RTX 5090) |
| Confidence threshold | 0.5 |
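The count metrics above can be read as per-image comparisons between predicted and ground-truth object counts. A minimal sketch of how such metrics might be computed, assuming one total count per image; the function name `count_accuracy` and the sample counts are illustrative and not taken from the evaluation code:

```python
def count_accuracy(predicted, actual, tolerance=0):
    """Fraction of images whose predicted count is within `tolerance` of the true count."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one image")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


# Made-up counts for four images, for illustration only.
predicted_counts = [12, 7, 30, 5]
actual_counts = [12, 7, 28, 5]

exact = count_accuracy(predicted_counts, actual_counts)        # exact match
within_2 = count_accuracy(predicted_counts, actual_counts, 2)  # within +/-2
```

With `tolerance=0` the function gives the exact-match rate; with `tolerance=2` it gives the "within +/-2" rate.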
Files
| File | Size | Description |
|---|---|---|
| best_stg2.pth | 781 MB | PyTorch checkpoint (for training / fine-tuning) |
| deimv2_x_pill.mlpackage/ | 98 MB | CoreML fp16 (iOS deployment) |
| deimv2_x_pill_int8.mlpackage/ | 50 MB | CoreML int8 (iOS lightweight) |
| metadata.json | - | iOS integration metadata |
CoreML Output Format
- labels (1, 300) int32: class IDs (0=pill, 1=needle)
- boxes (1, 300, 4) float16: bounding boxes in xyxy pixel coordinates
- scores (1, 300) float16: confidence scores
iOS (CoreML)
See metadata.json for input/output specifications. Use confidence threshold 0.5 and filter by class ID.
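The thresholding and class filtering described here can be prototyped off-device before porting the logic to Swift. A minimal sketch in Python, assuming the label and score outputs have already been converted to nested lists with the leading batch dimension intact, and that a score equal to the threshold counts as a detection; `count_detections` is a hypothetical helper, not part of this repository:

```python
CLASS_NAMES = {0: "pill", 1: "needle"}  # class IDs as listed in this model card


def count_detections(labels, scores, threshold=0.5):
    """Count detections per class whose score meets the threshold.

    labels: nested list shaped (1, N) of integer class IDs
    scores: nested list shaped (1, N) of float confidences
    """
    counts = {name: 0 for name in CLASS_NAMES.values()}
    for label, score in zip(labels[0], scores[0]):
        name = CLASS_NAMES.get(int(label))
        # Skip unknown class IDs and low-confidence query slots.
        if name is not None and score >= threshold:
            counts[name] += 1
    return counts


# Made-up example showing 4 of the 300 query slots.
labels = [[0, 0, 1, 1]]
scores = [[0.91, 0.32, 0.77, 0.50]]
counts = count_detections(labels, scores)
```

Because the model always returns 300 slots, the count comes entirely from the score filter, so the threshold choice directly affects the reported number.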
Training
- Pretrained from COCO checkpoint (DEIMv2-X DINOv3)
- Fine-tuned 42 epochs (30 + 12 EMA stage) on 383 training images
- 2-class: pill (images < 343) + needle (images >= 343)
- 50 hard negative background crops, 4x needle oversample
- Batch size 16
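The class split and needle oversampling listed above could be reproduced roughly as follows. This is a sketch under two assumptions: that "images < 343" refers to a numeric image index, and that oversampling means repeating each needle image four times in the training list. `build_training_list` is a hypothetical name, and the actual training pipeline may differ.

```python
NEEDLE_START_ID = 343  # assumed: image IDs >= 343 are needle images
NEEDLE_REPEAT = 4      # 4x needle oversample


def build_training_list(image_ids):
    """Return image IDs with each needle image repeated NEEDLE_REPEAT times."""
    training = []
    for image_id in image_ids:
        repeat = NEEDLE_REPEAT if image_id >= NEEDLE_START_ID else 1
        training.extend([image_id] * repeat)
    return training


# Two pill images and two needle images, for illustration.
sample = build_training_list([341, 342, 343, 344])
```

Repeating entries in the sample list is one simple way to oversample; a weighted sampler would achieve a similar class balance without duplicating entries.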
License
Apache 2.0