---
language: en
license: mit
tags:
- object-detection
- oriented-bounding-box
- yolo
- trading-cards
- webgpu-ready
- onnx
metrics:
- mAP50
- mAP50-95
---

# Trading Card Detector · YOLO11m OBB (v3)

## Summary

- **Architecture:** YOLO11m with oriented bounding box (OBB) head (7-value outputs: `cx, cy, w, h, conf, class, angle`)
- **Input resolution:** 1088 × 1088
- **Targets:** Single class (`trading_card`) with arbitrary rotations
- **Exports:** PyTorch checkpoint + ONNX (built-in NMS, WebGPU/WebAssembly ready)
- **Training script:** `training_app/src/training/train_yolo_obb.py`

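To make the 7-value layout concrete, here is a minimal sketch that turns one output row into four corner points. It assumes the angle is expressed in radians and that each row follows the order listed above; `obb_to_corners` and `decode_row` are hypothetical helper names, not part of the exported model or of Ultralytics.

```python
import math

def obb_to_corners(cx, cy, w, h, angle):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumption: `angle` is in radians and rotates the box about its centre.
    """
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Half-extent vectors along the box's width and height axes.
    wx, wy = (w / 2) * cos_a, (w / 2) * sin_a
    hx, hy = -(h / 2) * sin_a, (h / 2) * cos_a
    return [
        (cx - wx - hx, cy - wy - hy),
        (cx + wx - hx, cy + wy - hy),
        (cx + wx + hx, cy + wy + hy),
        (cx - wx + hx, cy - wy + hy),
    ]

def decode_row(row):
    """Split one 7-value row (cx, cy, w, h, conf, class, angle) into a dict."""
    cx, cy, w, h, conf, cls, angle = row
    return {
        "corners": obb_to_corners(cx, cy, w, h, angle),
        "conf": conf,
        "class_id": int(cls),
    }
```

If the exported model reports angles in degrees or uses a different column order, only `decode_row` needs to change.
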
## Artifacts

| File | Description |
|------|-------------|
| `weights/cardcaptor_v3_best.pt` | Full precision PyTorch checkpoint (best epoch) |
| `onnx/cardcaptor_v3_best.onnx` | Optimized ONNX export (80 MB) with built-in NMS |

## Dataset

- **Name:** `merged_dataset_50x3`
- **Composition:** ~60K synthetic renders + ~220 hand-crafted captures × deterministic 50× photometric multiplier (~11K effective hand-crafted samples)
- **Validation:** 55 hand-crafted frames (210 labeled cards) with 90% hand-crafted ratio to match real-world deployment
- **Label format:** YOLO OBB (four polygon corners per box)

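As an illustration of the label format, the sketch below parses one label line into pixel-space corners. It assumes the usual Ultralytics OBB text layout (`class x1 y1 x2 y2 x3 y3 x4 y4`, coordinates normalized to the image size); `parse_obb_label` is a hypothetical helper, not taken from the training script.

```python
def parse_obb_label(line, img_w, img_h):
    """Parse 'class x1 y1 x2 y2 x3 y3 x4 y4' (normalized) into pixel corners."""
    parts = line.split()
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}")
    class_id = int(parts[0])
    coords = [float(v) for v in parts[1:]]
    # Pair up the flat coordinate list and scale to pixels.
    corners = [
        (coords[i] * img_w, coords[i + 1] * img_h)
        for i in range(0, 8, 2)
    ]
    return class_id, corners

class_id, corners = parse_obb_label(
    "0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75", img_w=1088, img_h=1088
)
```
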
## Augmentation Highlights

- Deterministic photometric pre-augmentation for hand-crafted oversampling (brightness/contrast/gamma/noise/blur/HSV shifts)
- YOLO runtime augments: ±60° rotation, ±0.15 translation, ±0.6 scale, ±20° shear, perspective jitter, flipud 0.3, fliplr 0.5
- Composition augments: mosaic (1.0), mixup (0.1), copy-paste (0.15), RandAugment, random erasing (0.4)

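One way to make a photometric multiplier deterministic is to derive each variant's parameters from a stable hash of the image name and variant index. The sketch below shows that idea only; the parameter ranges, the function names, and the choice of SHA-256 seeding are assumptions for illustration and are not taken from the actual pre-augmentation code.

```python
import hashlib
import random

def photometric_params(image_name, variant, n_variants=50):
    """Derive reproducible jitter parameters for one (image, variant) pair."""
    if not 0 <= variant < n_variants:
        raise ValueError("variant out of range")
    # Seed from a stable hash so results do not depend on Python's hash randomization.
    digest = hashlib.sha256(f"{image_name}:{variant}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    return {
        "brightness": rng.uniform(-0.2, 0.2),  # illustrative range, additive on [0, 1]
        "contrast": rng.uniform(0.8, 1.2),     # illustrative range
        "gamma": rng.uniform(0.8, 1.25),       # illustrative range
    }

def apply_to_pixel(value, params):
    """Apply contrast, brightness, then gamma to one pixel value in [0, 1]."""
    v = (value - 0.5) * params["contrast"] + 0.5 + params["brightness"]
    v = min(max(v, 0.0), 1.0)  # clamp before the power so gamma stays well defined
    return v ** params["gamma"]
```

Because the seed depends only on the name and index, regenerating the dataset yields the same 50 variants per capture.
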
## Training Recipe

- **Batch size:** 8
- **Epochs:** 50 (early stop patience 5)
- **Optimizer:** Ultralytics default (SGD w/ warmup)
- **Loss head:** OBB-specific DFL + cls losses
- **Validation cadence:** Every 5 epochs with periodic visualization dumps
- **Hardware:** NVIDIA RTX 5090, CUDA 12.8

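The values from the Augmentation Highlights and Training Recipe sections map fairly directly onto Ultralytics training arguments. The sketch below collects them in one dictionary; it is not the contents of `train_yolo_obb.py`. The starting weights `yolo11m-obb.pt` and the `train` wrapper are assumptions, and the perspective strength and the every-5-epochs validation cadence are left out because no value or argument for them is given here.

```python
# Values copied from the Augmentation Highlights and Training Recipe sections.
TRAIN_OVERRIDES = {
    "imgsz": 1088,
    "batch": 8,
    "epochs": 50,
    "patience": 5,            # early-stopping patience
    "degrees": 60.0,          # rotation range in degrees
    "translate": 0.15,
    "scale": 0.6,
    "shear": 20.0,
    "flipud": 0.3,
    "fliplr": 0.5,
    "mosaic": 1.0,
    "mixup": 0.1,
    "copy_paste": 0.15,
    "auto_augment": "randaugment",
    "erasing": 0.4,
}

def train(data_yaml, weights="yolo11m-obb.pt"):
    """Launch training; needs the `ultralytics` package and a dataset YAML.

    Assumption: training starts from the pretrained OBB checkpoint named above.
    """
    from ultralytics import YOLO  # lazy import so the config can be inspected without it

    model = YOLO(weights)
    return model.train(data=data_yaml, **TRAIN_OVERRIDES)
```
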
## Evaluation (best epoch)

| Metric | Value |
|--------|-------|
| mAP@50 | **0.983** |
| mAP@50-95 | **0.963** |
| Precision | 0.943 |
| Recall | 0.986 |
| Validation set | 55 hand-crafted photos / 210 cards |

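For readers who prefer a single operating-point number, the precision and recall above can be combined into an F1 score. This is a derived figure computed from the table, not a separately reported metric.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

f1 = f1_score(0.943, 0.986)
print(round(f1, 3))  # 0.964
```
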
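## Example Predictions

| Example 1 | Example 2 |
|-----------|-----------|
| ![Example prediction 1](images/example_prediction_1.jpg) | ![Example prediction 2](images/example_prediction_2.jpg) |

| Example 3 |
|-----------|
| ![Example prediction 3](images/example_prediction_3.jpg) |
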
## Inference

```python
from ultralytics import YOLO

# Load the best-epoch checkpoint and run it on one image.
model = YOLO("weights/cardcaptor_v3_best.pt")
results = model("card_photo.jpg", conf=0.25)
print(results[0].obb.xyxyxyxy)  # corner points of each detected card
```

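ONNX example (shown with ONNX Runtime in Python on the CPU provider; the same file is the one intended for WebGPU / WebAssembly runtimes):

```python
import onnxruntime as ort
import numpy as np

session = ort.InferenceSession("onnx/cardcaptor_v3_best.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name

# Placeholder input: replace with a preprocessed image. The NCHW float32 layout
# scaled to [0, 1] is an assumption based on typical Ultralytics exports.
image_tensor = np.zeros((1, 3, 1088, 1088), dtype=np.float32)
outputs = session.run(None, {input_name: image_tensor})
```
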
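When feeding real photos to the ONNX model, the image has to be fitted into the 1088 × 1088 input and the predicted coordinates mapped back afterwards. The sketch below covers only that geometry, assuming aspect-preserving resize with centred padding (letterboxing), which is the usual Ultralytics preprocessing but is not confirmed for this export; both function names are hypothetical.

```python
def letterbox_params(img_w, img_h, size=1088):
    """Scale factor and padding that fit an image into a size x size square."""
    scale = min(size / img_w, size / img_h)
    new_w, new_h = round(img_w * scale), round(img_h * scale)
    # Split the leftover space evenly on both sides.
    pad_x, pad_y = (size - new_w) / 2, (size - new_h) / 2
    return scale, pad_x, pad_y

def to_original(x, y, scale, pad_x, pad_y):
    """Map a point from letterboxed model space back to source-image pixels."""
    return (x - pad_x) / scale, (y - pad_y) / scale

scale, pad_x, pad_y = letterbox_params(2176, 1088)
center = to_original(544, 544, scale, pad_x, pad_y)
```

Box widths and heights are divided by `scale` in the same way; angles are unaffected because the resize is uniform.
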
## Responsible Use

This model only predicts bounding boxes for trading cards. It does not read card text or assess authenticity.

## Resources

- Training script reference: `training_app/src/training/train_yolo_obb.py`
- Detailed training log and analyzer outputs are stored under `trading_cards_obb/yolo11m_obb_merged_50x3/`
- Claude design doc: https://claude.ai/share/2886b94f-64a3-4b46-a42c-ba308f106902

---

Questions or improvements? Please open an issue or PR!