---
language: en
license: mit
tags:
- object-detection
- oriented-bounding-box
- yolo
- trading-cards
- webgpu-ready
- onnx
metrics:
- mAP50
- mAP50-95
---

# Trading Card Detector · YOLO11m OBB (v3)

![preview](assets/validation_predictions.png)

## 🚀 Summary

- **Architecture:** YOLO11m with oriented bounding box (OBB) head (7-value outputs: `cx, cy, w, h, conf, class, angle`)
- **Input resolution:** 1088 × 1088
- **Targets:** Single class (`trading_card`) with arbitrary rotations
- **Exports:** PyTorch checkpoint + ONNX (built-in NMS, WebGPU/WebAssembly ready)
- **Training script:** `training_app/src/training/train_yolo_obb.py`

## 📦 Artifacts

| File | Description |
|------|-------------|
| `weights/cardcaptor_v3_best.pt` | Full-precision PyTorch checkpoint (best epoch) |
| `onnx/cardcaptor_v3_best.onnx` | Optimized ONNX export (80 MB) with built-in NMS |

## 📚 Dataset

- **Name:** `merged_dataset_50x3`
- **Composition:** ~60K synthetic renders + ~220 handcrafted captures × deterministic 50× photometric multiplier (~11K effective handcrafted samples)
- **Validation:** 55 handcrafted frames (210 labeled cards) with a 90% handcrafted ratio to match real-world deployment
- **Label format:** YOLO OBB (four polygon corners per box)

## 🎨 Augmentation Highlights

- Deterministic photometric pre-augmentation for handcrafted oversampling (brightness/contrast/gamma/noise/blur/HSV shifts)
- YOLO runtime augments: ±60° rotation, ±0.15 translation, ±0.6 scale, ±20° shear, perspective jitter, flipud 0.3, fliplr 0.5
- Composition augments: mosaic (1.0), mixup (0.1), copy-paste (0.15), RandAugment, random erasing (0.4)

## 🏋️ Training Recipe

- **Batch size:** 8
- **Epochs:** 50 (early-stop patience 5)
- **Optimizer:** Ultralytics default (SGD w/ warmup)
- **Loss head:** OBB-specific DFL + cls losses
- **Validation cadence:** Every 5 epochs with periodic visualization dumps
- **Hardware:** NVIDIA RTX 5090, CUDA 12.8

## 📊 Evaluation (best epoch)

| Metric | Value |
|--------|-------|
| mAP@50 | **0.983** |
| mAP@50-95 | **0.963** |
| Precision | 0.943 |
| Recall | 0.986 |
| Validation set | 55 handcrafted photos / 210 cards |

## 🔍 Example Predictions

| Example 1 | Example 2 |
|-----------|-----------|
| ![Example 1](assets/examples/ex_1.jpg) | ![Example 2](assets/examples/ex_2.jpg) |

| Example 3 |
|-----------|
| ![Example 3](assets/examples/ex_3.jpg) |

## 🧪 Inference

```python
from ultralytics import YOLO

# Load the best-epoch checkpoint and run detection on a single image.
model = YOLO("weights/cardcaptor_v3_best.pt")
results = model("card_photo.jpg", conf=0.25)
```

ONNX example (Python with the CPU provider; the same file is the one intended for WebGPU / WebAssembly runtimes):

```python
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("onnx/cardcaptor_v3_best.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name
# Placeholder input: assumes a float32 NCHW tensor at the 1088 x 1088 input resolution.
image_tensor = np.zeros((1, 3, 1088, 1088), dtype=np.float32)
outputs = session.run(None, {input_name: image_tensor})
```

## ✅ Responsible Use

This model only predicts bounding boxes for trading cards. It does not read card text or assess authenticity.

## 📎 Resources

- Training script reference: `training_app/src/training/train_yolo_obb.py`
- Detailed training log and analyzer outputs are stored under `trading_cards_obb/yolo11m_obb_merged_50x3/`
- Claude design doc: https://claude.ai/share/2886b94f-64a3-4b46-a42c-ba308f106902

---

Questions or improvements? Please open an issue or PR!
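
## 🧮 Decoding OBB Outputs

The Summary lists a 7-value output per detection (`cx, cy, w, h, conf, class, angle`), while the labels use four polygon corners per box. The sketch below shows one way to convert between the two. It is illustrative only: the helper names `obb_to_corners` and `decode_row` are hypothetical, and it assumes that `angle` is in radians, that the box rotates about its center, and that coordinates are in input-image pixels. None of these details are confirmed by the export itself, so check them against real model output before relying on the result.

```python
import math


def obb_to_corners(cx, cy, w, h, angle):
    """Return the four corners of a rotated box as (x, y) tuples.

    Assumption: `angle` is in radians and the box rotates about (cx, cy).
    """
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    # Half-extent vectors along the box's width and height axes.
    wx, wy = (w / 2) * cos_a, (w / 2) * sin_a
    hx, hy = -(h / 2) * sin_a, (h / 2) * cos_a
    return [
        (cx - wx - hx, cy - wy - hy),
        (cx + wx - hx, cy + wy - hy),
        (cx + wx + hx, cy + wy + hy),
        (cx - wx + hx, cy - wy + hy),
    ]


def decode_row(row, conf_threshold=0.25):
    """Turn one 7-value row into a dict, or None if below the threshold.

    Assumption: the row order matches the model card's Summary section.
    """
    cx, cy, w, h, conf, cls, angle = row
    if conf < conf_threshold:
        return None
    return {
        "class_id": int(cls),
        "confidence": conf,
        "corners": obb_to_corners(cx, cy, w, h, angle),
    }


if __name__ == "__main__":
    # Hypothetical detection: a 40 x 20 box centered at (100, 50), unrotated.
    print(decode_row([100.0, 50.0, 40.0, 20.0, 0.9, 0, 0.0]))
```

Keeping the corner conversion separate from the thresholding step makes it easy to reuse `obb_to_corners` for drawing overlays or for comparing predictions against the corner-format labels.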