+---
+license: mit
+tags:
+- computer-vision
+- sports-analytics
+- jersey-recognition
+- temporal-modeling
+- lstm
+- bilstm
+- pytorch
+datasets:
+- custom
+metrics:
+- accuracy
+model-index:
+- name: jersey-number-recognition
+  results:
+  - task:
+      type: image-classification
+      name: Jersey Number Recognition
+    metrics:
+    - type: accuracy
+      value: 92.12
+      name: Full Number Accuracy
+    - type: accuracy
+      value: 98.63
+      name: Tens Digit Accuracy
+    - type: accuracy
+      value: 93.04
+      name: Units Digit Accuracy
+---
+# Jersey Number Recognition - Temporal BiLSTM Model
+<div align="center">
+  <img src="https://img.shields.io/badge/Accuracy-92.12%25-success" alt="Accuracy"/>
+  <img src="https://img.shields.io/badge/PyTorch-2.0+-red" alt="PyTorch"/>
+  <img src="https://img.shields.io/badge/License-MIT-blue" alt="License"/>
+</div>
+## Model Description
+A BiLSTM-based temporal model that recognizes jersey numbers from short video sequences. It reaches **92.12% full-number accuracy** on the test set, **43.15 percentage points** above the single-frame anchor baseline (48.97%).
+### Key Features
+- 🎯 **92.12%** full number accuracy
+- 🎯 **98.63%** tens digit accuracy
+- 🎯 **93.04%** units digit accuracy
+- 🎯 **89%** of player tracks with zero prediction flips
+- 🎯 Compositional design: separate tens and units digit heads can express all 100 numbers (00-99)
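The compositional output can be illustrated with a small sketch of how two 10-way digit predictions combine into one of 100 jersey numbers. The function name `compose_number` is hypothetical, and treating single-digit jerseys as having tens digit 0 is an assumption; the repository may encode them differently.

```python
def compose_number(tens_logits, units_logits):
    """Combine two 10-way digit predictions into a jersey number in 0..99.

    Assumption: single-digit jerseys use tens digit 0 (e.g. "4" -> 0 and 4).
    """
    if len(tens_logits) != 10 or len(units_logits) != 10:
        raise ValueError("each head must produce exactly 10 scores")
    tens = max(range(10), key=lambda d: tens_logits[d])    # argmax of tens head
    units = max(range(10), key=lambda d: units_logits[d])  # argmax of units head
    return 10 * tens + units


# Example: the tens head favours 4 and the units head favours 9
tens_scores = [0.0] * 10
units_scores = [0.0] * 10
tens_scores[4] = 3.2
units_scores[9] = 2.7
predicted = compose_number(tens_scores, units_scores)
print(predicted)  # 49
```

With independent heads, 20 output units cover 100 numbers. A number absent from training is only reachable if each of its digits was learned in that position.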
+## Model Architecture
+```
+Input Sequence [8 × 3 × 128 × 128]
+    ↓
+EfficientNet-B0 Backbone (shared weights)
+    ↓
+256-D Embeddings [8 × 256]
+    ↓
+2-Layer Bidirectional LSTM (hidden: 128)
+    ↓
+Concatenated Hidden States [512]
+    ↓
+    ├─→ Tens Digit Head (10 classes)
+    └─→ Units Digit Head (10 classes)
+```
+**Parameters**: 5.1M
+**Model Size**: 20.3 MB
+## Intended Use
+### Primary Use Cases
+- Jersey number recognition in sports analytics
+- Temporal sequence modeling for visual recognition
+- Research in compositional generalization
+### Out-of-Scope Uses
+- Real-time applications (not optimized for inference speed)
+- Non-sports contexts without fine-tuning
+- Privacy-sensitive applications
+## How to Use
+### Installation
+```bash
+pip install torch torchvision pillow huggingface_hub
+```
+### Quick Start
+```python
+import torch
+from huggingface_hub import hf_hub_download
+# Download the checkpoint file from the Hugging Face Hub (cached locally)
+model_path = hf_hub_download(
+    repo_id="prxkc/jersey-number-recognition",
+    filename="best_temporal.pt"
+)
+# Load the checkpoint on CPU.
+# PyTorch 2.6+ defaults to weights_only=True; if the file stores more than
+# tensors, loading may require weights_only=False (only for files you trust).
+checkpoint = torch.load(model_path, map_location="cpu")
+# The checkpoint alone cannot run inference: you also need the model
+# architecture code from the GitHub repository.
+# GitHub: https://github.com/prxkc/jersey-number-recognition
+```
+### Complete Example
+For complete usage with model architecture, see the [GitHub Repository](https://github.com/prxkc/jersey-number-recognition).
+## Training Data
+- **Dataset**: Custom jersey number dataset (subset)
+- **Train samples**: 4,096 sequences
+- **Validation samples**: 860 sequences
+- **Test samples**: 876 sequences
+- **Classes**: 10 jersey numbers (subset of 00-99)
+### Data Preprocessing
+- Frames resized to 128×128 pixels
+- Pad-to-square transformation
+- ImageNet normalization
+- 8 frames uniformly sampled per sequence
+## Training Procedure
+### Hyperparameters
+- **Backbone**: EfficientNet-B0 (pretrained)
+- **Optimizer**: AdamW (lr=2e-4, weight_decay=1e-3)
+- **Scheduler**: Cosine annealing
+- **Batch size**: 32 (temporal), 128 (anchor)
+- **Epochs**: 10 (temporal), 4 (anchor warmstart)
+- **Mixed precision**: Enabled (AMP)
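As a rough illustration, the listed hyperparameters map onto standard PyTorch components as shown below. The tiny linear model and random batches are stand-ins, `steps_per_epoch` is illustrative, and stepping the cosine schedule once per epoch over the 10 temporal epochs is an assumption; the actual training script may step it per batch or use a different horizon.

```python
import torch
import torch.nn as nn

model = nn.Linear(16, 10)        # stand-in for the temporal model
epochs, steps_per_epoch = 10, 4  # steps_per_epoch is illustrative
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4, weight_decay=1e-3)
# Assumption: the cosine schedule is stepped once per epoch
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)

use_amp = device == "cuda"       # mixed precision only when a GPU is present
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

lrs = []
for epoch in range(epochs):
    for _ in range(steps_per_epoch):
        x = torch.randn(32, 16, device=device)          # batch size 32
        y = torch.randint(0, 10, (32,), device=device)
        optimizer.zero_grad(set_to_none=True)
        with torch.autocast(device_type=device, enabled=use_amp):
            loss = nn.functional.cross_entropy(model(x), y)
        scaler.scale(loss).backward()  # no-op scaling when AMP is disabled
        scaler.step(optimizer)
        scaler.update()
    lrs.append(optimizer.param_groups[0]["lr"])
    scheduler.step()

print([f"{lr:.2e}" for lr in lrs])
```

The learning rate starts at 2e-4 and decays along a half cosine toward zero by the end of the last epoch.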
+### Training Strategy
+1. **Warmstart**: Train anchor-only baseline (4 epochs)
+2. **Temporal training**: BiLSTM model (10 epochs)
+3. **Backbone freezing**: First 2 epochs
+4. **Balanced sampling**: Digit-level balancing
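Two of the steps above, backbone freezing and digit-level balancing, can be sketched in a few lines. Everything here is an assumption about how they might be implemented: the helper names are hypothetical, the freeze is assumed to apply to the first 2 epochs of the temporal stage, and the weighting rule (inverse tens-digit frequency plus inverse units-digit frequency) is one plausible reading of "digit-level balancing", not the documented method.

```python
import collections
import torch.nn as nn


def set_trainable(module, trainable):
    """Turn gradient updates on or off for every parameter in a module."""
    for p in module.parameters():
        p.requires_grad = trainable


def digit_balanced_weights(labels):
    """Per-sample weights that up-weight rare tens and units digits.

    Assumption: weight = 1/count(tens digit) + 1/count(units digit).
    """
    tens = collections.Counter(n // 10 for n in labels)
    units = collections.Counter(n % 10 for n in labels)
    return [1.0 / tens[n // 10] + 1.0 / units[n % 10] for n in labels]


backbone = nn.Conv2d(3, 8, 3)  # stand-in for the EfficientNet-B0 trunk
TEMPORAL_EPOCHS, FREEZE_EPOCHS = 10, 2

trainable_by_epoch = []
for epoch in range(TEMPORAL_EPOCHS):
    set_trainable(backbone, epoch >= FREEZE_EPOCHS)
    trainable_by_epoch.append(all(p.requires_grad for p in backbone.parameters()))
    # ... one epoch of temporal training would run here ...

# Toy label list: jersey 4 is common, 49 and 66 are rare
weights = digit_balanced_weights([4, 4, 4, 8, 49, 66])
print(trainable_by_epoch)
print([round(w, 3) for w in weights])
```

Weights of this kind could be passed to `torch.utils.data.WeightedRandomSampler` so that rare digits are drawn more often during training.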
+## Evaluation Results
+### Test Set Performance
+| Metric | Anchor (Baseline) | Temporal (Ours) | Change |
+|--------|-------------------|-----------------|--------|
+| Full Number Acc | 48.97% | **92.12%** | +43.15 pp |
+| Tens Digit Acc | 92.81% | **98.63%** | +5.82 pp |
+| Units Digit Acc | 53.31% | **93.04%** | +39.73 pp |
+| Loss | 1.358 | **0.336** | -75.3% (relative) |
+
+Accuracy changes are absolute differences in percentage points (pp); the loss change is relative to the baseline loss.
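The last column of the table mixes two kinds of change: the accuracy rows are absolute differences in percentage points, while the loss row is a relative change. The short calculation below recomputes both from the table values and also shows the relative gain and the share of baseline errors removed, which are derived quantities not reported in the card.

```python
anchor = {"full": 48.97, "tens": 92.81, "units": 53.31}
temporal = {"full": 92.12, "tens": 98.63, "units": 93.04}

gains = {}
for name in anchor:
    gain_pp = temporal[name] - anchor[name]         # absolute, percentage points
    gain_rel = 100 * gain_pp / anchor[name]         # relative to the baseline accuracy
    err_cut = 100 * gain_pp / (100 - anchor[name])  # share of baseline errors removed
    gains[name] = (gain_pp, gain_rel, err_cut)
    print(f"{name:5s} +{gain_pp:.2f} pp | +{gain_rel:.1f}% relative | "
          f"{err_cut:.1f}% fewer errors")

loss_change = 100 * (0.336 - 1.358) / 1.358         # relative change in loss
print(f"loss  {loss_change:.1f}%")
```

For full-number accuracy, the 43.15-point gain corresponds to a relative improvement of about 88% over the baseline.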
+### Temporal Stability
+- **89%** of tracks had zero prediction flips
+- **Average 0.11 flips** per track
+- More stable than frame-by-frame predictions from the single-frame baseline
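The flip statistics can be computed as in the sketch below. The definition used here, one flip whenever two consecutive predictions within a track disagree, is an assumption, as are the function names and the toy tracks; the card does not state how flips were counted.

```python
def count_flips(preds):
    """Number of times consecutive predictions in one track disagree."""
    return sum(a != b for a, b in zip(preds, preds[1:]))


def stability_summary(tracks):
    """Return (share of zero-flip tracks, mean flips per track)."""
    flips = [count_flips(p) for p in tracks.values()]
    zero_flip_share = sum(f == 0 for f in flips) / len(flips)
    mean_flips = sum(flips) / len(flips)
    return zero_flip_share, mean_flips


# Toy predictions for three hypothetical tracks
tracks = {
    "track_a": [8, 8, 8, 8],
    "track_b": [49, 49, 48, 49],  # two flips: 49 -> 48, then 48 -> 49
    "track_c": [6, 6, 6],
}
share, mean = stability_summary(tracks)
print(f"{share:.0%} zero-flip tracks, {mean:.2f} flips per track")
```

Under this definition, the reported 89% zero-flip share and 0.11 average are mutually consistent if most unstable tracks flip only once.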
+### Per-Class Results
+| Jersey # | Test Sequences | Accuracy |
+|----------|----------------|----------|
+| 4 | 164 | 95.73% |
+| 6 | 134 | 94.78% |
+| 8 | 301 | 90.70% |
+| 9 | 216 | 90.28% |
+| 48 | 4 | 100.00% |
+| 49 | 19 | 89.47% |
+| 66 | 19 | 100.00% |
+| 89 | 16 | 93.75% |
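As a consistency check, the per-class rows can be aggregated back into an overall figure. The table lists 8 jersey numbers, and the snippet below recovers the number of correct sequences per class by rounding, which assumes each accuracy is a simple correct/total ratio.

```python
# jersey number -> (test sequences, accuracy in %), copied from the table
per_class = {
    4: (164, 95.73), 6: (134, 94.78), 8: (301, 90.70), 9: (216, 90.28),
    48: (4, 100.00), 49: (19, 89.47), 66: (19, 100.00), 89: (16, 93.75),
}

total = sum(n for n, _ in per_class.values())
correct = sum(round(n * acc / 100) for n, acc in per_class.values())
print(total, correct, f"{100 * correct / total:.2f}%")  # 873 807 92.44%
```

The listed rows account for 873 of the 876 test sequences. Dividing the same 807 correct sequences by all 876 gives 92.12%, which matches the reported full-number accuracy.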
+## Limitations
+- Trained on a limited subset of jersey numbers (10 classes)
+- Not optimized for real-time inference
+- Requires 8-frame sequences (not single images)
+- Performance may degrade under visual conditions that differ strongly from the training data
+## Contact
+- **Author**: Shakil Islam Shanto
+- **GitHub**: [@prxkc](https://github.com/prxkc)