prxkc committed
Commit e206dc9 · verified · 1 Parent(s): 8182083

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +190 -0
README.md ADDED
@@ -0,0 +1,190 @@
+ ---
+ license: mit
+ tags:
+ - computer-vision
+ - sports-analytics
+ - jersey-recognition
+ - temporal-modeling
+ - lstm
+ - bilstm
+ - pytorch
+ datasets:
+ - custom
+ metrics:
+ - accuracy
+ model-index:
+ - name: jersey-number-recognition
+   results:
+   - task:
+       type: image-classification
+       name: Jersey Number Recognition
+     metrics:
+     - type: accuracy
+       value: 92.12
+       name: Full Number Accuracy
+     - type: accuracy
+       value: 98.63
+       name: Tens Digit Accuracy
+     - type: accuracy
+       value: 93.04
+       name: Units Digit Accuracy
+ ---
+
+ # Jersey Number Recognition - Temporal BiLSTM Model
+
+ <div align="center">
+ <img src="https://img.shields.io/badge/Accuracy-92.12%25-success" alt="Accuracy"/>
+ <img src="https://img.shields.io/badge/PyTorch-2.0+-red" alt="PyTorch"/>
+ <img src="https://img.shields.io/badge/License-MIT-blue" alt="License"/>
+ </div>
+
+ ## Model Description
+
+ A BiLSTM-based temporal model that recognizes jersey numbers from video sequences. It reaches **92.12% full-number accuracy** on the test set, a **43.15-percentage-point improvement** over the single-frame baseline (48.97%).
+
+ ### Key Features
+
+ - 🎯 **92.12%** full number accuracy
+ - 🎯 **98.63%** tens digit accuracy
+ - 🎯 **93.04%** units digit accuracy
+ - 🎯 **89%** of player tracks have zero prediction flips (temporal stability)
+ - 🎯 Compositional design: separate tens and units digit heads can represent all 100 numbers (00-99)
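
The compositional output mentioned in the last bullet can be illustrated with a minimal sketch. It assumes each head emits 10 scores and that the jersey number is read as `10 * tens + units`; the function names are hypothetical and not taken from the repository, and treating single-digit jerseys as having tens digit 0 is likewise an assumption.

```python
from typing import Sequence, Tuple


def argmax(scores: Sequence[float]) -> int:
    """Index of the largest score (the first one wins on ties)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def decode_jersey_number(tens_scores: Sequence[float],
                         units_scores: Sequence[float]) -> int:
    """Combine two 10-way digit predictions into one number in 0..99."""
    if len(tens_scores) != 10 or len(units_scores) != 10:
        raise ValueError("each head is assumed to output exactly 10 scores")
    return 10 * argmax(tens_scores) + argmax(units_scores)


def digit_matches(pred: int, target: int) -> Tuple[bool, bool]:
    """(tens_correct, units_correct); the full number is right only if both are."""
    return pred // 10 == target // 10, pred % 10 == target % 10
```

Because the heads are independent, a number that never appeared in training (say 46) can still be produced as long as its tens digit and its units digit were each seen in other numbers.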
+
+ ## Model Architecture
+ ```
+ Input Sequence [8 × 3 × 128 × 128]
+ ↓
+ EfficientNet-B0 Backbone (shared weights)
+ ↓
+ 256-D Embeddings [8 × 256]
+ ↓
+ 2-Layer Bidirectional LSTM (hidden: 128)
+ ↓
+ Concatenated Hidden States [512]
+ ↓
+ ├─→ Tens Digit Head (10 classes)
+ └─→ Units Digit Head (10 classes)
+ ```
+
+ - **Parameters**: 5.1M
+ - **Model Size**: 20.3 MB
+
+ ## Intended Use
+
+ ### Primary Use Cases
+
+ - Jersey number recognition in sports analytics
+ - Temporal sequence modeling for visual recognition
+ - Research in compositional generalization
+
+ ### Out-of-Scope Uses
+
+ - Real-time applications (not optimized for inference speed)
+ - Non-sports contexts without fine-tuning
+ - Privacy-sensitive applications
+
+ ## How to Use
+
+ ### Installation
+ ```bash
+ pip install torch torchvision pillow huggingface_hub
+ ```
+
+ ### Quick Start
+ ```python
+ import torch
+ from PIL import Image
+ from huggingface_hub import hf_hub_download
+
+ # Download the checkpoint file from the Hugging Face Hub
+ model_path = hf_hub_download(
+     repo_id="prxkc/jersey-number-recognition",
+     filename="best_temporal.pt"
+ )
+
+ # Load the checkpoint on CPU
+ checkpoint = torch.load(model_path, map_location='cpu')
+
+ # Note: the checkpoint alone is not enough; you also need the model architecture code.
+ # See the GitHub repository for the complete implementation:
+ # https://github.com/prxkc/jersey-number-recognition
+ ```
+
+ ### Complete Example
+
+ For complete usage, including the model architecture, see the [GitHub Repository](https://github.com/prxkc/jersey-number-recognition).
+
+ ## Training Data
+
+ - **Dataset**: Custom jersey number dataset (subset)
+ - **Train samples**: 4,096 sequences
+ - **Validation samples**: 860 sequences
+ - **Test samples**: 876 sequences
+ - **Classes**: 10 jersey numbers (subset of 00-99)
+
+ ### Data Preprocessing
+
+ - Frames resized to 128×128 pixels
+ - Pad-to-square transformation
+ - ImageNet normalization
+ - 8 frames uniformly sampled per sequence
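
The preprocessing steps above can be sketched with a few dependency-free helpers. This is a minimal illustration under assumptions: the helper names are hypothetical, padding is assumed to be centered, and short tracks are assumed to repeat frames. The mean and standard deviation are the standard ImageNet statistics.

```python
from typing import List, Tuple

# Standard ImageNet channel statistics (RGB, pixel values scaled to [0, 1]).
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def uniform_frame_indices(num_frames: int, k: int = 8) -> List[int]:
    """Pick k indices spread evenly over a track; short tracks repeat frames."""
    if num_frames < 1:
        raise ValueError("a track needs at least one frame")
    if k == 1:
        return [0]
    step = (num_frames - 1) / (k - 1)
    return [int(i * step + 0.5) for i in range(k)]


def square_padding(width: int, height: int) -> Tuple[int, int, int, int]:
    """(left, top, right, bottom) padding that centers a crop in a square."""
    side = max(width, height)
    left = (side - width) // 2
    top = (side - height) // 2
    return left, top, side - width - left, side - height - top


def normalize_pixel(rgb: Tuple[float, float, float]) -> Tuple[float, ...]:
    """Apply ImageNet normalization to one RGB pixel with values in [0, 1]."""
    return tuple((v - m) / s for v, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))
```

Padding to a square before resizing to 128×128 keeps the aspect ratio of the digits, which a direct resize of a tall player crop would distort.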
+
+ ## Training Procedure
+
+ ### Hyperparameters
+
+ - **Backbone**: EfficientNet-B0 (pretrained)
+ - **Optimizer**: AdamW (lr=2e-4, weight_decay=1e-3)
+ - **Scheduler**: Cosine annealing
+ - **Batch size**: 32 (temporal), 128 (anchor)
+ - **Epochs**: 10 (temporal), 4 (anchor warmstart)
+ - **Mixed precision**: Enabled (AMP)
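
To make the scheduler entry concrete, here is the closed-form cosine annealing curve for the listed base learning rate. The sketch assumes one scheduler step per epoch over the 10 temporal epochs and a minimum learning rate of 0; neither detail is stated in this card, and the function name is hypothetical.

```python
import math


def cosine_annealed_lr(step: int, total_steps: int,
                       base_lr: float = 2e-4, min_lr: float = 0.0) -> float:
    """Learning rate after `step` of `total_steps` under cosine annealing."""
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Assumed setup: 10 epochs, one scheduler step per epoch.
schedule = [cosine_annealed_lr(epoch, total_steps=10) for epoch in range(11)]
```

The rate starts at 2e-4, passes through half that value at the midpoint, and decays to the minimum at the end. In PyTorch the same curve is produced by `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max` set to the number of steps.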
+
+ ### Training Strategy
+
+ 1. **Warmstart**: Train anchor-only baseline (4 epochs)
+ 2. **Temporal training**: BiLSTM model (10 epochs)
+ 3. **Backbone freezing**: First 2 epochs
+ 4. **Balanced sampling**: Digit-level balancing
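
Steps 3 and 4 can be sketched as follows. The card does not spell out how digit-level balancing is computed, so this is only one possible reading: each sequence gets a sampling weight that averages the inverse frequency of its tens digit and of its units digit. Both function names are hypothetical.

```python
from collections import Counter
from typing import List, Tuple


def backbone_trainable(epoch: int, freeze_epochs: int = 2) -> bool:
    """Backbone stays frozen for the first `freeze_epochs` epochs (0-indexed)."""
    return epoch >= freeze_epochs


def digit_balanced_weights(labels: List[Tuple[int, int]]) -> List[float]:
    """One sampling weight per (tens, units) label.

    Assumed scheme: average of the inverse tens-digit frequency and the
    inverse units-digit frequency, so rare digits are drawn more often.
    """
    tens_counts = Counter(t for t, _ in labels)
    units_counts = Counter(u for _, u in labels)
    return [0.5 * (1.0 / tens_counts[t] + 1.0 / units_counts[u])
            for t, u in labels]
```

Weights like these could be passed to `torch.utils.data.WeightedRandomSampler`, and the freeze flag could drive `requires_grad` on the backbone parameters at the start of each epoch.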
+
+ ## Evaluation Results
+
+ ### Test Set Performance
+
+ Accuracy improvements are absolute differences in percentage points (pp); the loss change is relative to the baseline.
+
+ | Metric | Anchor (Baseline) | Temporal (Ours) | Improvement |
+ |--------|-------------------|-----------------|-------------|
+ | Full Number Acc | 48.97% | **92.12%** | +43.15 pp |
+ | Tens Digit Acc | 92.81% | **98.63%** | +5.82 pp |
+ | Units Digit Acc | 53.31% | **93.04%** | +39.73 pp |
+ | Loss | 1.358 | **0.336** | -75.3% |
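
The improvement column mixes two kinds of change, which the following arithmetic sketch reproduces from the table values: accuracy gains are absolute differences in percentage points, while the loss change is relative to the baseline. The helper names are illustrative.

```python
def percentage_point_gain(baseline_pct: float, new_pct: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return round(new_pct - baseline_pct, 2)


def relative_change_pct(baseline: float, new: float) -> float:
    """Change relative to the baseline, expressed in percent."""
    return round((new - baseline) / baseline * 100.0, 1)


full_gain = percentage_point_gain(48.97, 92.12)    # 43.15
tens_gain = percentage_point_gain(92.81, 98.63)    # 5.82
units_gain = percentage_point_gain(53.31, 93.04)   # 39.73
loss_change = relative_change_pct(1.358, 0.336)    # -75.3
full_relative = relative_change_pct(48.97, 92.12)  # 88.1
```

The last line shows why the distinction matters: the full-number accuracy rises by 43.15 percentage points, which is a relative increase of about 88.1% over the baseline.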
+
+ ### Temporal Stability
+
+ - **89%** of tracks had zero prediction flips
+ - **Average 0.11 flips** per track
+ - Significant improvement over single-frame predictions
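
The stability figures above can be computed along these lines. The sketch assumes a "flip" is any change in the predicted number between consecutive predictions within one player track; the card does not define the term precisely, and the function names and track data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def count_flips(predictions: List[int]) -> int:
    """Number of times the predicted number changes along one track."""
    return sum(1 for prev, cur in zip(predictions, predictions[1:]) if prev != cur)


def stability_summary(tracks: Dict[str, List[int]]) -> Tuple[float, float]:
    """(fraction of tracks with zero flips, mean flips per track)."""
    if not tracks:
        raise ValueError("need at least one track")
    flips = [count_flips(preds) for preds in tracks.values()]
    zero_flip_fraction = sum(1 for f in flips if f == 0) / len(flips)
    mean_flips = sum(flips) / len(flips)
    return zero_flip_fraction, mean_flips
```

For example, a track predicted as 8, 6, 8 counts two flips, while a track predicted as 48 throughout counts none.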
+
+ ### Per-Class Results
+
+ | Jersey # | Test Sequences | Accuracy |
+ |----------|----------------|----------|
+ | 4 | 164 | 95.73% |
+ | 6 | 134 | 94.78% |
+ | 8 | 301 | 90.70% |
+ | 9 | 216 | 90.28% |
+ | 48 | 4 | 100.00% |
+ | 49 | 19 | 89.47% |
+ | 66 | 19 | 100.00% |
+ | 89 | 16 | 93.75% |
+
+ ## Limitations
+
+ - Trained on a limited subset of jersey numbers (10 classes)
+ - Not optimized for real-time inference
+ - Requires 8-frame sequences (not single images)
+ - Performance may degrade under visual conditions that differ substantially from the training data
+
+ ## Contact
+
+ - **Author**: Shakil Islam Shanto
+ - **GitHub**: [@prxkc](https://github.com/prxkc)