Update README.md
README.md
CHANGED
|
@@ -19,3 +19,83 @@ Broadcast overlay systems displaying a real-time model-driven strike zone visual
|
|
| 19 |
Umpire analytics platforms comparing model calls vs. official calls per game
|
| 20 |
Research and development for automated ball-strike officiating systems
|
| 21 |
|
| 22 |
+
2. Training Data
|
| 23 |
+
Dataset Source
|
| 24 |
+
Base dataset: ROBO ump (Roboflow Universe, accessed 2025)
|
| 25 |
+
https://universe.roboflow.com/toasty-workspace/roboump
|
| 26 |
+
|
| 27 |
+
Platform: Roboflow Universe (roboflow.com/universe)
|
| 28 |
+
Collection: Broadcast MLB footage (center-field fixed-angle camera)
|
| 29 |
+
Resolution: 640 × 640 px (resized)
|
| 30 |
+
|
| 31 |
+
Class Distribution
|
| 32 |
+
|
| 33 |
+
ball: ~551 images — 70 / 20 / 10 split
|
| 34 |
+
batter: ~774 images — 70 / 20 / 10 split
|
| 35 |
+
pitcher: ~774 images — 70 / 20 / 10 split
|
| 36 |
+
strike_zone: ~300 images — 70 / 20 / 10 split
|
| 37 |
+
|
| 38 |
+
Annotation Process
|
| 39 |
+
The base dataset provided semi-automated annotations for ball, batter, and pitcher. Strike zone annotations were not included and were added entirely through manual labeling: approximately 300 strike zones were annotated in Roboflow, each defined as the rectangular region from the batter's knees to the midpoint of the torso.
|
| 40 |
+
A quality review of a 10% sample found false positives in the ball class (crowd balls, logos, advertising), and around 5% of ball annotations were removed. Roughly 3 hours of manual correction went into tightening boxes and adding missed detections across all classes.
|
| 41 |
+
Data Augmentation
|
| 42 |
+
|
| 43 |
+
Horizontal flip (50% probability)
|
| 44 |
+
Color augmentation and saturation jitter
|
| 45 |
+
|
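For object detection, a horizontal flip has to mirror the box labels along with the pixels. The following is a minimal sketch of the label side only, assuming YOLO-format labels (`class_id, x_center, y_center, width, height`, all normalized to [0, 1]). The function name `hflip_yolo_labels` is hypothetical, and the source does not say whether the flip was applied in Roboflow or at training time.

```python
import random


def hflip_yolo_labels(labels, p=0.5, rng=None):
    """Mirror YOLO-format labels left-to-right with probability p.

    Each label is (class_id, x_center, y_center, width, height),
    normalized to [0, 1]. Returns (new_labels, flipped).
    """
    rng = rng or random.Random()
    if rng.random() >= p:
        # No flip this time: return the labels unchanged.
        return list(labels), False
    # Only the x-center moves; width, height, and y-center are unchanged.
    flipped = [(c, 1.0 - x, y, w, h) for (c, x, y, w, h) in labels]
    return flipped, True
```

When the function reports `flipped=True`, the image itself must be mirrored as well so that boxes and pixels stay aligned.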
| 46 |
+
3. Training Procedure
|
| 47 |
+
|
| 48 |
+
Framework: Ultralytics
|
| 49 |
+
Hardware: GPU
|
| 50 |
+
Batch size: Default (auto-tuned)
|
| 51 |
+
Epochs: 50
|
| 52 |
+
Image size: 640
|
| 53 |
+
Early stopping: Not applied
|
| 54 |
+
Preprocessing: Auto-resize to 640 × 640, normalization
|
| 55 |
+
|
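The settings above could be passed to Ultralytics roughly as follows. This is a sketch under stated assumptions, not the command actually used: the checkpoint `yolov8n.pt`, the dataset file `data.yaml`, and the helper name `train` are hypothetical, and `batch=-1` and `patience=50` are one plausible reading of "auto-tuned" and "early stopping not applied".

```python
# Hyperparameters listed in this section; entries marked "assumed" are not
# stated in the source.
TRAIN_ARGS = {
    "data": "data.yaml",  # assumed: hypothetical path to the dataset YAML
    "epochs": 50,
    "imgsz": 640,
    "batch": -1,          # assumed: -1 asks Ultralytics to pick a batch size
    "patience": 50,       # assumed: equal to epochs, so early stopping cannot fire
}


def train(weights="yolov8n.pt", **overrides):
    """Train a detector with the settings above (weights are an assumption)."""
    from ultralytics import YOLO  # imported lazily; requires `pip install ultralytics`

    model = YOLO(weights)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Calling `train()` starts a run with these defaults; keyword overrides such as `train(epochs=100)` replace individual entries.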
| 56 |
+
4. Evaluation Results
|
| 57 |
+
Overall Metrics
|
| 58 |
+
|
| 59 |
+
mAP@50: 0.92 (target > 0.85)
|
| 60 |
+
Overall F1: 0.91 (target >= 0.80)
|
| 61 |
+
|
| 62 |
+
Per-Class Breakdown
|
| 63 |
+
|
| 64 |
+
ball: mAP@50 = 0.72, F1 = 0.76
|
| 65 |
+
batter: mAP@50 = 0.97, F1 = 0.95
|
| 66 |
+
pitcher: mAP@50 = 0.97, F1 = 0.84
|
| 67 |
+
strike_zone: mAP@50 = 0.82, F1 = 0.88
|
| 68 |
+
|
| 69 |
+
Confusion Matrix Summary
|
| 70 |
+
|
| 71 |
+
Ball: 69 correctly predicted, 33 missed as background (~32% miss rate)
|
| 72 |
+
Batter: 145/145 correct
|
| 73 |
+
Pitcher: 145/145 correct
|
| 74 |
+
Strike_zone: 44/47 correct, 3 missed as background
|
| 75 |
+
Background false positives: 9 labeled as ball, 4 as strike_zone
|
| 76 |
+
|
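The counts in the summary above can be turned into precision, recall, and miss rate with the standard formulas. The sketch below assumes the background false positives listed are the only false positives for each class and that there is no confusion between classes; the helper name `detection_scores` is hypothetical.

```python
def detection_scores(tp, fn, fp):
    """Precision, recall, miss rate, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "miss_rate": 1 - recall,
        "f1": f1,
    }


# Counts taken from the confusion matrix summary.
ball = detection_scores(tp=69, fn=33, fp=9)
strike_zone = detection_scores(tp=44, fn=3, fp=4)
```

The ball miss rate comes out at 33/102, which is the ~32% quoted in the summary. Because a confusion matrix is computed at a single confidence threshold, F1 values derived this way need not equal the per-class F1 figures reported above.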
| 77 |
+
Performance Analysis
|
| 78 |
+
The overall mAP@50 of 0.92 exceeds the 0.85 target and F1 of 0.91 clears the 0.80 threshold. Batter and pitcher detection is near-perfect, reflecting their large consistent silhouettes. Training and validation loss curves show steady convergence with no signs of overfitting.
|
| 79 |
+
The ball class is the most critical failure point. Its mAP@50 of 0.72 is the lowest of the four classes, and the confusion matrix shows 33 of 102 ball instances (~32%, roughly 1 in 3) missed as background. Since ball is the most important class for pitch classification, this gap matters more than the strong overall score suggests. The strike_zone class achieved F1 = 0.88 despite having the fewest training instances, though its mAP@50 of 0.82 falls just short of the 0.85 target.
|
| 80 |
+
|
| 81 |
+
5. Limitations and Biases
|
| 82 |
+
Known Failure Cases
|
| 83 |
+
|
| 84 |
+
Ball near white uniforms: Low contrast causes false negatives when the ball overlaps light-colored jerseys
|
| 85 |
+
Strike zone partially off-frame: Bounding box gets truncated when the batter stands near the frame edge
|
| 86 |
+
Occlusion: When the catcher, batter, or pitcher obscures the ball at the plate, recall drops significantly
|
| 87 |
+
Low-light or compressed frames: JPEG artifacts confuse the detector, especially for the small ball class
|
| 88 |
+
|
| 89 |
+
Poor-Performing Class: Ball
|
| 90 |
+
The ball class is the weakest performer (mAP@50 = 0.72) because the ball is typically only 10-20 pixels across at broadcast resolution and moves at 80-100 mph, causing motion blur in standard 30fps footage. Background crowd balls and logos also created false positives requiring manual cleanup.
|
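To give a sense of scale for the speeds quoted above, the distance the ball covers between consecutive frames follows from a unit conversion. This is only an upper bound on blur length: the actual smear depends on the camera's exposure time, which the source does not state. The helper name `feet_per_frame` is hypothetical.

```python
def feet_per_frame(speed_mph, fps=30.0):
    """Distance in feet a ball travels between consecutive frames."""
    feet_per_second = speed_mph * 5280 / 3600  # 1 mile = 5280 ft, 1 hour = 3600 s
    return feet_per_second / fps
```

At 30 fps, `feet_per_frame(80)` is about 3.9 ft and `feet_per_frame(100)` is about 4.9 ft, so the ball appears in only a handful of frames between release and the plate.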
| 91 |
+
Contextual Limitations
|
| 92 |
+
|
| 93 |
+
Fixed camera angle required: The model only works on center-field broadcast footage and will not generalize to other angles
|
| 94 |
+
No 3D depth: A 2D bounding box cannot confirm a pitch actually crossed through the zone, only that it was near it
|
| 95 |
+
Strike zone variability: The zone was manually estimated per frame and varies by batter height, introducing annotator variability
|
| 96 |
+
|
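The depth limitation is easiest to see in code. A 2D check of the kind sketched below can only report that the ball's box center falls inside the strike_zone box in image space; it says nothing about whether the ball was at the plate's depth at that moment. The function name `center_in_box` and the `(x1, y1, x2, y2)` pixel box format are assumptions for illustration.

```python
def center_in_box(ball_xyxy, zone_xyxy):
    """Return True if the ball box center lies inside the zone box (2D only)."""
    bx = (ball_xyxy[0] + ball_xyxy[2]) / 2
    by = (ball_xyxy[1] + ball_xyxy[3]) / 2
    zx1, zy1, zx2, zy2 = zone_xyxy
    # Image-plane overlap only: depth along the camera axis is not observed.
    return zx1 <= bx <= zx2 and zy1 <= by <= zy2
```

A `True` result is therefore evidence that the pitch was near the zone in the frame, not proof that it passed through it.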
| 97 |
+
Inappropriate Use Cases
|
| 98 |
+
|
| 99 |
+
Real-time umpire replacement: Ball detection at mAP@50 = 0.72 is not reliable enough for official game calls
|
| 100 |
+
Non-broadcast footage: Amateur, youth league, or non-MLB footage is untested and likely to underperform
|
| 101 |
+
Non-standard camera setups: Any angle other than center-field broadcast will produce unreliable results
|