Taiko404 committed on
Commit
0385396
·
verified ·
1 Parent(s): bca13bc

Update README.md

  1. README.md +80 -0
README.md CHANGED
@@ -19,3 +19,83 @@ Broadcast overlay systems displaying a real-time model-driven strike zone visual
19
  Umpire analytics platforms comparing model calls vs. official calls per game
20
  Research and development for automated ball-strike officiating systems
21
 
22
+ 2. Training Data
23
+ Dataset Source
24
+ Base dataset: ROBO ump (Roboflow Universe), accessed 2025
25
+ https://universe.roboflow.com/toasty-workspace/roboump
26
+
27
+ Platform: Roboflow Universe (roboflow.com/universe)
28
+ Collection: Broadcast MLB footage (center-field fixed-angle camera)
29
+ Resolution: 640 × 640 px (resized)
30
+
31
+ Class Distribution
32
+
33
+ ball: ~551 images — 70 / 20 / 10 split
34
+ batter: ~774 images — 70 / 20 / 10 split
35
+ pitcher: ~774 images — 70 / 20 / 10 split
36
+ strike_zone: ~300 images — 70 / 20 / 10 split
37
+
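To make the split sizes concrete, the sketch below turns the approximate per-class totals into image counts. It assumes the 70 / 20 / 10 figures are train / validation / test in that order, which the README does not state explicitly; `split_counts` is a hypothetical helper, and the real Roboflow split may differ by a few images.

```python
def split_counts(total, fractions=(0.70, 0.20, 0.10)):
    """Return (train, val, test) counts for an approximate class total.

    Assumption: fractions are ordered train / val / test.
    The test share absorbs any rounding remainder.
    """
    train = round(total * fractions[0])
    val = round(total * fractions[1])
    test = total - train - val
    return train, val, test


# Approximate totals taken from the Class Distribution list above.
CLASS_TOTALS = {"ball": 551, "batter": 774, "pitcher": 774, "strike_zone": 300}

for name, total in CLASS_TOTALS.items():
    print(name, split_counts(total))
```

Under these assumptions strike_zone ends up with about 60 validation images, the smallest evaluation pool of the four classes.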
38
+ Annotation Process
39
+ The base dataset provided semi-automated annotations for ball, batter, and pitcher. Strike zone annotations were not included and were added entirely through manual labeling: approximately 300 strike zones were annotated in Roboflow, each defined as the rectangular region from the batter's knees to the midpoint of the torso.
40
+ A quality review of a 10% sample found false positives in the ball class (crowd balls, logos, advertising). Around 5% of ball annotations were removed. Roughly 3 hours in total went into manual correction: tightening boxes and adding missed detections across all classes.
41
+ Data Augmentation
42
+
43
+ Horizontal flip (50% probability)
44
+ Color augmentation and saturation jitter
45
+
46
+ 3. Training Procedure
47
+
48
+ Framework: Ultralytics
49
+ Hardware: GPU
50
+ Batch size: Default (auto-tuned)
51
+ Epochs: 50
52
+ Image size: 640
53
+ Early stopping: Not applied
54
+ Preprocessing: Auto-resize to 640 × 640, normalization
55
+
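The settings listed above could be passed to Ultralytics roughly as follows. This is a minimal sketch, not the author's actual training script: the weights file and dataset YAML path are placeholders (the README does not name the model variant), `build_train_args` and `train` are hypothetical helpers, and batch size is left at the library default because the README only says "Default (auto-tuned)".

```python
def build_train_args(data_yaml):
    """Collect the training settings stated in the README as keyword arguments."""
    return {
        "data": data_yaml,   # path to the dataset YAML (placeholder)
        "epochs": 50,        # from the README
        "imgsz": 640,        # from the README
        "fliplr": 0.5,       # horizontal flip probability, from the README
        # Assumption: patience equal to the epoch count means early stopping
        # cannot trigger before training ends ("Early stopping: Not applied").
        "patience": 50,
    }


def train(weights, data_yaml):
    """Run training; ultralytics is imported lazily so the helper above
    can be inspected without the package installed."""
    from ultralytics import YOLO

    model = YOLO(weights)  # weights file is an assumption, e.g. a pretrained checkpoint
    return model.train(**build_train_args(data_yaml))
```

Color and saturation jitter are left at the library defaults here, since the README does not give their magnitudes.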
56
+ 4. Evaluation Results
57
+ Overall Metrics
58
+
59
+ mAP@50: 0.92 (target > 0.85)
60
+ Overall F1: 0.91 (target >= 0.80)
61
+
62
+ Per-Class Breakdown
63
+
64
+ ball: mAP@50 = 0.72, F1 = 0.76
65
+ batter: mAP@50 = 0.97, F1 = 0.95
66
+ pitcher: mAP@50 = 0.97, F1 = 0.84
67
+ strike_zone: mAP@50 = 0.82, F1 = 0.88
68
+
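As a quick sanity check, the unweighted mean of the four per-class mAP@50 values can be computed as below. It comes to about 0.87 rather than the reported overall 0.92, so the overall figure may come from a different aggregation or evaluation run; that is an assumption, not something the README states.

```python
from statistics import mean

# Per-class mAP@50 values copied from the breakdown above.
PER_CLASS_MAP50 = {
    "ball": 0.72,
    "batter": 0.97,
    "pitcher": 0.97,
    "strike_zone": 0.82,
}

macro_map50 = mean(PER_CLASS_MAP50.values())
print(f"unweighted mean mAP@50 = {macro_map50:.2f}")
```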
69
+ Confusion Matrix Summary
70
+
71
+ Ball: 69 correctly predicted, 33 missed as background (~32% miss rate)
72
+ Batter: 145/145 correct
73
+ Pitcher: 145/145 correct
74
+ Strike_zone: 44/47 correct, 3 missed as background
75
+ Background false positives: 9 labeled as ball, 4 as strike_zone
76
+
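The confusion-matrix counts can be turned into precision, recall, and F1 with a few lines. The sketch below does this for the ball class, assuming that the only ball errors are the 33 misses and the 9 background false positives listed above (that is, no confusion with other classes); `detection_metrics` is a hypothetical helper.

```python
def detection_metrics(tp, fn, fp):
    """Return (precision, recall, f1) from true-positive, false-negative
    and false-positive counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Ball class: 69 correct, 33 missed as background, 9 background false positives.
p, r, f1 = detection_metrics(tp=69, fn=33, fp=9)
print(f"ball: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
# ball: precision=0.885 recall=0.676 F1=0.767
```

The resulting F1 of about 0.77 is close to the reported ball F1 of 0.76, and the recall makes clear that most ball errors are misses rather than false alarms.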
77
+ Performance Analysis
78
+ The overall mAP@50 of 0.92 exceeds the 0.85 target, and the overall F1 of 0.91 clears the 0.80 threshold. Batter and pitcher detection is near-perfect, reflecting their large, consistent silhouettes. Training and validation loss curves show steady convergence with no signs of overfitting.
79
+ The ball class is the most critical failure point. Its mAP@50 of 0.72 is the lowest of the four classes, and the confusion matrix shows 33 of 102 ball instances (roughly 1 in 3) missed as background. Since ball is the most important class for pitch classification, this gap matters more than the strong overall score suggests. The strike_zone class achieved F1 = 0.88 despite having the fewest training instances, though its mAP@50 of 0.82 falls just short of the 0.85 target.
80
+
81
+ 5. Limitations and Biases
82
+ Known Failure Cases
83
+
84
+ Ball near white uniforms: Low contrast causes false negatives when the ball overlaps light-colored jerseys
85
+ Strike zone partially off-frame: Bounding box gets truncated when the batter stands near the frame edge
86
+ Occlusion: When the catcher, batter, or pitcher obscures the ball at the plate, recall drops significantly
87
+ Low-light or compressed frames: JPEG artifacts confuse the detector, especially for the small ball class
88
+
89
+ Poor-Performing Class: Ball
90
+ The ball class is the weakest performer (mAP@50 = 0.72) because the ball is typically only 10-20 pixels across at broadcast resolution and moves at 80-100 mph, causing motion blur in standard 30 fps footage. Crowd balls and logos in the background also created false positives that required manual cleanup.
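To give a sense of scale for the motion-blur claim, the sketch below converts pitch speed into distance travelled between consecutive frames. It is only an upper bound on blur: the actual smear depends on the camera's shutter time, and from a center-field view much of the motion is along the camera axis, so the on-screen displacement is smaller. `feet_per_frame` is a hypothetical helper.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def feet_per_frame(speed_mph, fps=30.0):
    """Distance in feet the ball travels during one frame interval."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second / fps


for mph in (80, 90, 100):
    print(f"{mph} mph -> {feet_per_frame(mph):.2f} ft per frame at 30 fps")
# 80 mph -> 3.91 ft per frame at 30 fps
# 90 mph -> 4.40 ft per frame at 30 fps
# 100 mph -> 4.89 ft per frame at 30 fps
```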
91
+ Contextual Limitations
92
+
93
+ Fixed camera angle required: The model only works on center-field broadcast footage and will not generalize to other angles
94
+ No 3D depth: A 2D bounding box cannot confirm a pitch actually crossed through the zone, only that it was near it
95
+ Strike zone variability: The zone was manually estimated per frame and varies by batter height, introducing annotator variability
96
+
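The "No 3D depth" limitation can be illustrated with the kind of check a 2D pipeline is restricted to. The sketch below tests whether the center of a ball box falls inside a strike_zone box in image coordinates; it is an assumed post-processing step, not part of the model, and the box values are made up for illustration. A `True` result only says the ball appeared in front of the zone in the image, not that it crossed the plate at that depth.

```python
def box_center(box):
    """Center (x, y) of a box given as (x1, y1, x2, y2) in pixels."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2, (y1 + y2) / 2


def ball_center_in_zone_2d(ball_box, zone_box):
    """True if the ball box center lies inside the zone box in the image plane."""
    cx, cy = box_center(ball_box)
    zx1, zy1, zx2, zy2 = zone_box
    return zx1 <= cx <= zx2 and zy1 <= cy <= zy2


# Hypothetical pixel coordinates on a 640 x 640 frame.
zone = (300, 280, 360, 360)
print(ball_center_in_zone_2d((322, 310, 336, 324), zone))  # True
print(ball_center_in_zone_2d((400, 310, 414, 324), zone))  # False
```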
97
+ Inappropriate Use Cases
98
+
99
+ Real-time umpire replacement: Ball detection at 0.72 mAP@50 is not reliable enough for official game calls
100
+ Non-broadcast footage: Amateur, youth league, or non-MLB footage is untested and likely to underperform
101
+ Non-standard camera setups: Any angle other than center-field broadcast will produce unreliable results