# Timeout Ground Truth

## Video: OSU vs Tenn 12.21.24.mkv

### Timeout Events (Chronological)

| Timestamp | Seconds | Team | Notes |
|-----------|---------|------|-------|
| 4:25 | 265 | HOME | First timeout of the game |
| 1:07:30 | 4050 | AWAY | |
| 1:09:40 | 4180 | AWAY | |
| 1:14:07 | 4447 | HOME | |
| 1:16:06 | 4566 | HOME | |
| 1:17:32 | - | - | **Halftime** - All timeouts hidden |
| 1:20:53 | - | - | Scorebug reappears, timeouts reset to 3 each |
| 1:44:54 | 6294 | AWAY | |
| 2:22:30 | - | - | **Game Over** - All timeouts hidden |
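
The Seconds column is the timestamp expressed as an offset from the start of the video. A minimal sketch of that conversion, assuming timestamps are written as `M:SS` or `H:MM:SS`; the helper name `timestamp_to_seconds` is hypothetical and not taken from the project:

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert 'M:SS' or 'H:MM:SS' to whole seconds from the start of the video."""
    seconds = 0
    for part in timestamp.split(":"):
        # Each additional field shifts the running total up by a factor of 60.
        seconds = seconds * 60 + int(part)
    return seconds


# Ground-truth timeouts from the table above: (timestamp, team).
GROUND_TRUTH = [
    ("4:25", "HOME"),
    ("1:07:30", "AWAY"),
    ("1:09:40", "AWAY"),
    ("1:14:07", "HOME"),
    ("1:16:06", "HOME"),
    ("1:44:54", "AWAY"),
]

for ts, team in GROUND_TRUTH:
    print(f"{ts:>8} -> {timestamp_to_seconds(ts):>5} s  {team}")
```
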

### Summary by Half

**First Half:**
- HOME: 3 timeouts used (4:25, 1:14:07, 1:16:06)
- AWAY: 2 timeouts used (1:07:30, 1:09:40)

**Second Half:**
- AWAY: 1 timeout used (1:44:54)

### Total Timeouts to Detect: 6

---

## v4 Baseline Timeout Tracker Performance

### Detected Timeouts (17 total from v4_baseline.json)

| Play # | Timestamp | Seconds | Team | Ground Truth Match |
|--------|-----------|---------|------|-------------------|
| 4 | 4:26 | 266 | HOME | ✅ Matches 4:25 |
| 10 | 9:19 | 559 | HOME | ❌ False positive |
| 15 | 12:58 | 778 | HOME | ❌ False positive |
| 21 | 17:37 | 1057 | HOME | ❌ False positive |
| 35 | 29:05 | 1745 | HOME | ❌ False positive |
| 60 | 48:21 | 2901 | HOME | ❌ False positive |
| 68 | 55:04 | 3304 | HOME | ❌ False positive |
| 79 | 1:02:24 | 3744 | HOME | ❌ False positive |
| 91 | 1:11:54 | 4314 | HOME | ❌ False positive |
| 102 | 1:23:48 | 5028 | AWAY | ❌ False positive |
| 104 | 1:25:40 | 5140 | HOME | ❌ False positive |
| 111 | 1:30:35 | 5435 | HOME | ❌ False positive |
| 131 | 1:44:48 | 6288 | AWAY | ✅ Matches 1:44:54 |
| 146 | 1:57:54 | 7074 | HOME | ❌ False positive |
| 151 | 2:01:50 | 7310 | HOME | ❌ False positive |
| 155 | 2:04:52 | 7492 | HOME | ❌ False positive |
| 175 | 2:18:51 | 8331 | AWAY | ❌ False positive |

### Ground Truth Comparison

| Ground Truth | Detected? | Notes |
|--------------|-----------|-------|
| 4:25 (HOME) | ✅ Play 4 at 4:26 | 1 second off |
| 1:07:30 (AWAY) | ❌ MISSED | No detection near this time |
| 1:09:40 (AWAY) | ❌ MISSED | No detection near this time |
| 1:14:07 (HOME) | ❌ MISSED | No detection near this time |
| 1:16:06 (HOME) | ❌ MISSED | No detection near this time |
| 1:44:54 (AWAY) | ✅ Play 131 at 1:44:48 | 6 seconds off |

### Performance Summary

- **True Positives**: 2 (detected correctly)
- **False Negatives**: 4 (missed real timeouts)
- **False Positives**: 15 (incorrectly flagged as timeout)
- **Recall**: 2/6 = **33%**
- **Precision**: 2/17 = **12%**
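
These figures can be reproduced by matching each detection to at most one ground-truth timeout of the same team within a time tolerance. The sketch below is only one way to do that: the 10-second tolerance, the greedy one-to-one matching, and the helper name `score_detections` are assumptions made for illustration and are not taken from the tracker's code.

```python
def score_detections(ground_truth, detections, tolerance_s=10):
    """Greedily match (seconds, team) detections to ground truth, one-to-one."""
    unmatched = list(ground_truth)
    true_positives = 0
    for det_time, det_team in detections:
        match = next(
            (gt for gt in unmatched
             if gt[1] == det_team and abs(gt[0] - det_time) <= tolerance_s),
            None,
        )
        if match is not None:
            unmatched.remove(match)  # each real timeout can be claimed only once
            true_positives += 1
    false_positives = len(detections) - true_positives
    false_negatives = len(unmatched)
    precision = true_positives / len(detections) if detections else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return true_positives, false_positives, false_negatives, precision, recall


# (seconds, team) pairs copied from the tables above.
GROUND_TRUTH = [(265, "HOME"), (4050, "AWAY"), (4180, "AWAY"),
                (4447, "HOME"), (4566, "HOME"), (6294, "AWAY")]
V4_DETECTIONS = [
    (266, "HOME"), (559, "HOME"), (778, "HOME"), (1057, "HOME"),
    (1745, "HOME"), (2901, "HOME"), (3304, "HOME"), (3744, "HOME"),
    (4314, "HOME"), (5028, "AWAY"), (5140, "HOME"), (5435, "HOME"),
    (6288, "AWAY"), (7074, "HOME"), (7310, "HOME"), (7492, "HOME"),
    (8331, "AWAY"),
]

tp, fp, fn, precision, recall = score_detections(GROUND_TRUTH, V4_DETECTIONS)
print(f"TP={tp} FP={fp} FN={fn} precision={precision:.0%} recall={recall:.0%}")
```
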

### Key Observations

1. **Most first-half timeouts missed**: 4 of the 5 first-half timeouts were not detected
2. **High false positive rate**: 15 false positives, 13 of them flagged as HOME
3. **Timeout indicator region may be misconfigured**: the detector appears to report timeouts on 40→25 play clock transitions that are not actual timeouts
4. **Second-half recall better**: the only second-half timeout (1:44:54) was detected

---

## Root Cause Analysis

### How Timeout Detection Works

1. **Trigger**: Timeout detection runs only when a 40→25 play clock transition is detected
2. **Classification**: When a 40→25 transition occurs, `classify_40_to_25_reset()` is called, which:
   - Compares the current timeout counts with the last known values via `check_timeout_change()`
   - If `timeout_info.home_timeouts < last_home_timeouts` → HOME timeout
   - If `timeout_info.away_timeouts < last_away_timeouts` → AWAY timeout
   - Otherwise, classifies the reset as a "special play" (punt/FG/XP)
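
The classification rule above can be sketched as a standalone function. This is a simplified illustration, not the project's `classify_40_to_25_reset()`: the `TimeoutInfo` container, the `classify_reset` name, and the return labels are hypothetical, and only the two count comparisons come from the description above.

```python
from dataclasses import dataclass


@dataclass
class TimeoutInfo:
    """Hypothetical container; only the two count fields are named in the text."""
    home_timeouts: int
    away_timeouts: int


def classify_reset(timeout_info: TimeoutInfo,
                   last_home_timeouts: int,
                   last_away_timeouts: int) -> str:
    """Classify a 40→25 play clock reset from the timeout counts read at that moment."""
    if timeout_info.home_timeouts < last_home_timeouts:
        return "timeout_home"
    if timeout_info.away_timeouts < last_away_timeouts:
        return "timeout_away"
    return "special"  # punt / FG / XP


# A single misread indicator (3 -> 2 for HOME, no real timeout) already yields a timeout.
print(classify_reset(TimeoutInfo(2, 3), last_home_timeouts=3, last_away_timeouts=3))
```

Any drop in a count is accepted here, with no check on the size of the drop or on the other team's count. In this sketch HOME is tested first, so a reading in which both counts fall is labeled HOME; whether the real code orders the checks the same way is an assumption.
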

### Why False Positives Occur

The `DetectTimeouts` class reads the timeout indicators via bright-pixel analysis in configured regions:
- **Config file**: `data/config/timeout_tracker_region.json`
- **Home region**: (1231, 972, 30, 49) - 30x49 pixel box
- **Away region**: (661, 973, 31, 47) - 31x47 pixel box

**Problems observed in isolation testing:**
- At 0:30 (start): reads [False, False, False] for both teams (0 timeouts), but should be 3 each
- At 4:30: reads [True, True, True] for both teams (3 timeouts), which is inconsistent with the 0:30 reading
- Many "resets" and spurious transitions detected

**Likely causes:**
1. Region coordinates may be slightly off or need adjustment for this video
2. Brightness threshold (200) or ratio threshold (10%) may not be optimal
3. The timeout indicator ovals may have a different appearance than expected
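
To make the two thresholds concrete, here is a rough sketch of bright-pixel analysis on one region. It is not the `DetectTimeouts` implementation. It assumes a grayscale frame, regions given as (x, y, width, height), three indicators stacked vertically in equal-height bands, and a pixel counting as bright at or above the threshold; none of these details are confirmed by the text, and `read_indicators` is a hypothetical name.

```python
import numpy as np

BRIGHTNESS_THRESHOLD = 200      # value from the text
BRIGHT_RATIO_THRESHOLD = 0.10   # value from the text (10%)

HOME_REGION = (1231, 972, 30, 49)  # assumed order: (x, y, width, height)
AWAY_REGION = (661, 973, 31, 47)


def read_indicators(gray_frame: np.ndarray, region, count: int = 3) -> list:
    """Return one bool per indicator: True if enough of its band is bright."""
    x, y, w, h = region
    roi = gray_frame[y:y + h, x:x + w]
    # Assumption: the indicators occupy equal-height horizontal bands of the region.
    bands = np.array_split(roi, count, axis=0)
    return [
        bool((band >= BRIGHTNESS_THRESHOLD).mean() >= BRIGHT_RATIO_THRESHOLD)
        for band in bands
    ]


# Synthetic 1080p frame: HOME has two lit indicators, AWAY has none.
frame = np.zeros((1080, 1920), dtype=np.uint8)
frame[972:989, 1231:1261] = 255    # top band fully bright
frame[989:1005, 1231:1236] = 255   # middle band: 5 of 30 columns bright (about 17%)

print(read_indicators(frame, HOME_REGION))  # [True, True, False]
print(read_indicators(frame, AWAY_REGION))  # [False, False, False]
```

With a rule like this, a region shifted by a few pixels or a threshold slightly too high flips every band to False at once, which would look like the all-False reading at 0:30.
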

---

## Updated Analysis (2026-01-09)

### Test Methodology Improvement

The initial v4 baseline test read the timeout indicators **immediately at the 40→25 transition**. However, the timeout indicator on the scorebug updates with a **delay of 4-6 seconds** after the clock resets.

Updated test methodology:
1. Cached all 17,555 play clock readings at 2 fps
2. Identified 55 total 40→25 transitions
3. Compared the timeout reading from 2s BEFORE each transition with readings from 2-6s AFTER it
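
The steps above can be sketched as follows. The data layout is an assumption: cached readings are taken to be `(time_s, clock_value)` pairs sampled every 0.5 s, with `None` where the clock could not be read, and a transition is taken to be a 25 whose previous readable value was 40. The function names are hypothetical.

```python
from typing import Optional


def find_40_to_25_transitions(
    readings: list[tuple[float, Optional[int]]],
) -> list[float]:
    """Return the times at which the play clock jumps from 40 to 25."""
    transitions = []
    previous = None
    for time_s, value in readings:
        if value is None:
            continue  # unreadable frame: keep the last readable value
        if previous == 40 and value == 25:
            transitions.append(time_s)
        previous = value
    return transitions


def sample_windows(transition_s: float,
                   before_s: float = 2.0,
                   after_start_s: float = 2.0,
                   after_end_s: float = 6.0):
    """Return the 'before' sample time and the (start, end) of the 'after' window."""
    before_time = transition_s - before_s
    after_window = (transition_s + after_start_s, transition_s + after_end_s)
    return before_time, after_window


# Illustrative readings at 2 fps (not taken from the real cache).
readings = [(262.0, 40), (262.5, 40), (263.0, None),
            (263.5, 25), (264.0, 25), (264.5, 24)]

for t in find_40_to_25_transitions(readings):
    print(t, sample_windows(t))
```
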

### Validation Rules Implemented

A valid timeout must satisfy:
- **Exactly one team** decreases by **exactly 1**
- **Other team stays the same**
- **Confidence threshold**: Both before and after readings must have confidence >= 0.5

These rules reject readings corrupted by scorebug visibility issues, where both teams' counts change, one count drops by more than 1, or confidence is low (counts shown as (home, away)):
- #5 at 9:19: (2,3)→(1,1) - both changed → REJECTED
- #11 at 29:04: (2,3)→(0,0) - both changed → REJECTED
- #35 at 1:30:35: (3,3)→(3,1) - away changed by 2 → REJECTED
- #45 at 1:57:55: (3,2)→(0,2) - home changed by 3 → REJECTED
- #32 at 1:23:48: (3,3)→(3,2) with low confidence (0.487) → REJECTED
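
A minimal sketch of these rules as a single function, applied to the rejected cases listed above. The function name and signature are hypothetical. The text gives a confidence value only for #32 and does not say which of its two readings it belongs to, so the sketch assumes the "after" reading and uses a placeholder of 1.0 everywhere else.

```python
from typing import Optional

MIN_CONFIDENCE = 0.5  # threshold from the text


def validate_timeout_change(before: tuple[int, int],
                            after: tuple[int, int],
                            before_conf: float,
                            after_conf: float) -> Optional[str]:
    """Return 'HOME' or 'AWAY' for a valid timeout, or None if the change is rejected.

    `before` and `after` are (home, away) timeout counts.
    """
    if before_conf < MIN_CONFIDENCE or after_conf < MIN_CONFIDENCE:
        return None
    home_drop = before[0] - after[0]
    away_drop = before[1] - after[1]
    if home_drop == 1 and away_drop == 0:
        return "HOME"
    if away_drop == 1 and home_drop == 0:
        return "AWAY"
    return None  # both changed, a count rose, or a count dropped by more than 1


cases = [
    ("#5",  (2, 3), (1, 1), 1.0, 1.0),
    ("#11", (2, 3), (0, 0), 1.0, 1.0),
    ("#35", (3, 3), (3, 1), 1.0, 1.0),
    ("#45", (3, 2), (0, 2), 1.0, 1.0),
    ("#32", (3, 3), (3, 2), 1.0, 0.487),  # assumed: the low value is the 'after' reading
]
for label, before, after, conf_before, conf_after in cases:
    print(label, validate_timeout_change(before, after, conf_before, conf_after))
```
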

---

## ✅ IMPLEMENTED FIX (2026-01-09)

### Final Results (Pipeline Implementation)

| Metric | Value |
|--------|-------|
| **True Positives** | 6 |
| **False Positives** | 0 |
| **False Negatives** | 0 |
| **Recall** | **100%** |
| **Precision** | **100%** |

### Detected Timeouts (All Correct)

| Timestamp | Seconds | Team | Ground Truth |
|-----------|---------|------|--------------|
| 4:24 | 265 | HOME | ✅ Matches 4:25 |
| 67:23 | 4044 | AWAY | ✅ Matches 67:30 |
| 69:38 | 4178 | AWAY | ✅ Matches 69:40 |
| 74:04 | 4444 | HOME | ✅ Matches 74:07 |
| 76:03 | 4563 | HOME | ✅ Matches 76:06 |
| 104:45 | 6285 | AWAY | ✅ Matches 104:54 |

### Implementation Details

#### Key Changes Made:

1. **Delayed timeout check** (`TIMEOUT_CHECK_DELAY = 5.5s`):
   - Store the timeout reading when a 40→25 transition is detected
   - Schedule a check 5.5 seconds later
   - Compare the two readings to detect a change
2. **Validation logic**:
   - Require exactly ONE team to decrease by exactly 1
   - Other team's count must stay the same
   - Both readings must have confidence >= 0.5
3. **Multiple entry points**:
   - `classify_40_to_25_reset()` - handles 40→25 in PRE_SNAP state
   - `check_possession_change()` - handles 40→25 during PLAY_IN_PROGRESS state
   - Both now schedule delayed timeout checks
4. **Merger priority fix**:
   - Changed from `normal > special > timeout` to `normal > timeout > special`
   - Timeout plays now have higher priority than special plays
5. **Quiet time filter fix**:
   - Only filter "special" plays in quiet time after normal plays
   - Timeout plays can occur immediately after normal plays end
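
The delayed check in items 1 and 2 can be sketched as a small state holder. This is an illustration under assumptions, not the code in `src/tracking`: the `TimeoutReading` and `DelayedTimeoutChecker` names are hypothetical, the caller is assumed to supply a fresh indicator reading on every frame, and only one check is pending at a time.

```python
from dataclasses import dataclass
from typing import Optional

TIMEOUT_CHECK_DELAY = 5.5  # seconds, value from the text
MIN_CONFIDENCE = 0.5       # threshold from the text


@dataclass
class TimeoutReading:
    """Hypothetical reading: timeouts remaining per team plus a confidence score."""
    home: int
    away: int
    confidence: float


class DelayedTimeoutChecker:
    def __init__(self) -> None:
        # (due time in seconds, reading taken at the 40→25 transition), or None.
        self._pending: Optional[tuple[float, TimeoutReading]] = None

    def on_clock_reset(self, time_s: float, reading: TimeoutReading) -> None:
        """Call at a 40→25 transition: store the reading and schedule the check."""
        self._pending = (time_s + TIMEOUT_CHECK_DELAY, reading)

    def on_frame(self, time_s: float, reading: TimeoutReading) -> Optional[str]:
        """Call on every frame; returns 'HOME' or 'AWAY' when a due check confirms a timeout."""
        if self._pending is None or time_s < self._pending[0]:
            return None
        _, before = self._pending
        self._pending = None
        if min(before.confidence, reading.confidence) < MIN_CONFIDENCE:
            return None
        drops = (before.home - reading.home, before.away - reading.away)
        if drops == (1, 0):
            return "HOME"
        if drops == (0, 1):
            return "AWAY"
        return None


checker = DelayedTimeoutChecker()
checker.on_clock_reset(264.5, TimeoutReading(3, 3, 0.9))
print(checker.on_frame(266.0, TimeoutReading(3, 3, 0.9)))  # None: check not due yet
print(checker.on_frame(270.0, TimeoutReading(2, 3, 0.9)))  # HOME
```

A second transition arriving within 5.5 seconds would overwrite the pending check in this sketch; a queue of pending checks would avoid that.
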

#### Files Modified:

- `src/tracking/play_identification_checks.py`
- `src/tracking/state_handlers.py`
- `src/tracking/play_state.py`
- `src/tracking/models.py`
- `src/tracking/play_merger.py`
- `src/tracking/clock_reset_identifier.py`

### Key Learnings

1. **Scorebug timeout indicator delay**: The indicator updates 4-6 seconds AFTER the play clock resets from 40→25, not immediately
2. **Confidence thresholds matter**: Low-confidence readings can cause false positives
3. **Validation rules essential**: Must verify exactly one team changes by exactly 1
4. **Multiple code paths**: 40→25 transitions can occur in both PRE_SNAP and PLAY_IN_PROGRESS states