# Timeout Ground Truth
## Video: OSU vs Tenn 12.21.24.mkv
### Timeout Events (Chronological)
| Timestamp | Seconds | Team | Notes |
|-----------|---------|------|-------|
| 4:25 | 265 | HOME | First timeout of the game |
| 1:07:30 | 4050 | AWAY | |
| 1:09:40 | 4180 | AWAY | |
| 1:14:07 | 4447 | HOME | |
| 1:16:06 | 4566 | HOME | |
| 1:17:32 | - | - | **Halftime** - All timeouts hidden |
| 1:20:53 | - | - | Scorebug reappears, timeouts reset to 3 each |
| 1:44:54 | 6294 | AWAY | |
| 2:22:30 | - | - | **Game Over** - All timeouts hidden |
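
The Seconds column follows from the Timestamp column by base-60 accumulation. A minimal sketch of that conversion, using a hypothetical helper name `timestamp_to_seconds` (not taken from the project code), checked against the rows above:

```python
def timestamp_to_seconds(timestamp: str) -> int:
    """Convert 'M:SS' or 'H:MM:SS' into whole seconds."""
    seconds = 0
    for part in timestamp.split(":"):
        # Each new field is worth 1/60 of the previous one.
        seconds = seconds * 60 + int(part)
    return seconds


# (timestamp, seconds, team) rows copied from the ground-truth table.
GROUND_TRUTH_ROWS = [
    ("4:25", 265, "HOME"),
    ("1:07:30", 4050, "AWAY"),
    ("1:09:40", 4180, "AWAY"),
    ("1:14:07", 4447, "HOME"),
    ("1:16:06", 4566, "HOME"),
    ("1:44:54", 6294, "AWAY"),
]

for timestamp, expected_seconds, _team in GROUND_TRUTH_ROWS:
    assert timestamp_to_seconds(timestamp) == expected_seconds
```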
### Summary by Half
**First Half:**
- HOME: 3 timeouts used (4:25, 1:14:07, 1:16:06)
- AWAY: 2 timeouts used (1:07:30, 1:09:40)
**Second Half:**
- AWAY: 1 timeout used (1:44:54)
### Total Timeouts to Detect: 6
---
## v4 Baseline Timeout Tracker Performance
### Detected Timeouts (17 total from v4_baseline.json)
| Play # | Timestamp | Seconds | Team | Ground Truth Match |
|--------|-----------|---------|------|-------------------|
| 4 | 4:26 | 266 | HOME | ✓ Matches 4:25 |
| 10 | 9:19 | 559 | HOME | ✗ False positive |
| 15 | 12:58 | 778 | HOME | ✗ False positive |
| 21 | 17:37 | 1057 | HOME | ✗ False positive |
| 35 | 29:05 | 1745 | HOME | ✗ False positive |
| 60 | 48:21 | 2901 | HOME | ✗ False positive |
| 68 | 55:04 | 3304 | HOME | ✗ False positive |
| 79 | 1:02:24 | 3744 | HOME | ✗ False positive |
| 91 | 1:11:54 | 4314 | HOME | ✗ False positive |
| 102 | 1:23:48 | 5028 | AWAY | ✗ False positive |
| 104 | 1:25:40 | 5140 | HOME | ✗ False positive |
| 111 | 1:30:35 | 5435 | HOME | ✗ False positive |
| 131 | 1:44:48 | 6288 | AWAY | ✓ Matches 1:44:54 |
| 146 | 1:57:54 | 7074 | HOME | ✗ False positive |
| 151 | 2:01:50 | 7310 | HOME | ✗ False positive |
| 155 | 2:04:52 | 7492 | HOME | ✗ False positive |
| 175 | 2:18:51 | 8331 | AWAY | ✗ False positive |
### Ground Truth Comparison
| Ground Truth | Detected? | Notes |
|--------------|-----------|-------|
| 4:25 (HOME) | ✓ Play 4 at 4:26 | 1 second off |
| 1:07:30 (AWAY) | ✗ MISSED | No detection near this time |
| 1:09:40 (AWAY) | ✗ MISSED | No detection near this time |
| 1:14:07 (HOME) | ✗ MISSED | No detection near this time |
| 1:16:06 (HOME) | ✗ MISSED | No detection near this time |
| 1:44:54 (AWAY) | ✓ Play 131 at 1:44:48 | 6 seconds off |
### Performance Summary
- **True Positives**: 2 (detected correctly)
- **False Negatives**: 4 (missed real timeouts)
- **False Positives**: 15 (incorrectly flagged as timeout)
- **Recall**: 2/6 = **33%**
- **Precision**: 2/17 = **12%**
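
These figures can be reproduced by matching each detection to a ground-truth timeout for the same team within a time tolerance. The sketch below is an illustration only: the `score` function and the 10-second tolerance are assumptions, not the project's evaluation code. The timestamps are copied from the tables above.

```python
# (seconds, team) pairs copied from the ground-truth and v4 baseline tables.
GROUND_TRUTH = [
    (265, "HOME"), (4050, "AWAY"), (4180, "AWAY"),
    (4447, "HOME"), (4566, "HOME"), (6294, "AWAY"),
]
DETECTED = [
    (266, "HOME"), (559, "HOME"), (778, "HOME"), (1057, "HOME"),
    (1745, "HOME"), (2901, "HOME"), (3304, "HOME"), (3744, "HOME"),
    (4314, "HOME"), (5028, "AWAY"), (5140, "HOME"), (5435, "HOME"),
    (6288, "AWAY"), (7074, "HOME"), (7310, "HOME"), (7492, "HOME"),
    (8331, "AWAY"),
]


def score(ground_truth, detected, tolerance_s=10):
    """Return (tp, fp, fn, precision, recall).

    A detection is a true positive if an unmatched ground-truth event for
    the same team lies within tolerance_s seconds (assumed tolerance).
    """
    unmatched = list(ground_truth)
    tp = 0
    for seconds, team in detected:
        match = next(
            (g for g in unmatched
             if g[1] == team and abs(g[0] - seconds) <= tolerance_s),
            None,
        )
        if match is not None:
            unmatched.remove(match)  # each real timeout can match only once
            tp += 1
    fp = len(detected) - tp
    fn = len(unmatched)
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(ground_truth) if ground_truth else 0.0
    return tp, fp, fn, precision, recall


tp, fp, fn, precision, recall = score(GROUND_TRUTH, DETECTED)
print(f"TP={tp} FP={fp} FN={fn} precision={precision:.0%} recall={recall:.0%}")
```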
### Key Observations
1. **Most first-half timeouts missed**: 4 out of 5 first-half timeouts not detected
2. **High false positive rate**: 15 false positives, mostly flagged as HOME
3. **Timeout indicator region may be misconfigured**: The detector appears to trigger on 40→25 play clock transitions that aren't actual timeouts
4. **Second half better**: The one second-half timeout (1:44:54) was detected
---
## Root Cause Analysis
### How Timeout Detection Works
1. **Trigger**: Timeout detection is only triggered when a 40→25 play clock transition is detected
2. **Classification**: When 40→25 occurs, `classify_40_to_25_reset()` is called which:
   - Compares current timeout counts with last known values via `check_timeout_change()`
   - If `timeout_info.home_timeouts < last_home_timeouts` → HOME timeout
   - If `timeout_info.away_timeouts < last_away_timeouts` → AWAY timeout
   - Otherwise, classified as "special play" (punt/FG/XP)
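
The classification rule above can be sketched as follows. This is not the body of `classify_40_to_25_reset()`: the `TimeoutCounts` type, the function name `classify_reset_sketch`, and the return labels are hypothetical, and only the two comparisons come from the description.

```python
from dataclasses import dataclass


@dataclass
class TimeoutCounts:
    """Timeouts remaining per team, as read from the scorebug (hypothetical type)."""
    home_timeouts: int
    away_timeouts: int


def classify_reset_sketch(current: TimeoutCounts, last: TimeoutCounts) -> str:
    """Classify a 40->25 play clock reset using only the count comparison."""
    if current.home_timeouts < last.home_timeouts:
        return "timeout_home"
    if current.away_timeouts < last.away_timeouts:
        return "timeout_away"
    # No decrease on either side: treated as a special play (punt/FG/XP).
    return "special"


print(classify_reset_sketch(TimeoutCounts(2, 3), TimeoutCounts(3, 3)))  # home used one
print(classify_reset_sketch(TimeoutCounts(3, 3), TimeoutCounts(3, 3)))  # no change
```

In this sketch any decrease counts, so a noisy reading is classified as a timeout with no further checks. The HOME comparison also runs first here, so a reading in which both counts drop is labelled HOME; whether the real function orders its checks this way is an assumption.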
### Why False Positives Occur
The `DetectTimeouts` class reads timeout indicators via bright pixel analysis in configured regions:
- **Config file**: `data/config/timeout_tracker_region.json`
- **Home region**: (1231, 972, 30, 49) - 30x49 pixel box
- **Away region**: (661, 973, 31, 47) - 31x47 pixel box
**Problems observed in isolation testing:**
- At 0:30 (start): Reads [False, False, False] for both teams (0 timeouts), but should be 3 each
- At 4:30: Reads [True, True, True] for both teams (3 timeouts), which is inconsistent with the 0:30 reading
- Many "resets" and spurious transitions detected
**Likely causes:**
1. Region coordinates may be slightly off or need adjustment for this video
2. Brightness threshold (200) or ratio threshold (10%) may not be optimal
3. The timeout indicator ovals may have a different appearance than expected
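
To make the two thresholds concrete, here is a minimal pure-Python sketch of a bright-pixel test on one region. It is not the `DetectTimeouts` implementation: the function names, the `(x, y, width, height)` reading of the region tuple, and the use of `>=` for both comparisons are assumptions, and how a region is split into three indicators is not modelled.

```python
BRIGHTNESS_THRESHOLD = 200   # pixel value counted as "bright" (from the notes above)
RATIO_THRESHOLD = 0.10       # fraction of bright pixels needed (from the notes above)


def bright_ratio(gray, region):
    """Fraction of bright pixels in region of a grayscale frame.

    gray is a list of rows of 0-255 values; region is assumed to be
    (x, y, width, height) in pixels.
    """
    x, y, width, height = region
    pixels = [
        gray[row][col]
        for row in range(y, y + height)
        for col in range(x, x + width)
    ]
    bright = sum(1 for value in pixels if value >= BRIGHTNESS_THRESHOLD)
    return bright / len(pixels)


def indicator_lit(gray, region):
    """True if enough of the region is bright to count as a lit indicator."""
    return bright_ratio(gray, region) >= RATIO_THRESHOLD


# Synthetic 20x20 frame: dark background with 15 bright pixels inside the region.
frame = [[50] * 20 for _ in range(20)]
region = (5, 5, 10, 10)
for col in range(5, 15):
    frame[5][col] = 255          # 10 bright pixels on the first region row
for col in range(5, 10):
    frame[6][col] = 255          # 5 more on the second row

print(bright_ratio(frame, region), indicator_lit(frame, region))
```

Printing `bright_ratio` for the configured regions on a few known frames is one way to check whether the coordinates or the thresholds are the part that is off.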
---
## Updated Analysis (2026-01-09)
### Test Methodology Improvement
The initial v4 baseline test read the timeout indicators **immediately at the 40→25 transition**. However, the timeout indicator on the scorebug updates **4-6 seconds** after the play clock resets.
Updated test methodology:
1. Cached all 17,555 play clock readings at 2 fps
2. Identified 55 total 40→25 transitions
3. Compared timeout readings from 2s BEFORE to 2-6s AFTER each transition
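
The steps above can be sketched as below. The function names and the `(time_s, value)` reading format are hypothetical, and treating a transition as a reading of 40 followed directly by the next valid reading of 25 is an assumption about how the cache was scanned.

```python
def find_40_to_25_transitions(readings):
    """Return times at which the play clock jumps from 40 to 25.

    readings is a list of (time_s, value) pairs in time order, where value
    is the play clock reading or None when it could not be read.
    """
    transitions = []
    previous = None
    for time_s, value in readings:
        if value is None:
            continue  # unreadable frame: keep the last valid reading
        if previous == 40 and value == 25:
            transitions.append(time_s)
        previous = value
    return transitions


def sample_times(transition_s, fps=2.0):
    """Times to read the timeout indicators around one transition.

    Returns (before, after_list): one sample 2 s before the transition and
    every cached frame from 2 s to 6 s after it.
    """
    step = 1.0 / fps
    before = transition_s - 2.0
    after = [transition_s + 2.0 + i * step for i in range(int(4.0 * fps) + 1)]
    return before, after


readings = [(0.0, 40), (0.5, 40), (1.0, 25), (1.5, 24), (2.0, None), (2.5, 40), (3.0, 39)]
for t in find_40_to_25_transitions(readings):
    print(t, sample_times(t))
```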
### Validation Rules Implemented
A valid timeout must satisfy:
- **Exactly one team** decreases by **exactly 1**
- **Other team stays the same**
- **Confidence threshold**: Both before and after readings must have confidence >= 0.5
This filters out spurious readings, including scorebug visibility issues where both teams' readings change:
- #5 at 9:19: (2,3)→(1,1) - both changed → REJECTED
- #11 at 29:04: (2,3)→(0,0) - both changed → REJECTED
- #35 at 1:30:35: (3,3)→(3,1) - away changed by 2 → REJECTED
- #45 at 1:57:55: (3,2)→(0,2) - home changed by 3 → REJECTED
- #32 at 1:23:48: (3,3)→(3,2) with low confidence (0.487) → REJECTED
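
A minimal sketch of these rules, applied to the rejected examples above. The function name `validate_timeout` and its signature are hypothetical; counts are `(home, away)` tuples, as in the examples.

```python
MIN_CONFIDENCE = 0.5  # threshold from the validation rules


def validate_timeout(before, after, conf_before, conf_after):
    """Return 'HOME' or 'AWAY' for a valid timeout, otherwise None.

    before and after are (home, away) timeout counts read around a
    40->25 transition; conf_* are the confidences of those readings.
    """
    if conf_before < MIN_CONFIDENCE or conf_after < MIN_CONFIDENCE:
        return None
    home_drop = before[0] - after[0]
    away_drop = before[1] - after[1]
    if home_drop == 1 and away_drop == 0:
        return "HOME"
    if away_drop == 1 and home_drop == 0:
        return "AWAY"
    return None  # both changed, changed by more than 1, or no change


# Rejected examples from the list above (confidence 0.9 is a placeholder
# where the notes do not give one).
print(validate_timeout((2, 3), (1, 1), 0.9, 0.9))    # both changed
print(validate_timeout((3, 3), (3, 1), 0.9, 0.9))    # away changed by 2
# Which of the two readings had confidence 0.487 is not stated; the
# "after" reading is assumed here.
print(validate_timeout((3, 3), (3, 2), 0.9, 0.487))  # low confidence
```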
---
## ✅ IMPLEMENTED FIX (2026-01-09)
### Final Results (Pipeline Implementation)
| Metric | Value |
|--------|-------|
| **True Positives** | 6 |
| **False Positives** | 0 |
| **False Negatives** | 0 |
| **Recall** | **100%** |
| **Precision** | **100%** |
### Detected Timeouts (All Correct)
| Timestamp | Seconds | Team | Ground Truth |
|-----------|---------|------|--------------|
| 4:24 | 265 | HOME | ✓ Matches 4:25 |
| 1:07:23 | 4044 | AWAY | ✓ Matches 1:07:30 |
| 1:09:38 | 4178 | AWAY | ✓ Matches 1:09:40 |
| 1:14:04 | 4444 | HOME | ✓ Matches 1:14:07 |
| 1:16:03 | 4563 | HOME | ✓ Matches 1:16:06 |
| 1:44:45 | 6285 | AWAY | ✓ Matches 1:44:54 |
### Implementation Details
#### Key Changes Made:
1. **Delayed timeout check** (`TIMEOUT_CHECK_DELAY = 5.5s`):
   - Store timeout reading when 40→25 transition detected
   - Schedule check 5.5 seconds later
   - Compare readings to detect change
2. **Validation logic**:
   - Require exactly ONE team to decrease by exactly 1
   - Other team's count must stay the same
   - Both readings must have confidence >= 0.5
3. **Multiple entry points**:
   - `classify_40_to_25_reset()` - handles 40→25 in PRE_SNAP state
   - `check_possession_change()` - handles 40→25 during PLAY_IN_PROGRESS state
   - Both now schedule delayed timeout checks
4. **Merger priority fix**:
   - Changed from `normal > special > timeout` to `normal > timeout > special`
   - Timeout plays now have higher priority than special plays
5. **Quiet time filter fix**:
   - Only filter "special" plays in quiet time after normal plays
   - Timeout plays can occur immediately after normal plays end
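
The delayed check in item 1 can be sketched as a small scheduler that stores the reading at the transition and evaluates it once the delay has elapsed. Only `TIMEOUT_CHECK_DELAY` comes from the notes above; the class and method names are hypothetical, and the confidence check from item 2 is left out to keep the sketch short.

```python
from dataclasses import dataclass, field

TIMEOUT_CHECK_DELAY = 5.5  # seconds, from the implementation notes


@dataclass
class PendingCheck:
    due_at: float         # video time at which to re-read the indicators
    transition_at: float  # video time of the 40->25 transition
    before: tuple         # (home, away) counts stored at the transition


@dataclass
class DelayedTimeoutChecker:
    pending: list = field(default_factory=list)

    def on_transition(self, now, counts):
        """Store the current reading and schedule a check for later."""
        self.pending.append(PendingCheck(now + TIMEOUT_CHECK_DELAY, now, counts))

    def on_frame(self, now, counts):
        """Evaluate checks that are due; return [(transition_time, team)]."""
        results = []
        still_pending = []
        for check in self.pending:
            if now < check.due_at:
                still_pending.append(check)
                continue
            drops = (check.before[0] - counts[0], check.before[1] - counts[1])
            if drops == (1, 0):
                results.append((check.transition_at, "HOME"))
            elif drops == (0, 1):
                results.append((check.transition_at, "AWAY"))
            # Any other pattern is discarded as not a timeout.
        self.pending = still_pending
        return results


checker = DelayedTimeoutChecker()
checker.on_transition(100.0, (3, 3))
print(checker.on_frame(103.0, (3, 3)))  # too early, nothing evaluated
print(checker.on_frame(105.5, (2, 3)))  # due: home count dropped by one
```

Reporting the timeout at the transition time rather than at the time of the delayed check keeps detected timestamps close to the play clock reset.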
#### Files Modified:
- `src/tracking/play_identification_checks.py`
- `src/tracking/state_handlers.py`
- `src/tracking/play_state.py`
- `src/tracking/models.py`
- `src/tracking/play_merger.py`
- `src/tracking/clock_reset_identifier.py`
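
The merger priority change listed under Key Changes Made can be illustrated with a small sketch. How the real merger groups competing plays is not described here, so the dictionary-based play records and the `pick_play` helper are assumptions; only the two priority orders come from the notes.

```python
# Lower number means higher priority.
OLD_PRIORITY = {"normal": 0, "special": 1, "timeout": 2}  # normal > special > timeout
NEW_PRIORITY = {"normal": 0, "timeout": 1, "special": 2}  # normal > timeout > special


def pick_play(candidates, priority):
    """Pick the winning play among candidates that compete for one slot."""
    return min(candidates, key=lambda play: priority[play["type"]])


# The same 40->25 reset reported once as a special play and once as a timeout.
candidates = [
    {"type": "special", "start": 4044.0},
    {"type": "timeout", "start": 4044.0},
]
print(pick_play(candidates, OLD_PRIORITY)["type"])  # special wins under the old order
print(pick_play(candidates, NEW_PRIORITY)["type"])  # timeout wins under the new order
```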
### Key Learnings
1. **Scorebug timeout indicator delay**: The indicator updates 4-6 seconds AFTER the play clock resets from 40→25, not immediately
2. **Confidence thresholds matter**: Low-confidence readings can cause false positives
3. **Validation rules essential**: Must verify exactly one team changes by exactly 1
4. **Multiple code paths**: 40→25 transitions can occur in both PRE_SNAP and PLAY_IN_PROGRESS states