# Detection Analysis Report

**Date:** January 7, 2026  
**Method:** Fixed coordinates + template matching  
**Video:** OSU vs Tenn 12.21.24.mkv  
**Processing Time:** 3.4 minutes (~2.8x faster than v3 baseline's 9.6 minutes)

---

## Summary

| Metric | Value |
|--------|-------|
| Total Detected (raw) | 182 |
| After 1.0s filter | 181 |
| V3 Baseline | 176 |
| True Positives | 176 |
| False Positives | 5 |
| False Negatives | 0 |
| **Recall** | **100.0%** |
| Precision | 97.2% |
| F1 Score | 98.6% |
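
The precision, recall, and F1 figures follow from the TP/FP/FN counts by the standard definitions. A minimal sketch of that arithmetic (the function name `detection_metrics` is illustrative, not taken from the codebase):

```python
def detection_metrics(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, f1) as fractions in [0, 1]."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the summary table: 176 TP, 5 FP, 0 FN
precision, recall, f1 = detection_metrics(tp=176, fp=5, fn=0)
print(f"precision={precision:.1%} recall={recall:.1%} f1={f1:.1%}")
```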

### Key Achievements

1. **100% Recall** - All 176 baseline plays correctly detected
2. **Better than baseline** - 3 "false positives" are actually legitimate plays the v3 baseline missed:
   - Opening kickoff (2:01)
   - Second half kickoff (10:14)  
   - False start penalty (52:34)
3. **~2.8x faster** - 3.4 minutes vs 9.6 minutes for v3 baseline
4. **XP/FG filter working** - Reduced FPs from 10 to 6 by requiring 1.0s minimum time at clock=40

---

## Duration Filter Threshold Analysis

The v3 baseline's shortest play is **3.9s** (a timeout). However, using 3.0s as the threshold causes false negatives because our special play detections have shorter durations (we only capture the 40→25 transition, not the full play duration).

| Threshold | Plays | FP | FN | Recall | Precision | F1 |
|-----------|-------|----|----|--------|-----------|-----|
| **1.0s** | **181** | **5** | **0** | **100.0%** | **97.2%** | **98.6%** |
| 1.5s | 178 | 4 | 2 | 98.9% | 97.8% | 98.3% |
| 2.0s | 176 | 3 | 3 | 98.3% | 98.3% | 98.3% |
| 3.0s | 175 | 2 | 3 | 98.3% | 98.9% | 98.6% |

**Recommendation:** Use the **1.0s threshold**: it is the only setting tested that keeps recall at 100% while still filtering sub-second clock noise.
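
The FP column can be cross-checked against the durations of the six unmatched detections reported in this document (five in the false-positives table plus the 0.9s one that the filter removes). A small sketch, assuming a detection is kept when its duration is at least the threshold:

```python
# Durations (seconds) of the six detections with no baseline match,
# as listed in the false-positives section of this report.
unmatched_durations = [6.3, 1.9, 15.0, 2.9, 1.4, 0.9]


def surviving_false_positives(durations: list[float], threshold: float) -> int:
    """Count unmatched detections that pass a minimum-duration filter."""
    return sum(1 for d in durations if d >= threshold)


fp_by_threshold = {
    t: surviving_false_positives(unmatched_durations, t)
    for t in (1.0, 1.5, 2.0, 3.0)
}
print(fp_by_threshold)
```

The counts reproduce the FP column (5, 4, 3, 2). The FN column cannot be rebuilt this way, because the durations of the matched plays are not listed here.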

---

## "False Positives" Analysis (5 after 1.0s filter)

These plays were detected but don't match any baseline play within 5 seconds.

| # | Timestamp | Duration | Verdict | Notes |
|---|-----------|----------|---------|-------|
| 1 | 2:01.92 (121.9s) | 6.3s | ✅ **VALID** | Opening kickoff - should be tracked |
| 2 | 2:31.43 (151.4s) | 1.9s | ⚠️ Optional | Weird clock behavior |
| 3 | 10:14.93 (614.9s) | 15.0s | ✅ **VALID** | Second half kickoff - should be tracked |
| 4 | 52:34.00 (3154.0s) | 2.9s | ✅ **VALID** | False start penalty - should be tracked |
| 5 | 140:14.54 (8414.5s) | 1.4s | ⚠️ Optional | Weird clock behavior |

**Filtered by 1.0s threshold:**
- 69:12.60 (4152.6s) - 0.9s duration - Weird clock behavior ✅ Correctly filtered
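
The 5-second matching rule used to label detections can be sketched as below. This is an illustration under assumptions, not the evaluation script itself: it assumes one-to-one, nearest-first matching on start timestamps, and the baseline timestamps in the example are hypothetical.

```python
def match_to_baseline(detected, baseline, tolerance=5.0):
    """Split detections into TP/FP and report unmatched baseline plays (FN).

    Each baseline play can be claimed by at most one detection.
    """
    unmatched = sorted(baseline)
    true_pos, false_pos = [], []
    for t in sorted(detected):
        candidates = [b for b in unmatched if abs(b - t) <= tolerance]
        if candidates:
            nearest = min(candidates, key=lambda b: abs(b - t))
            unmatched.remove(nearest)
            true_pos.append(t)
        else:
            false_pos.append(t)
    return true_pos, false_pos, unmatched


# Hypothetical baseline; 121.9 mimics a detection with no baseline counterpart.
baseline_starts = [100.0, 200.0, 300.0]
detected_starts = [101.2, 121.9, 198.5, 300.4]
tp, fp, fn = match_to_baseline(detected_starts, baseline_starts)
```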

### Key Finding: New Method Finds MORE Plays!

**3 of the "false positives" are actually legitimate plays that the v3 baseline missed:**
- Opening kickoff (2:01)
- Second half kickoff (10:14)
- False start penalty (52:34)

This means our template matching method is actually **better than the baseline** for total play coverage.

---

## Performance Comparison

| Metric | V3 Baseline | Static Templates | Dynamic Templates |
|--------|-------------|------------------|-------------------|
| Processing Time | 9.6 min | 3.4 min | 4.2 min |
| Plays Detected | 176 | 181 (filtered) | 181 (filtered) |
| True Positives | 176 | 176 | 176 |
| False Positives | 0 | 5 | 5 |
| False Negatives | 0 | 0 | 0 |
| Precision | 100% | 97.2% | 97.2% |
| Recall | 100% | 100% | 100% |
| F1 Score | - | 98.6% | 98.6% |
| Speedup | 1.0x | 2.8x | 2.3x |
| Template Coverage | N/A | 100% (prebuilt) | 92% (23/25) |

### Template Capture Modes

**Static Templates:** Pre-built templates loaded from disk (fastest startup)
- Uses templates previously captured and saved to `output/debug/digit_templates/`
- 100% template coverage (all 25 templates available)
- Best for repeated analysis of the same video

**Dynamic Templates:** Templates built on-the-fly using OCR (default mode)
- Uses OCR to label first 400 frames, then builds templates from samples
- 92% template coverage (23/25 templates - missing 2 rare digits)
- Adds ~10 seconds for template building phase
- More robust for new videos with different fonts/styles
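
To illustrate the idea behind both modes, here is a toy template matcher in pure Python. It is a sketch only: the report does not state which similarity measure the pipeline uses, so mean absolute difference stands in for it, and the names `classify_digit`, `mean_abs_diff`, and `max_diff` are hypothetical.

```python
from typing import Dict, List, Optional

Patch = List[List[int]]  # grayscale patch as rows of 0-255 pixel values


def mean_abs_diff(a: Patch, b: Patch) -> float:
    """Average per-pixel absolute difference between two same-sized patches."""
    total = sum(
        abs(pa - pb) for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b)
    )
    return total / (len(a) * len(a[0]))


def classify_digit(
    patch: Patch, templates: Dict[str, Patch], max_diff: float = 40.0
) -> Optional[str]:
    """Return the label of the closest template, or None if none is close enough."""
    best_label, best_score = None, float("inf")
    for label, template in templates.items():
        score = mean_abs_diff(patch, template)
        if score < best_score:
            best_label, best_score = label, score
    return best_label if best_score <= max_diff else None


# Toy 3x3 templates (hypothetical data, far smaller than real digit crops).
templates = {
    "1": [[0, 255, 0], [0, 255, 0], [0, 255, 0]],
    "7": [[255, 255, 255], [0, 0, 255], [0, 0, 255]],
}
noisy_one = [[10, 240, 5], [0, 250, 20], [15, 255, 0]]
blank = [[128, 128, 128]] * 3

label = classify_digit(noisy_one, templates)   # closest to the "1" template
rejected = classify_digit(blank, templates)    # no template within max_diff
```

A missing template (as with the 2 uncaptured templates in dynamic mode) simply means the corresponding key is absent from `templates`, so that digit can never be returned.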

---

## Fixes Applied

### 1. XP/FG Minimum Time Filter (`play_state_machine.py`)

**Problem:** Weird clock behavior (40→25 within 1 second) was being incorrectly detected as XP/FG completions.

**Solution:** Added minimum time requirement (1.0s) at clock=40 before accepting 40→25 as an XP/FG completion.

```python
min_time_at_40 = 1.0  # Clock must hold at 40 for at least 1s; shorter holds are treated as clock glitches

if min_time_at_40 <= time_at_40 <= max_time_for_rapid_transition and len(self._countdown_history) == 0:
    # This is a valid XP/FG completion
    ...  # handling code elided in this excerpt
```
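
As a standalone illustration, the same condition can be written as a pure function. This is a sketch, not the code in `play_state_machine.py`: the function name is hypothetical, and the 5.0s default for `max_time_for_rapid_transition` is an assumed placeholder because the report does not give its value.

```python
def is_xp_fg_completion(
    time_at_40: float,
    countdown_history: list,
    min_time_at_40: float = 1.0,
    max_time_for_rapid_transition: float = 5.0,  # assumed value, not from the report
) -> bool:
    """True if a 40->25 clock jump looks like an XP/FG completion.

    The clock must have held at 40 long enough to rule out a glitch,
    but not so long that a normal countdown would have started.
    """
    held_long_enough = min_time_at_40 <= time_at_40 <= max_time_for_rapid_transition
    return held_long_enough and len(countdown_history) == 0


glitch = is_xp_fg_completion(0.4, [])          # too brief: rejected
valid = is_xp_fg_completion(2.0, [])           # within the window: accepted
counting = is_xp_fg_completion(2.0, [39, 38])  # countdown seen: rejected
```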

### 2. Merge Plays Fix (`_merge_plays()` in `play_detector.py`)

**Problem:** Same play detected by both state machine and clock reset detection.

**Solution:** Added 5-second proximity threshold to deduplicate overlapping detections.
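
A minimal sketch of proximity-based deduplication follows. The real `_merge_plays()` is not shown in this report, so the dict layout (`start`, `source`) and the choice to keep the earliest detection of each cluster are assumptions made for illustration.

```python
def merge_plays(plays: list[dict], proximity: float = 5.0) -> list[dict]:
    """Drop detections that start within `proximity` seconds of a kept play."""
    merged: list[dict] = []
    for play in sorted(plays, key=lambda p: p["start"]):
        if merged and play["start"] - merged[-1]["start"] <= proximity:
            continue  # treated as a duplicate of the previously kept play
        merged.append(play)
    return merged


# Hypothetical detections: the first two describe the same play.
detections = [
    {"start": 12.5, "source": "clock_reset"},
    {"start": 10.0, "source": "state_machine"},
    {"start": 40.0, "source": "state_machine"},
]
merged = merge_plays(detections)
```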

### 3. Duration Filter (1.0s threshold)

**Problem:** Weird clock noise produces very short "plays" (< 1 second).

**Solution:** Filter plays with duration < 1.0s. Note: Using 3.0s (the orchestrator default) would create false negatives because special plays have short durations in fixed coordinates mode.

---

## Known Limitations

1. **Timeout Detection:** Class B (timeout) detection doesn't work in fixed coordinates mode because timeout indicators aren't tracked. Timeouts are classified as "special" plays instead.

2. **Special Play Durations:** Without full timeout tracking, special plays have shorter durations than the baseline (we only capture the 40→25 transition).

---

## Timestamps for Video Inspection

### Legitimate Plays (missed by v3 baseline)
```
2:01   - Opening kickoff
10:14  - Second half kickoff  
52:34  - False start penalty
```

### Filtered by 1.0s threshold
```
69:12  - Weird clock (0.9s) ✅ Filtered
```

### Remaining questionable detections
```
2:31   - Weird clock (1.9s) - Optional
140:14 - Weird clock (1.4s) - Optional
```

---

## Timing Breakdown (Dynamic Template Mode)

| Phase | Time | % of Total |
|-------|------|------------|
| Video I/O | 169.0s | 67.4% |
| Template Building | 9.8s | 3.9% |
| Template Matching | 71.3s | 28.4% |
| Other (scorebug, state machine) | 0.5s | 0.2% |
| **TOTAL** | **250.7s** | **100%** |

The overhead of dynamic template capture (~10 seconds) is minimal compared to the total processing time. The majority of time is spent on video I/O (67%) and template matching (28%).
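
The percentage column and the speedup figures reported earlier can be reproduced from the raw numbers. A short sketch of that arithmetic, using only values stated in this report:

```python
# Phase timings in seconds, from the table above.
phases = {
    "Video I/O": 169.0,
    "Template Building": 9.8,
    "Template Matching": 71.3,
    "Other": 0.5,
}
reported_total = 250.7  # phases sum to about 250.6s; the gap is presumably rounding

shares = {
    name: round(100 * seconds / reported_total, 1)
    for name, seconds in phases.items()
}

# Speedups relative to the 9.6-minute v3 baseline.
baseline_minutes = 9.6
speedup_static = round(baseline_minutes / 3.4, 1)
speedup_dynamic = round(baseline_minutes / 4.2, 1)

print(shares, speedup_static, speedup_dynamic)
```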

---

## Next Steps

1. ✅ **1.0s duration filter** - Implemented in test script
2. ✅ **Dynamic template capture** - Now the default behavior
3. **Update baseline** with the 3 legitimate plays found
4. **Integration with main.py:** Enable template matching mode in orchestrator
5. **Timeout tracking:** Add timeout indicator detection for proper Class B classification