OSU vs Oregon Video Analysis - Play Detection Issues

Summary

The main.py algorithm works well on the OSU vs Tennessee video but initially failed on OSU vs Oregon, detecting only 9-11 plays instead of the expected ~150. This document explains the root cause and the fix that was implemented.

Problem Diagnosis

Initial Observation

| Metric | OSU vs Tenn | OSU vs Oregon (before fix) |
|---|---|---|
| Frames with clock | High | 532 (3.1%) |
| Plays detected | ~188 | 9-11 |
| Detection consistency | Stable | Works first 12 min only |

Time-Based Detection Analysis

| Time | Scorebug Detection | Clock Detection | Notes |
|---|---|---|---|
| 0:00-5:00 | 0% | 0% | Pre-game content |
| 5:30-11:40 | 95-100% | 95-100% | Works perfectly |
| 12:30+ | 90-100% | 0-15% | Fails |

Critical Insight: The scorebug IS being detected at 90-100% rates once gameplay starts, but play clock reading drops to 0-15% after ~12 minutes.
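A breakdown like the one above can be produced by bucketing per-frame results into fixed time windows. The following is a minimal sketch, not the code in scripts/diagnose_clock_gap.py; the function name and the `(timestamp, scorebug_found, clock_read)` record layout are assumptions made for illustration.

```python
from collections import defaultdict


def detection_rates_by_window(samples, window_seconds=300):
    """Group per-frame results into fixed time windows and return detection rates.

    `samples` is an iterable of (timestamp_seconds, scorebug_found, clock_read)
    tuples; this record layout is an assumption made for the example.
    """
    # window start (seconds) -> [frames seen, scorebug hits, clock hits]
    counts = defaultdict(lambda: [0, 0, 0])
    for timestamp, scorebug_found, clock_read in samples:
        window_start = int(timestamp // window_seconds) * window_seconds
        bucket = counts[window_start]
        bucket[0] += 1
        bucket[1] += bool(scorebug_found)
        bucket[2] += bool(clock_read)
    return {
        start: {"scorebug": sb_hits / total, "clock": clock_hits / total}
        for start, (total, sb_hits, clock_hits) in sorted(counts.items())
    }


# Synthetic samples: both detections succeed in the first window,
# only the scorebug is found in the second.
samples = [(t, True, True) for t in range(0, 300, 30)]
samples += [(t, True, False) for t in range(300, 600, 30)]
rates = detection_rates_by_window(samples)
```

Reporting the scorebug and clock rates side by side per window is what separates "scorebug lost" from "scorebug found but clock unreadable", which is the distinction this diagnosis relies on.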

Visual Inspection

Comparing debug images of successful (early game) vs. failed (late game) readings:

  • The raw play clock digits look virtually identical
  • The preprocessed images also look nearly identical
  • Key difference: The play clock region has SHIFTED on screen

The configured region that perfectly captures the clock at 5:30 cuts off the bottom few pixels of the digits after ~12 minutes.

Root Cause

Play clock region shift: The play clock region physically shifts on screen during the broadcast (likely due to score display changes affecting the graphics layout). The fixed region coordinates that work early in the game no longer capture the digits correctly later.

This explains why:

  1. Scorebug detection works (it uses split detection, which is more tolerant)
  2. Early game works (region is aligned correctly)
  3. Late game fails (region cuts off digit bottoms, causing template mismatch)
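The clipping described above can be expressed as simple box arithmetic. This sketch uses hypothetical coordinates and a hypothetical helper name, `clipped_pixels`; the real region and the real size of the shift are not stated in this document.

```python
def clipped_pixels(region, content_box):
    """Return how many pixels of `content_box` fall outside `region` on each side.

    Both arguments are (x, y, w, h) tuples in frame coordinates.
    """
    rx, ry, rw, rh = region
    cx, cy, cw, ch = content_box
    return {
        "left": max(0, rx - cx),
        "top": max(0, ry - cy),
        "right": max(0, (cx + cw) - (rx + rw)),
        "bottom": max(0, (cy + ch) - (ry + rh)),
    }


# Hypothetical coordinates: the region is tuned for the early game, then the
# digits move 3 px down while the configured region stays where it was.
region = (100, 50, 40, 24)
digits_early = (104, 52, 32, 22)  # bottom edge at y=74, same as the region
digits_late = (104, 55, 32, 22)   # bottom edge at y=77, past the region

early = clipped_pixels(region, digits_early)
late = clipped_pixels(region, digits_late)

# Smallest uniform padding that would contain the shifted digits again.
min_padding = max(late.values())
```

With these made-up numbers only the bottom edge is clipped, which is the failure pattern seen in the debug images: the digits look intact but their lowest rows are missing from the crop.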

Solution: Padded Region Matching

Added padding to the play clock region extraction to handle small translational shifts:

  1. When extracting the play clock region, add N pixels of padding on all sides
  2. cv2.matchTemplate (already used) performs a sliding-window search within the larger region
  3. This naturally handles small position shifts without requiring retry logic

Padding Test Results

| Padding | Late Game (15:00-40:00) | Early Game (5:30-12:00) |
|---|---|---|
| 0 (old) | 8.9% detection | 96.0% detection |
| 3 | 98.9% detection | 96.0% detection |
| 4 | 98.9% detection (conf: 0.94) | 96.0% detection |

Implementation: padding=4 pixels applied in read_from_fixed_location()
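A comparison like the padding test can be structured as a small sweep over candidate values. The sketch below is not scripts/test_padded_playclock.py; `sweep_padding` and the `read_clock(frame, padding)` callable are hypothetical names, and the fake reader exists only so the example runs on its own.

```python
def sweep_padding(frames, read_clock, paddings):
    """Return the fraction of frames with a successful reading for each padding value.

    `read_clock(frame, padding)` is a hypothetical callable that returns the
    play clock value, or None when no digit template matches.
    """
    results = {}
    for padding in paddings:
        hits = sum(read_clock(frame, padding) is not None for frame in frames)
        results[padding] = hits / len(frames) if frames else 0.0
    return results


# Fake reader for illustration: each "frame" is just its vertical shift in px,
# and a reading succeeds only when the padding covers that shift.
def fake_reader(shift, padding):
    return 25 if abs(shift) <= padding else None


rates = sweep_padding([0, 0, 3, 3], fake_reader, paddings=[0, 3, 4])
```

Running the same sweep separately on early-game and late-game frames is what shows whether a padding value fixes the late game without regressing the early game.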

Final Results

| Video | Before Fix | After Fix |
|---|---|---|
| Oregon | 9 plays (3.1% clock) | 145 plays (64.8% clock) |
| Tennessee | 188 plays | 187 plays (no regression) |

Files Modified

  • src/readers/playclock.py: Added padding parameter to read_from_fixed_location()
  • src/pipeline/play_extractor.py: Use padded region in fixed coordinates mode
  • src/pipeline/parallel.py: Use padded region in parallel processing mode

Debug Scripts

  • scripts/diagnose_oregon_playclock.py - Initial diagnosis
  • scripts/diagnose_oregon_gameplay.py - Gameplay segment analysis
  • scripts/diagnose_clock_gap.py - Time-based detection analysis
  • scripts/find_missed_clock_readings.py - Visual inspection of failures
  • scripts/find_successful_clock_readings.py - Visual inspection of successes
  • scripts/test_padded_playclock.py - Padding value comparison test

Debug Output Location

  • output/debug/missed_clock_readings/ - Failed readings with OCR confirmation
  • output/debug/successful_clock_readings/ - Successful readings for comparison
  • output/debug/padding_test_results.json - Padding test metrics

Key Insight

The fix was minimal because cv2.matchTemplate already performs a sliding-window search. The problem was simply that the extracted region was too small: when the digits shifted, they were cut off. Adding padding keeps the full digits inside the extracted region for small shifts, and the existing template matching finds them automatically.