Testing Guide: Phase 1-3 Implementation

This guide covers all features implemented in Phases 1-3 of the video processing enhancement project. Use it to validate that each feature works correctly before processing your 3,000+ video library.

Overview of Changes

Phase 1: Database Schema & Model Versioning

Model version tracking (SigLIP, ArcFace, labels hash)
New database columns for processing metadata
FTS5 to regular table migration for proper score sorting

Phase 2: Processing Improvements

Category-specific scene detection thresholds
Enhanced hybrid approach (scene + interval fallback)
Face crop size increased to 256x256 pixels

Phase 3: Settings UI

Scene threshold controls in Settings page
Preset configurations (static, moderate, dynamic)
Minimum thumbnails configuration

Test 1: Backend Settings Module

1.1 Verify Presets Load Correctly

cd /home/user/Search-UI/backend
python3 -c "
import settings
print('Presets:')
for name, config in settings.PRESET_SETTINGS.items():
    print(f'  {name}: threshold={config[\"scene_threshold\"]}, interval={config[\"thumbnail_interval\"]}s, min={config[\"min_thumbnails\"]}')
print(f'\nDefaults: {settings.get_defaults()}')
"

Expected Output:

Presets:
  static: threshold=0.12, interval=10.0s, min=30
  moderate: threshold=0.2, interval=5.0s, min=35
  dynamic: threshold=0.35, interval=2.0s, min=50

Defaults: {'scene_threshold': 0.29, 'thumbnail_interval': 3.0, 'min_thumbnails': 30}
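The preset values above can be read as a simple lookup with a fallback to the defaults. The sketch below is an assumption about how such a resolution might work, not the actual contents of settings.py: `resolve_settings` is a hypothetical helper, and only the numeric values are taken from the expected output.

```python
# Hypothetical sketch: the values are copied from the expected output above;
# the function name and the override behaviour are assumptions.
PRESET_SETTINGS = {
    "static": {"scene_threshold": 0.12, "thumbnail_interval": 10.0, "min_thumbnails": 30},
    "moderate": {"scene_threshold": 0.2, "thumbnail_interval": 5.0, "min_thumbnails": 35},
    "dynamic": {"scene_threshold": 0.35, "thumbnail_interval": 2.0, "min_thumbnails": 50},
}
DEFAULTS = {"scene_threshold": 0.29, "thumbnail_interval": 3.0, "min_thumbnails": 30}


def resolve_settings(preset=None, **overrides):
    """Return a settings dict: defaults, then preset values, then explicit overrides."""
    resolved = dict(DEFAULTS)
    if preset is not None:
        if preset not in PRESET_SETTINGS:
            raise ValueError(f"unknown preset: {preset!r}")
        resolved.update(PRESET_SETTINGS[preset])
    # Skip overrides set to None so callers can pass optional values straight through.
    resolved.update({k: v for k, v in overrides.items() if v is not None})
    return resolved


if __name__ == "__main__":
    print(resolve_settings())
    print(resolve_settings("dynamic"))
    print(resolve_settings("static", min_thumbnails=40))
```

Layering defaults, preset, and overrides in that order keeps a "custom" setting expressible as a preset plus one or two changed fields.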

1.2 Test get_processing_settings()

cd /home/user/Search-UI/backend
python3 -c "
import settings

print(f'Default: {settings.get_processing_settings()}')
settings.set_scene_threshold(0.35)
print(f'Updated: {settings.get_processing_settings()}')
"

Expected Output:

Default returns scene_threshold=0.29 (the default reported in Test 1.1)
Updated returns scene_threshold=0.35

Test 2: Search Images Model Versioning

2.1 Verify Version Constants

cd /home/user/Search-UI/backend
python3 -c "
# Check the source file for constants
with open('search_images.py', 'r') as f:
    content = f.read()

import re
version_match = re.search(r'SIGLIP_MODEL_VERSION = \"([^\"]+)\"', content)
threshold_match = re.search(r'CLASSIFICATION_THRESHOLD = ([0-9.]+)', content)
print(f'SIGLIP_MODEL_VERSION: {version_match.group(1) if version_match else \"NOT FOUND\"}')
print(f'CLASSIFICATION_THRESHOLD: {threshold_match.group(1) if threshold_match else \"NOT FOUND\"}')
print('get_labels_version(): defined' if 'def get_labels_version()' in content else 'get_labels_version(): NOT FOUND')
"

Expected Output:

SIGLIP_MODEL_VERSION: siglip2-so400m-patch16-naflex-v1
CLASSIFICATION_THRESHOLD: 0.1
get_labels_version(): defined
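Phase 1 lists a "labels hash" among the tracked versions, but this test only confirms that get_labels_version() is defined. One plausible way to derive such a version is to hash the normalised label list, as in the sketch below. `compute_labels_version` is a hypothetical name, and the normalisation rules and 12-character truncation are assumptions, not the behaviour of search_images.py.

```python
import hashlib


def compute_labels_version(labels):
    """Return a short, order-independent fingerprint of a label list (hypothetical helper)."""
    # Normalise so that reordering, case changes, or stray whitespace do not change the version.
    normalised = sorted({label.strip().lower() for label in labels})
    digest = hashlib.sha256("\n".join(normalised).encode("utf-8")).hexdigest()
    # Truncate for readability; 12 hex characters is an arbitrary choice.
    return digest[:12]


if __name__ == "__main__":
    print(compute_labels_version(["mountain", "river", "city street"]))
```

A fingerprint like this changes whenever a label is added or removed, which is the signal needed to decide that previously classified videos are stale.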

2.2 Verify Database Schema (processed_videos table)

cd /home/user/Search-UI/backend
python3 -c "
from database import get_db
conn = get_db()
cursor = conn.execute('PRAGMA table_info(processed_videos)')
columns = [row[1] for row in cursor.fetchall()]
required = ['siglip_model_version', 'labels_version', 'classification_threshold',
            'resolution_label', 'scene_threshold', 'duration_seconds', 'fps', 'total_frames']
for col in required:
    status = '✓' if col in columns else '✗'
    print(f'{status} {col}')
"

Expected: All columns marked with ✓

2.3 Verify image_categories is Regular Table (not FTS5)

cd /home/user/Search-UI/backend
python3 -c "
from database import get_db
conn = get_db()

# Check if it's a regular table (not FTS5 virtual table)
cursor = conn.execute(\"SELECT sql FROM sqlite_master WHERE name='image_categories'\")
row = cursor.fetchone()
if row:
    sql = row[0]
    if 'VIRTUAL TABLE' in sql.upper():
        print('✗ Still FTS5 virtual table')
    else:
        print('✓ Regular table')
        print(f'  Schema: {sql[:100]}...')
else:
    print('✗ Table not found')

# Check indexes
cursor = conn.execute(\"SELECT name FROM sqlite_master WHERE type='index' AND tbl_name='image_categories'\")
indexes = [row[0] for row in cursor.fetchall()]
print(f'  Indexes: {indexes}')
"

Expected:

Regular table (not virtual)
Indexes include: idx_categories_natural_key, idx_categories_category_name, idx_categories_score
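To illustrate what the regular table is meant to enable, the sketch below builds an in-memory table with the three index names listed above and runs a score-ordered query. The column names (natural_key, category_name, score) are inferred from the index names and are assumptions, as is the `top_matches` helper; the real schema may contain more columns.

```python
import sqlite3

# In-memory stand-in for the real database; column names are assumed from the index names.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE image_categories (
        natural_key   TEXT NOT NULL,
        category_name TEXT NOT NULL,
        score         REAL NOT NULL
    );
    CREATE INDEX idx_categories_natural_key ON image_categories(natural_key);
    CREATE INDEX idx_categories_category_name ON image_categories(category_name);
    CREATE INDEX idx_categories_score ON image_categories(score);
""")
conn.executemany(
    "INSERT INTO image_categories VALUES (?, ?, ?)",
    [("vid_a", "mountain", 0.42), ("vid_b", "mountain", 0.91), ("vid_c", "mountain", 0.10)],
)


def top_matches(conn, category, limit=10):
    """Return (natural_key, score) rows for a category, highest score first."""
    rows = conn.execute(
        "SELECT natural_key, score FROM image_categories "
        "WHERE category_name = ? ORDER BY score DESC LIMIT ?",
        (category, limit),
    )
    return rows.fetchall()


if __name__ == "__main__":
    for natural_key, score in top_matches(conn, "mountain"):
        print(natural_key, score)
```

Because score is declared as REAL, the ORDER BY compares numbers rather than text.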

Test 3: Face Search Constants

3.1 Verify Constants

cd /home/user/Search-UI/backend
python3 -c "
with open('face_search.py', 'r') as f:
    content = f.read()

import re
crop_match = re.search(r'FACE_CROP_SIZE = (\d+)', content)
conf_match = re.search(r'MIN_FACE_CONFIDENCE = ([0-9.]+)', content)
version_match = re.search(r'ARCFACE_MODEL_VERSION = \"([^\"]+)\"', content)

print(f'FACE_CROP_SIZE: {crop_match.group(1) if crop_match else \"NOT FOUND\"} (expected: 256)')
print(f'MIN_FACE_CONFIDENCE: {conf_match.group(1) if conf_match else \"NOT FOUND\"}')
print(f'ARCFACE_MODEL_VERSION: {version_match.group(1) if version_match else \"NOT FOUND\"}')
"

Expected:

FACE_CROP_SIZE: 256 (expected: 256)
MIN_FACE_CONFIDENCE: 0.5
ARCFACE_MODEL_VERSION: arcface-mtcnn-v1

Test 4: Process Video Integration

4.1 Verify Scene Threshold Usage

cd /home/user/Search-UI/backend
grep -n "scene_threshold = processing_settings" process_video.py
grep -n "gt(scene" process_video.py

Expected: Both patterns found, showing scene_threshold is used dynamically from settings.
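The `gt(scene` pattern matches the expression syntax of FFmpeg's select filter, which suggests the threshold is interpolated into a filter string. The following is a minimal sketch under that assumption; `build_scene_select_filter` and `build_ffmpeg_args` are hypothetical names and do not reflect the actual code in process_video.py.

```python
def build_scene_select_filter(scene_threshold):
    """Build an FFmpeg select filter that keeps frames whose scene score exceeds the threshold."""
    if not 0.0 < scene_threshold < 1.0:
        raise ValueError("scene_threshold must be between 0 and 1")
    # The single quotes protect the comma inside gt(...) from the filtergraph parser.
    return f"select='gt(scene,{scene_threshold})'"


def build_ffmpeg_args(video_path, output_pattern, scene_threshold):
    """Assemble an argument list suitable for subprocess.run (assumed invocation)."""
    return [
        "ffmpeg", "-i", video_path,
        "-vf", build_scene_select_filter(scene_threshold),
        # Emit only the selected frames; newer FFmpeg releases prefer "-fps_mode vfr".
        "-vsync", "vfr",
        output_pattern,
    ]


if __name__ == "__main__":
    print(build_ffmpeg_args("input.mp4", "thumb_%04d.jpg", 0.29))
```

Passing the arguments as a list avoids shell quoting problems with the quoted filter expression.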

4.2 Verify FACE_CROP_SIZE Usage

cd /home/user/Search-UI/backend
grep -n "face_search.FACE_CROP_SIZE" process_video.py

Expected: Found in process_video.py (not hardcoded 128)

4.3 Verify Enhanced Hybrid Approach

cd /home/user/Search-UI/backend
grep -n "fallback_threshold" process_video.py
grep -n "int(min_thumbnails \* 0.5)" process_video.py

Expected: 50% threshold used for fallback decision
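The fallback decision can be sketched as follows. Only the `fallback_threshold` name and the `int(min_thumbnails * 0.5)` expression come from the grep patterns above. The function name, its parameters, and the choice to merge interval timestamps with the detected scenes (rather than replace them) are assumptions about how the hybrid approach might behave.

```python
import math


def choose_frame_times(scene_times, duration_seconds, thumbnail_interval, min_thumbnails):
    """Return timestamps to extract: scene changes, plus interval samples if scenes are scarce."""
    if thumbnail_interval <= 0:
        raise ValueError("thumbnail_interval must be positive")
    fallback_threshold = int(min_thumbnails * 0.5)
    if len(scene_times) >= fallback_threshold:
        # Enough scene changes were detected; use them as-is.
        return sorted(scene_times)
    # Too few scene changes: add evenly spaced timestamps across the video.
    count = math.ceil(duration_seconds / thumbnail_interval)
    interval_times = [round(i * thumbnail_interval, 3) for i in range(count)]
    return sorted(set(scene_times) | set(interval_times))


if __name__ == "__main__":
    # Two detected scenes against a minimum of 30 thumbnails triggers the fallback.
    print(choose_frame_times([1.0, 2.0], 10.0, 5.0, 30))
```

With min_thumbnails=30 the fallback triggers whenever scene detection yields fewer than 15 frames.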

Test 5: API Endpoints

5.1 Start Backend Server

cd /home/user/Search-UI
tmux new -d -s backend "cd backend && source venv/bin/activate && uvicorn main:app --reload --host 0.0.0.0 2>&1 | tee ../backend.log"
sleep 3
tail -5 backend.log

Expected: Server running on http://0.0.0.0:8000

5.2 Test GET /api/settings/series

curl -s "http://localhost:8000/api/settings/series?language=E" | python3 -m json.tool | head -50

Expected Response Structure:

{
    "settings": [...],
    "available_categories": [...],
    "presets": {
        "static": {"scene_threshold": 0.12, ...},
        "moderate": {...},
        "dynamic": {...}
    },
    "defaults": {
        "scene_threshold": 0.29,
        "thumbnail_interval": 3.0,
        "min_thumbnails": 30
    }
}

5.3 Test POST /api/settings/series

# Add a test setting
curl -X POST "http://localhost:8000/api/settings/series?category=TestDrama&thumbnail_interval=2.0&scene_threshold=0.35&min_thumbnails=50&preset=dynamic"

# Verify it was added
curl -s "http://localhost:8000/api/settings/series?language=E" | python3 -c "
import sys, json
data = json.load(sys.stdin)
for s in data.get('settings', []):
    if s.get('category') == 'TestDrama':
        print(f'Found: {s}')
        break
else:
    print('Not found')
"

# Clean up
curl -s "http://localhost:8000/api/settings/series?language=E" | python3 -c "
import sys, json
data = json.load(sys.stdin)
for s in data.get('settings', []):
    if s.get('category') == 'TestDrama':
        print(f'Setting ID to delete: {s[\"id\"]}')
"
# Delete using the ID from above
# curl -X DELETE "http://localhost:8000/api/settings/series/{id}"
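The lookup and the delete step can be combined so that the ID does not have to be copied by hand. The sketch below only builds the DELETE URLs from a GET /api/settings/series response. The `settings`, `category`, and `id` keys and the URL shape come from the commands above, while `find_setting_ids`, `delete_urls`, and the sample payload are hypothetical.

```python
def find_setting_ids(payload, category):
    """Return the ids of all settings entries whose category matches."""
    return [
        s["id"]
        for s in payload.get("settings", [])
        if s.get("category") == category and "id" in s
    ]


def delete_urls(payload, category, base_url="http://localhost:8000"):
    """Build one DELETE URL per matching setting."""
    return [f"{base_url}/api/settings/series/{sid}" for sid in find_setting_ids(payload, category)]


if __name__ == "__main__":
    # Sample payload for illustration only; real responses contain more fields.
    sample = {"settings": [{"id": 7, "category": "TestDrama"}, {"id": 8, "category": "News"}]}
    for url in delete_urls(sample, "TestDrama"):
        print(url)
```

Each printed URL can then be passed to `curl -X DELETE`.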

Test 6: Frontend Settings Page

6.1 Start Frontend Server

cd /home/user/Search-UI
tmux new -d -s frontend "cd frontend && npm run dev -- --host 0.0.0.0 2>&1 | tee ../frontend.log"
sleep 5
tail -5 frontend.log

6.2 Visual Testing Checklist

Open a browser at http://localhost:5173 (or your network IP on port 5173).

Navigate to Settings page and verify:

Presets Section displays all three presets with values:
- Static: Threshold 0.12, Interval 10s, Min 30
- Moderate: Threshold 0.20, Interval 5s, Min 35
- Dynamic: Threshold 0.35, Interval 2s, Min 50
- Default: Shows current defaults
Add Series Setting Form has:
- Category dropdown (populated from available categories)
- Subcategory dropdown (optional)
- Preset dropdown (static, moderate, dynamic, or custom)
- Scene Threshold input (0.05-0.5, step 0.01)
- Fallback Interval input (0.5-60, step 0.5)
- Min Thumbnails input (10-200, step 5)
Preset Selection auto-fills values:
- Select "Static" preset → threshold=0.12, interval=10, min=30
- Select "Dynamic" preset → threshold=0.35, interval=2, min=50
- Custom settings inputs become disabled when preset selected
Current Settings Table shows columns:
- Category | Subcategory | Threshold | Interval | Min | Preset | Actions
Add/Delete works:
- Add a test category with dynamic preset
- Verify it appears in table with correct values
- Delete it and verify it's removed

Test 7: End-to-End Video Processing

7.1 Test with Single Video

Prerequisites: Have a test video in the system with known category.

# Process a single video (replace YOUR_TEST_KEY with the video's natural key)
curl -X POST "http://localhost:8000/api/process-video?natural_key=YOUR_TEST_KEY&label=480p"

7.2 Verify Processing Metadata Saved

cd /home/user/Search-UI/backend
python3 -c "
from database import get_db
conn = get_db()
cursor = conn.execute('''
    SELECT natural_key, resolution_label, scene_threshold,
           siglip_model_version, labels_version, duration_seconds
    FROM processed_videos
    ORDER BY processed_at DESC
    LIMIT 5
''')
print('Recent processed videos:')
for row in cursor.fetchall():
    print(f'  {row}')
"

Expected: Shows processing metadata including scene_threshold, model versions

7.3 Verify Thumbnail Count

After processing, check the thumbnails directory:

# Replace with actual natural_key and label
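# ls -1 prints one entry per line, so the count excludes the "total", "." and ".." lines that ls -la adds
ls -1 /home/user/Search-UI/backend/videos/{natural_key}/{label}/thumbnails/ | wc -l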

Expected: Number of thumbnails appropriate for the content type and settings
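For a count that ignores subdirectories and non-image files, a small Python helper can be used instead. This is a sketch: `count_thumbnails` is a hypothetical helper, and the set of image extensions is an assumption, since the guide does not state which format the thumbnails use.

```python
from pathlib import Path

# Assumed extensions; adjust to whatever format the thumbnails are actually written in.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def count_thumbnails(thumbnails_dir):
    """Count image files directly inside a thumbnails directory (0 if it does not exist)."""
    directory = Path(thumbnails_dir)
    if not directory.is_dir():
        return 0
    return sum(
        1
        for entry in directory.iterdir()
        if entry.is_file() and entry.suffix.lower() in IMAGE_SUFFIXES
    )


if __name__ == "__main__":
    import sys

    # Usage: python3 count_thumbnails.py /path/to/thumbnails
    print(count_thumbnails(sys.argv[1] if len(sys.argv) > 1 else "."))
```

Comparing this number against the category's min_thumbnails value gives a quick check that the fallback behaved as configured.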

Automated Test Script

Run the comprehensive test script:

cd /home/user/Search-UI
python3 scratchpad/2025-12-25-1708-test-phase1-3.py

Expected: All 6 modules pass (6/6)

Known Limitations

Phase 4 (Deferred)

The Unified Person system (combining face + speaker recognition) requires:

ECAPA-TDNN model integration for speaker embeddings
New Person management UI
Cross-reference system for face ↔ speaker associations

This work is planned for a future implementation.

Phase 5 (Deferred)

Validation and performance improvements:

Bulk reprocessing detection
Performance profiling

These will be addressed after Phase 4.

Troubleshooting

Backend Won't Start

# Check for Python environment issues
cd /home/user/Search-UI/backend
source venv/bin/activate
pip list | grep -E "fastapi|uvicorn|sqlite-vec"

Database Migration Issues

# Force recreate tables (WARNING: loses data)
cd /home/user/Search-UI/backend
python3 -c "
from database import get_db
conn = get_db()
# Drop and recreate problematic tables
conn.execute('DROP TABLE IF EXISTS image_categories')
# Re-run the app to recreate
"

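Because dropping image_categories discards its rows, it may be worth copying the table first. The sketch below shows one way to do that with plain SQLite statements; `backup_and_drop` is a hypothetical helper, not part of database.py, and it assumes the connection is a standard sqlite3 connection.

```python
import sqlite3


def backup_and_drop(conn, table, backup_suffix="_backup"):
    """Copy `table` into `<table><backup_suffix>`, drop the original, and return rows copied."""
    backup = f"{table}{backup_suffix}"
    # Identifiers cannot be bound as SQL parameters, so validate them before interpolating.
    if not table.isidentifier() or not backup.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    conn.execute(f"DROP TABLE IF EXISTS {backup}")
    conn.execute(f"CREATE TABLE {backup} AS SELECT * FROM {table}")
    copied = conn.execute(f"SELECT COUNT(*) FROM {backup}").fetchone()[0]
    conn.execute(f"DROP TABLE {table}")
    # Harmless if the statements above were already auto-committed.
    conn.commit()
    return copied


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE image_categories (natural_key TEXT, score REAL)")
    demo.execute("INSERT INTO image_categories VALUES ('vid_a', 0.5)")
    print(backup_and_drop(demo, "image_categories"))
```

The backup table keeps the data but not the original indexes or constraints, so it is a safety net rather than a drop-in replacement.
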
Frontend Build Issues

cd /home/user/Search-UI/frontend
npm install
npm run build

Summary Checklist

Before processing your video library, verify:

All automated tests pass (6/6)
Backend API returns presets and defaults correctly
Frontend Settings page displays and functions correctly
Test video processes with correct scene threshold
Processing metadata saved to database
Thumbnail count appropriate for content type