	# Testing Guide: Phase 1-3 Implementation

	This guide covers all features implemented in Phases 1-3 of the video processing enhancement project. Use it to validate that each feature works correctly before processing your 3,000+ video library.

	## Overview of Changes

	### Phase 1: Database Schema & Model Versioning
	- Model version tracking (SigLIP, ArcFace, labels hash)
	- New database columns for processing metadata
	- FTS5 to regular table migration for proper score sorting

	### Phase 2: Processing Improvements
	- Category-specific scene detection thresholds
	- Enhanced hybrid approach (scene + interval fallback)
	- Face crop size increased to 256x256 pixels

	### Phase 3: Settings UI
	- Scene threshold controls in Settings page
	- Preset configurations (static, moderate, dynamic)
	- Minimum thumbnails configuration

	---

	## Test 1: Backend Settings Module

	### 1.1 Verify Presets Load Correctly

	```bash
	cd /home/user/Search-UI/backend
	python3 -c "
	import settings
	print('Presets:')
	for name, config in settings.PRESET_SETTINGS.items():
	    print(f' {name}: threshold={config[\"scene_threshold\"]}, interval={config[\"thumbnail_interval\"]}s, min={config[\"min_thumbnails\"]}')
	print(f'\nDefaults: {settings.get_defaults()}')
	"
	```

	Expected Output:
	```
	Presets:
	static: threshold=0.12, interval=10.0s, min=30
	moderate: threshold=0.2, interval=5.0s, min=35
	dynamic: threshold=0.35, interval=2.0s, min=50

	Defaults: {'scene_threshold': 0.29, 'thumbnail_interval': 3.0, 'min_thumbnails': 30}
	```
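
For orientation, the expected output above is consistent with a settings module shaped roughly like the sketch below. This is not the contents of the real `settings.py`: the values are copied from the expected output, while `DEFAULTS`, `resolve_settings()`, and the precedence order (defaults, then preset, then explicit overrides) are assumptions made for illustration.

```python
# Hypothetical stand-in for backend/settings.py; values copied from the expected output.
PRESET_SETTINGS = {
    "static": {"scene_threshold": 0.12, "thumbnail_interval": 10.0, "min_thumbnails": 30},
    "moderate": {"scene_threshold": 0.2, "thumbnail_interval": 5.0, "min_thumbnails": 35},
    "dynamic": {"scene_threshold": 0.35, "thumbnail_interval": 2.0, "min_thumbnails": 50},
}
DEFAULTS = {"scene_threshold": 0.29, "thumbnail_interval": 3.0, "min_thumbnails": 30}


def get_defaults():
    """Return a copy so callers cannot mutate the module-level defaults."""
    return dict(DEFAULTS)


def resolve_settings(preset=None, **overrides):
    """Assumed precedence: defaults, then the named preset, then explicit overrides."""
    resolved = get_defaults()
    if preset is not None:
        resolved.update(PRESET_SETTINGS[preset])
    resolved.update({key: value for key, value in overrides.items() if value is not None})
    return resolved


print(resolve_settings())
print(resolve_settings(preset="dynamic"))
print(resolve_settings(preset="static", min_thumbnails=40))
```

If the real module behaves differently, for example by rejecting overrides when a preset is selected, the output of Test 1.1 remains the authority.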

	### 1.2 Test get_processing_settings()

	```bash
	cd /home/user/Search-UI/backend
	python3 -c "
	import settings

	print(f'Default: {settings.get_processing_settings()}')
	settings.set_scene_threshold(0.35)
	print(f'Updated: {settings.get_processing_settings()}')
	"
	```

	Expected Output:
	- Default returns `scene_threshold=0.29`, matching the defaults shown in Test 1.1
	- Updated returns `scene_threshold=0.35`

	---

	## Test 2: Search Images Model Versioning

	### 2.1 Verify Version Constants

	```bash
	cd /home/user/Search-UI/backend
	python3 -c "
	# Check the source file for constants
	with open('search_images.py', 'r') as f:
	    content = f.read()

	import re
	version_match = re.search(r'SIGLIP_MODEL_VERSION = \"([^\"]+)\"', content)
	threshold_match = re.search(r'CLASSIFICATION_THRESHOLD = ([0-9.]+)', content)
	print(f'SIGLIP_MODEL_VERSION: {version_match.group(1) if version_match else \"NOT FOUND\"}')
	print(f'CLASSIFICATION_THRESHOLD: {threshold_match.group(1) if threshold_match else \"NOT FOUND\"}')
	print('get_labels_version(): defined' if 'def get_labels_version()' in content else 'get_labels_version(): NOT FOUND')
	"
	```

	Expected Output:
	```
	SIGLIP_MODEL_VERSION: siglip2-so400m-patch16-naflex-v1
	CLASSIFICATION_THRESHOLD: 0.1
	get_labels_version(): defined
	```
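
The guide does not show how `get_labels_version()` computes its value. One plausible approach, sketched here purely as an assumption, is to hash the normalized label list so that any change to the labels yields a new version string. The function name `labels_version`, its `labels` parameter, and the 12-character truncation are all hypothetical.

```python
import hashlib


def labels_version(labels):
    """Return a short, order-insensitive fingerprint of a label list (hypothetical helper)."""
    # Normalize so that reordering or stray whitespace does not change the version.
    normalized = "\n".join(sorted(label.strip().lower() for label in labels))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:12]


before = labels_version(["beach", "mountain", "city"])
reordered = labels_version(["city", "beach", "mountain"])
extended = labels_version(["beach", "mountain", "city", "forest"])

print(before == reordered)  # same labels, different order
print(before == extended)   # a label was added
```

Storing such a fingerprint alongside each processed video makes it possible to tell which rows were classified against an older label set.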

	### 2.2 Verify Database Schema (processed_videos table)

	```bash
	cd /home/user/Search-UI/backend
	python3 -c "
	from database import get_db
	conn = get_db()
	cursor = conn.execute('PRAGMA table_info(processed_videos)')
	columns = [row[1] for row in cursor.fetchall()]
	required = ['siglip_model_version', 'labels_version', 'classification_threshold',
	            'resolution_label', 'scene_threshold', 'duration_seconds', 'fps', 'total_frames']
	for col in required:
	    status = '✓' if col in columns else '✗'
	    print(f'{status} {col}')
	"
	```

	Expected: All columns marked with ✓
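
If any column is marked ✗, the schema migration has not run against this database. The sketch below shows one way such a migration can add only the missing columns, demonstrated on an in-memory database. It is not the project's migration code: the column types and the `add_missing_columns()` helper are assumptions, and only the column names come from the check above.

```python
import sqlite3

# Column names are taken from Test 2.2; the SQL types are assumptions.
REQUIRED_COLUMNS = {
    "siglip_model_version": "TEXT",
    "labels_version": "TEXT",
    "classification_threshold": "REAL",
    "resolution_label": "TEXT",
    "scene_threshold": "REAL",
    "duration_seconds": "REAL",
    "fps": "REAL",
    "total_frames": "INTEGER",
}


def add_missing_columns(conn, table="processed_videos"):
    """Add any required column the table lacks and return the names that were added."""
    existing = {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}
    added = []
    for name, sql_type in REQUIRED_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE {table} ADD COLUMN {name} {sql_type}")
            added.append(name)
    return added


# Demo on a throwaway in-memory database with a pre-migration schema.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE processed_videos (natural_key TEXT, processed_at TEXT)")
print(add_missing_columns(demo))  # all eight columns are added
print(add_missing_columns(demo))  # second run finds nothing to add
```

Because the helper checks `PRAGMA table_info` first, running it twice is harmless.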

	### 2.3 Verify image_categories is Regular Table (not FTS5)

	```bash
	cd /home/user/Search-UI/backend
	python3 -c "
	from database import get_db
	conn = get_db()

	# Check if it's a regular table (not FTS5 virtual table)
	cursor = conn.execute(\"SELECT sql FROM sqlite_master WHERE name='image_categories'\")
	row = cursor.fetchone()
	if row:
	    sql = row[0]
	    if 'VIRTUAL TABLE' in sql.upper():
	        print('✗ Still FTS5 virtual table')
	    else:
	        print('✓ Regular table')
	        print(f' Schema: {sql[:100]}...')
	else:
	    print('✗ Table not found')

	# Check indexes
	cursor = conn.execute(\"SELECT name FROM sqlite_master WHERE type='index' AND tbl_name='image_categories'\")
	indexes = [row[0] for row in cursor.fetchall()]
	print(f' Indexes: {indexes}')
	"
	```

	Expected:
	- Regular table (not virtual)
	- Indexes include: idx_categories_natural_key, idx_categories_category_name, idx_categories_score
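
To illustrate the kind of score-sorted query this migration is meant to support, here is a small in-memory sketch. The index names match the expected list above, but the column names, column types, and sample rows are assumptions inferred from those index names, not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE image_categories (
    natural_key   TEXT NOT NULL,  -- assumed column
    category_name TEXT NOT NULL,  -- assumed column
    score         REAL NOT NULL   -- assumed column; numeric so ORDER BY sorts by value
);
CREATE INDEX idx_categories_natural_key   ON image_categories(natural_key);
CREATE INDEX idx_categories_category_name ON image_categories(category_name);
CREATE INDEX idx_categories_score         ON image_categories(score);
""")
conn.executemany(
    "INSERT INTO image_categories VALUES (?, ?, ?)",
    [("vid_a", "beach", 0.42), ("vid_b", "beach", 0.91), ("vid_c", "beach", 0.10)],
)


def top_matches(conn, category, limit=10):
    """Return (natural_key, score) pairs for a category, best score first."""
    rows = conn.execute(
        "SELECT natural_key, score FROM image_categories "
        "WHERE category_name = ? ORDER BY score DESC LIMIT ?",
        (category, limit),
    )
    return rows.fetchall()


print(top_matches(conn, "beach"))
```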

	---

	## Test 3: Face Search Constants

	### 3.1 Verify Constants

	```bash
	cd /home/user/Search-UI/backend
	python3 -c "
	with open('face_search.py', 'r') as f:
	    content = f.read()

	import re
	crop_match = re.search(r'FACE_CROP_SIZE = (\d+)', content)
	conf_match = re.search(r'MIN_FACE_CONFIDENCE = ([0-9.]+)', content)
	version_match = re.search(r'ARCFACE_MODEL_VERSION = \"([^\"]+)\"', content)

	print(f'FACE_CROP_SIZE: {crop_match.group(1) if crop_match else \"NOT FOUND\"} (expected: 256)')
	print(f'MIN_FACE_CONFIDENCE: {conf_match.group(1) if conf_match else \"NOT FOUND\"}')
	print(f'ARCFACE_MODEL_VERSION: {version_match.group(1) if version_match else \"NOT FOUND\"}')
	"
	```

	Expected:
	```
	FACE_CROP_SIZE: 256 (expected: 256)
	MIN_FACE_CONFIDENCE: 0.5
	ARCFACE_MODEL_VERSION: arcface-mtcnn-v1
	```

	---

	## Test 4: Process Video Integration

	### 4.1 Verify Scene Threshold Usage

	```bash
	cd /home/user/Search-UI/backend
	grep -n "scene_threshold = processing_settings" process_video.py
	grep -n "gt(scene" process_video.py
	```

	Expected: Both patterns found, showing scene_threshold is used dynamically from settings.
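
The `gt(scene` pattern suggests that `process_video.py` builds an FFmpeg `select` filter from the configured threshold. The sketch below shows how such a filter string could be assembled; it is an assumption about the approach, not a copy of the project's code. The helper names and the output pattern are hypothetical. `-vsync vfr` is used to drop unselected frames, and newer FFmpeg releases prefer `-fps_mode vfr` for the same purpose.

```python
def scene_select_filter(scene_threshold):
    """Build an FFmpeg select expression that keeps frames whose scene score exceeds the threshold."""
    if not 0.0 < scene_threshold < 1.0:
        raise ValueError(f"scene_threshold must be between 0 and 1, got {scene_threshold}")
    # The single quotes protect the comma inside gt(...) from filtergraph parsing.
    return f"select='gt(scene,{scene_threshold})'"


def build_ffmpeg_args(video_path, out_pattern, scene_threshold):
    """Return an argv list (hypothetical helper); nothing is executed here."""
    return [
        "ffmpeg", "-i", video_path,
        "-vf", scene_select_filter(scene_threshold),
        "-vsync", "vfr",
        out_pattern,
    ]


print(build_ffmpeg_args("input.mp4", "thumb_%04d.jpg", 0.35))
```

A lower threshold selects more frames, which is why the static preset (0.12) suits content with subtle scene changes and the dynamic preset (0.35) suits fast-cut material.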

	### 4.2 Verify FACE_CROP_SIZE Usage

	```bash
	cd /home/user/Search-UI/backend
	grep -n "face_search.FACE_CROP_SIZE" process_video.py
	```

	Expected: Found in process_video.py (not hardcoded 128)

	### 4.3 Verify Enhanced Hybrid Approach

	```bash
	cd /home/user/Search-UI/backend
	grep -n "fallback_threshold" process_video.py
	grep -n "int(min_thumbnails \* 0.5)" process_video.py
	```

	Expected: Both patterns found, showing that the fallback decision uses 50% of `min_thumbnails` as its threshold.
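
A minimal sketch of how a scene + interval hybrid might use that threshold is shown below. Only the `int(min_thumbnails * 0.5)` expression comes from the grep pattern above. The direction of the comparison, the merge of scene and interval timestamps, and the `choose_timestamps()` helper are assumptions made for illustration.

```python
import math


def choose_timestamps(scene_times, duration, thumbnail_interval, min_thumbnails):
    """Return thumbnail timestamps in seconds, using interval sampling if scene cuts are sparse."""
    fallback_threshold = int(min_thumbnails * 0.5)
    if len(scene_times) >= fallback_threshold:
        # Scene detection found enough cuts: use them as-is.
        return sorted(scene_times)
    # Too few cuts: add evenly spaced samples at 0, interval, 2 * interval, ...
    sample_count = math.ceil(duration / thumbnail_interval)
    interval_times = {round(i * thumbnail_interval, 3) for i in range(sample_count)}
    return sorted(set(scene_times) | interval_times)


# A 60-second clip with only two detected cuts falls back to interval sampling.
print(choose_timestamps([1.0, 2.0], duration=60, thumbnail_interval=10.0, min_thumbnails=30))
```

With `min_thumbnails=30` the threshold is 15, so two detected cuts trigger the fallback and the result combines both sources.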

	---

	## Test 5: API Endpoints

	### 5.1 Start Backend Server

	```bash
	cd /home/user/Search-UI
	tmux new -d -s backend "cd backend && source venv/bin/activate && uvicorn main:app --reload --host 0.0.0.0 2>&1 | tee ../backend.log"
	sleep 3
	tail -5 backend.log
	```

	Expected: Server running on http://0.0.0.0:8000

	### 5.2 Test GET /api/settings/series

	```bash
	curl -s "http://localhost:8000/api/settings/series?language=E" | python3 -m json.tool | head -50
	```

	Expected Response Structure:
	```json
	{
	"settings": [...],
	"available_categories": [...],
	"presets": {
	"static": {"scene_threshold": 0.12, ...},
	"moderate": {...},
	"dynamic": {...}
	},
	"defaults": {
	"scene_threshold": 0.29,
	"thumbnail_interval": 3.0,
	"min_thumbnails": 30
	}
	}
	```
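
Rather than comparing the response by eye, the structure can be checked programmatically. The sketch below validates only the keys shown in the expected structure above; `validate_settings_payload()` is a hypothetical helper, and the sample payloads are made up for the demonstration.

```python
REQUIRED_TOP_LEVEL = ("settings", "available_categories", "presets", "defaults")
REQUIRED_PRESETS = ("static", "moderate", "dynamic")
REQUIRED_DEFAULT_KEYS = ("scene_threshold", "thumbnail_interval", "min_thumbnails")


def validate_settings_payload(payload):
    """Return a list of human-readable problems; an empty list means the shape looks right."""
    problems = []
    for key in REQUIRED_TOP_LEVEL:
        if key not in payload:
            problems.append(f"missing top-level key: {key}")
    for name in REQUIRED_PRESETS:
        if name not in payload.get("presets", {}):
            problems.append(f"missing preset: {name}")
    for key in REQUIRED_DEFAULT_KEYS:
        if key not in payload.get("defaults", {}):
            problems.append(f"missing default: {key}")
    return problems


good = {
    "settings": [],
    "available_categories": [],
    "presets": {"static": {}, "moderate": {}, "dynamic": {}},
    "defaults": {"scene_threshold": 0.29, "thumbnail_interval": 3.0, "min_thumbnails": 30},
}
bad = {"settings": [], "presets": {"static": {}}}

print(validate_settings_payload(good))
print(validate_settings_payload(bad))
```

To run it against the live endpoint, load the curl output with `json.load(sys.stdin)` and pass the result to the function.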

	### 5.3 Test POST /api/settings/series

	```bash
	# Add a test setting
	curl -X POST "http://localhost:8000/api/settings/series?category=TestDrama&thumbnail_interval=2.0&scene_threshold=0.35&min_thumbnails=50&preset=dynamic"

	# Verify it was added
	curl -s "http://localhost:8000/api/settings/series?language=E" | python3 -c "
	import sys, json
	data = json.load(sys.stdin)
	for s in data.get('settings', []):
	    if s.get('category') == 'TestDrama':
	        print(f'Found: {s}')
	        break
	else:
	    print('Not found')
	"

	# Clean up
	curl -s "http://localhost:8000/api/settings/series?language=E" | python3 -c "
	import sys, json
	data = json.load(sys.stdin)
	for s in data.get('settings', []):
	    if s.get('category') == 'TestDrama':
	        print(f'Setting ID to delete: {s[\"id\"]}')
	"
	# Delete using the ID from above
	# curl -X DELETE "http://localhost:8000/api/settings/series/{id}"
	```
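
When repeating this add, verify, and delete cycle for several categories, it can help to build the URLs and pick out the IDs in Python. The sketch below only constructs strings and filters a payload; it sends no requests. The helper names are hypothetical, and the DELETE path follows the commented-out curl command above.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000/api/settings/series"


def build_add_url(category, preset, **values):
    """Build the POST URL with query parameters in the same order as the curl example."""
    params = {"category": category, **values, "preset": preset}
    return f"{BASE_URL}?{urlencode(params)}"


def find_setting_ids(payload, category):
    """Return the IDs of all settings entries for a category, ready for cleanup."""
    return [s["id"] for s in payload.get("settings", []) if s.get("category") == category]


def build_delete_url(setting_id):
    return f"{BASE_URL}/{setting_id}"


url = build_add_url("TestDrama", "dynamic", thumbnail_interval=2.0, scene_threshold=0.35, min_thumbnails=50)
print(url)

# Made-up payload standing in for a GET /api/settings/series response.
sample = {"settings": [{"id": 7, "category": "TestDrama"}, {"id": 8, "category": "Other"}]}
print([build_delete_url(i) for i in find_setting_ids(sample, "TestDrama")])
```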

	---

	## Test 6: Frontend Settings Page

	### 6.1 Start Frontend Server

	```bash
	cd /home/user/Search-UI
	tmux new -d -s frontend "cd frontend && npm run dev -- --host 0.0.0.0 2>&1 | tee ../frontend.log"
	sleep 5
	tail -5 frontend.log
	```

	### 6.2 Visual Testing Checklist

	Open a browser at http://localhost:5173 (or your network IP on port 5173).

	Navigate to Settings page and verify:

	- [ ] Presets Section displays all three presets with values:
	  - Static: Threshold 0.12, Interval 10s, Min 30
	  - Moderate: Threshold 0.20, Interval 5s, Min 35
	  - Dynamic: Threshold 0.35, Interval 2s, Min 50
	  - Default: Shows current defaults

	- [ ] Add Series Setting Form has:
	  - Category dropdown (populated from available categories)
	  - Subcategory dropdown (optional)
	  - Preset dropdown (static, moderate, dynamic, or custom)
	  - Scene Threshold input (0.05-0.5, step 0.01)
	  - Fallback Interval input (0.5-60, step 0.5)
	  - Min Thumbnails input (10-200, step 5)

	- [ ] Preset Selection auto-fills values:
	  - Select "Static" preset → threshold=0.12, interval=10, min=30
	  - Select "Dynamic" preset → threshold=0.35, interval=2, min=50
	  - Custom settings inputs become disabled when a preset is selected

	- [ ] Current Settings Table shows columns:
	  - Category | Subcategory | Threshold | Interval | Min | Preset | Actions

	- [ ] Add/Delete works:
	  - Add a test category with dynamic preset
	  - Verify it appears in table with correct values
	  - Delete it and verify it's removed

	---

	## Test 7: End-to-End Video Processing

	### 7.1 Test with Single Video

	Prerequisites: Have a test video in the system with a known category.

	```bash
	# Process a single video (replace YOUR_TEST_KEY with the video's natural key)
	curl -X POST "http://localhost:8000/api/process-video?natural_key=YOUR_TEST_KEY&label=480p"
	```

	### 7.2 Verify Processing Metadata Saved

	```bash
	cd /home/user/Search-UI/backend
	python3 -c "
	from database import get_db
	conn = get_db()
	cursor = conn.execute('''
	SELECT natural_key, resolution_label, scene_threshold,
	siglip_model_version, labels_version, duration_seconds
	FROM processed_videos
	ORDER BY processed_at DESC
	LIMIT 5
	''')
	print('Recent processed videos:')
	for row in cursor.fetchall():
	    print(f' {row}')
	"
	```

	Expected: Shows processing metadata including scene_threshold, model versions
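
One use of this metadata is to spot rows that were processed under different model versions or thresholds than the ones currently configured. The sketch below compares a row, represented as a dict, against a set of current values. It is not the project's reprocessing logic: `stale_fields()` is a hypothetical helper, the `labels_version` value is a placeholder rather than a real hash, and the other two values come from the expected output of Test 2.1.

```python
CURRENT_VERSIONS = {
    "siglip_model_version": "siglip2-so400m-patch16-naflex-v1",  # from Test 2.1
    "classification_threshold": 0.1,                              # from Test 2.1
    "labels_version": "placeholder-hash",                         # placeholder, not a real value
}


def stale_fields(row, current=CURRENT_VERSIONS):
    """Return the names of version fields whose stored value differs from the current one."""
    return [name for name, value in current.items() if row.get(name) != value]


fresh_row = dict(CURRENT_VERSIONS, natural_key="vid_a")
old_row = dict(fresh_row, siglip_model_version="some-older-version")

print(stale_fields(fresh_row))
print(stale_fields(old_row))
```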

	### 7.3 Verify Thumbnail Count

	After processing, check the thumbnails directory:

	```bash
	# Replace with actual natural_key and label
	ls -1 /home/user/Search-UI/backend/videos/{natural_key}/{label}/thumbnails/ | wc -l
	```

	Expected: Number of thumbnails appropriate for the content type and settings
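
For a count that ignores subdirectories and non-image files, a short Python helper can replace the shell pipeline. This is a sketch under assumptions: the set of thumbnail file extensions is a guess, and `count_thumbnails()` is a hypothetical helper demonstrated on a temporary directory rather than on real output.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}  # assumed thumbnail formats


def count_thumbnails(thumbnail_dir):
    """Count image files only, ignoring subdirectories and stray non-image files."""
    return sum(
        1
        for path in Path(thumbnail_dir).iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
    )


# Demo against a throwaway directory containing three images and one text file.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("0001.jpg", "0002.jpg", "0003.JPG", "notes.txt"):
        (Path(tmp) / name).touch()
    demo_count = count_thumbnails(tmp)

print(demo_count)
```

Comparing the result against the `min_thumbnails` value configured for the video's category gives a quick sanity check on the run.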

	---

	## Automated Test Script

	Run the comprehensive test script:

	```bash
	cd /home/user/Search-UI
	python3 scratchpad/2025-12-25-1708-test-phase1-3.py
	```

	Expected: All 6 modules pass (6/6)

	---

	## Known Limitations

	### Phase 4 (Deferred)
	The Unified Person system (combining face + speaker recognition) is planned for a future release. It requires:
	- ECAPA-TDNN model integration for speaker embeddings
	- New Person management UI
	- Cross-reference system for face ↔ speaker associations

	### Phase 5 (Deferred)
	Validation and performance improvements, to be addressed after Phase 4:
	- Bulk reprocessing detection
	- Performance profiling

	---

	## Troubleshooting

	### Backend Won't Start
	```bash
	# Check for Python environment issues
	cd /home/user/Search-UI/backend
	source venv/bin/activate
	pip list | grep -E "fastapi|uvicorn|sqlite-vec"
	```

	### Database Migration Issues
	```bash
	# Force recreate tables (WARNING: loses data)
	cd /home/user/Search-UI/backend
	python3 -c "
	from database import get_db
	conn = get_db()
	# Drop and recreate problematic tables
	conn.execute('DROP TABLE IF EXISTS image_categories')
	# Re-run the app to recreate
	"
	```

	### Frontend Build Issues
	```bash
	cd /home/user/Search-UI/frontend
	npm install
	npm run build
	```

	---

	## Summary Checklist

	Before processing your video library, verify:

	- [ ] All automated tests pass (6/6)
	- [ ] Backend API returns presets and defaults correctly
	- [ ] Frontend Settings page displays and functions correctly
	- [ ] Test video processes with correct scene threshold
	- [ ] Processing metadata saved to database
	- [ ] Thumbnail count appropriate for content type