# Testing & Validation Guide
This guide outlines how to test and validate that the improvements have fixed the issues and improved system performance.
## Quick Validation Checklist
### ✅ Phase 1: Critical Fixes Validation
#### 1.1 Test Class Indexing Fix
**Goal**: Verify mAP is no longer 0%
```bash
# Run a short training run (1-2 epochs) to check initial metrics
python scripts/train_detr.py \
    --config configs/training.yaml \
    --train-dir datasets/train \
    --val-dir datasets/val \
    --output-dir models
# Check validation output - should see:
# - Player mAP > 0 (was 0.00%)
# - Ball mAP > 0 (was 0.00%)
# - No "All Background" warnings
```
**Expected Results**:
- ✅ Player mAP@0.5 > 0.0 (should be > 0.10 after 1 epoch)
- ✅ Ball mAP@0.5 > 0.0 (should be > 0.05 after 1 epoch)
- ✅ No zero recall/precision for players
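The fix being validated here is typically an off-by-one between 0-based COCO category ids and the model's label space: DETR-style heads reserve label 0 for "no object", so real classes must start at 1. A minimal sketch of the required shift (the function name is illustrative, not this repo's API):

```python
def to_detr_labels(coco_labels):
    # Shift 0-based COCO category ids up by one; index 0 is reserved
    # for the "no object" / background class in DETR-style heads.
    return [c + 1 for c in coco_labels]

print(to_detr_labels([0, 1]))  # [1, 2]
```

If labels are fed to the model 0-based instead, every real object is scored against the background class and mAP collapses to 0%.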
#### 1.2 Test Focal Loss vs Class Weights
**Goal**: Verify Focal Loss improves precision over 25x class weights
```bash
# Train with Focal Loss (current config)
python scripts/train_detr.py --config configs/training.yaml
# Monitor ball precision in MLflow/TensorBoard
# Should see: Ball Precision climbing well above the previous 0.14% baseline
```
**Expected Results**:
- ✅ Ball Precision@0.5 > 0.20 (improved from 0.14%)
- ✅ Ball Recall@0.5 > 0.50 (maintains or improves on 58%)
- ✅ Fewer false positives (lower average predictions per image)
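For context, Focal Loss (Lin et al.) replaces blanket class weights by down-weighting examples the model already classifies confidently, so training focuses on hard cases like small balls. A self-contained scalar sketch of the binary form, assuming the standard alpha/gamma parameterization:

```python
import math

def binary_focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Focal loss for one prediction: down-weights easy examples via (1 - pt)^gamma."""
    pt = p if y == 1 else 1.0 - p          # probability assigned to the true class
    a = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    return -a * (1.0 - pt) ** gamma * math.log(pt)

# An easy positive (pt = 0.9) contributes far less than a hard one (pt = 0.1):
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(easy < hard)  # True
```

With `gamma=0` and `alpha=1` this reduces to plain cross-entropy, which is a handy sanity check when tuning the alpha/gamma values mentioned under "Next Steps".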
### ✅ Phase 2: Architecture Validation
#### 2.1 Test RF-DETR Integration
**Goal**: Verify RF-DETR can be loaded (full training requires RF-DETR's native API)
```python
# Quick test script
from src.training.model import get_detr_model
import yaml

with open('configs/training.yaml') as f:
    config = yaml.safe_load(f)
config['model']['architecture'] = 'rfdetr'

try:
    model = get_detr_model(config['model'], config['training'])
    print("✅ RF-DETR model loaded successfully")
except Exception as e:
    print(f"⚠️ RF-DETR not available: {e}")
    print("Note: Full RF-DETR training requires native API")
```
### ✅ Phase 3: Advanced Features Validation
#### 3.1 Test Copy-Paste Augmentation
**Goal**: Verify ball class balancing works
```python
# Test augmentation
from src.training.augmentation import CopyPasteAugmentation
from PIL import Image
import torch

# Create dummy ball patches
ball_patches = [(Image.new('RGB', (20, 20), 'white'), {})]
aug = CopyPasteAugmentation(prob=1.0, max_pastes=3)
aug.set_ball_patches(ball_patches)

# Test on sample image
img = Image.open('datasets/train/images/sample.jpg')
target = {
    'boxes': torch.tensor([[100, 100, 150, 150]]),
    'labels': torch.tensor([1]),  # 1-based: player
}
aug_img, aug_target = aug(img, target)
print(f"Original boxes: {len(target['boxes'])}")
print(f"Augmented boxes: {len(aug_target['boxes'])}")
# Should have more boxes (pasted balls)
```
**Expected Results**:
- ✅ More ball annotations in training batches
- ✅ Improved ball recall during training
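Conceptually, each paste picks an in-bounds location for a ball patch and appends the corresponding box and label to the target. A sketch of that bookkeeping, independent of the project's implementation (the `ball_label=2` id is an assumption; use whatever your dataset maps the ball class to):

```python
import random

def paste_ball(boxes, labels, img_w, img_h, patch_w, patch_h, ball_label=2):
    """Append one pasted-patch box at a random location fully inside the image."""
    x = random.randint(0, img_w - patch_w)
    y = random.randint(0, img_h - patch_h)
    boxes.append([x, y, x + patch_w, y + patch_h])
    labels.append(ball_label)
    return boxes, labels

boxes, labels = paste_ball([[100, 100, 150, 150]], [1], 1920, 1080, 20, 20)
print(len(boxes))  # 2
```

In a real implementation the patch pixels are also blended into the image and pastes overlapping existing boxes are rejected; the point here is only that every paste must add a matching annotation.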
#### 3.2 Test SAHI Inference
**Goal**: Verify small ball detection improves
```python
# Test SAHI on validation image
from src.training.sahi_inference import sahi_predict
from PIL import Image

model = load_trained_model()  # placeholder: load your trained model
img = Image.open('datasets/val/images/sample.jpg')

# Standard inference
standard_preds = model([preprocess(img)])  # preprocess: your usual input transform

# SAHI inference
sahi_preds = sahi_predict(model, img, slice_size=640, overlap_ratio=0.2)
print(f"Standard detections: {len(standard_preds['boxes'])}")
print(f"SAHI detections: {len(sahi_preds['boxes'])}")
# SAHI should detect more small balls
```
**Expected Results**:
- ✅ More ball detections with SAHI
- ✅ Better recall for small balls (< 20x20 pixels)
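Independently of `sahi_predict`'s internals, the slicing geometry can be sanity-checked on paper: slices of `slice_size` advance by `slice_size * (1 - overlap_ratio)`, and a final slice is pinned flush to each far edge so nothing is missed. A sketch under those assumptions (per-slice predictions would then be offset back to full-image coordinates and merged with NMS):

```python
def slice_grid(w, h, slice_size=640, overlap_ratio=0.2):
    """Corner coordinates (x1, y1, x2, y2) of overlapping slices covering w x h."""
    stride = int(slice_size * (1 - overlap_ratio))  # 512 for the defaults

    def starts(dim):
        s = list(range(0, max(dim - slice_size, 0) + 1, stride))
        if s[-1] + slice_size < dim:   # pin a final slice to the far edge
            s.append(dim - slice_size)
        return s

    return [(x, y, x + slice_size, y + slice_size)
            for y in starts(h) for x in starts(w)]

tiles = slice_grid(1920, 1080)
print(len(tiles))  # 8 slices for a 1080p frame
```

Each object then occupies a much larger fraction of its 640x640 slice than of the full frame, which is why tiny balls become detectable.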
#### 3.3 Test ByteTrack Integration
**Goal**: Verify temporal tracking consistency
```python
# Test ByteTrack on video sequence
from src.tracker import ByteTrackerWrapper
import torch

tracker = ByteTrackerWrapper(frame_rate=30)

# Simulate detections across frames
for frame_idx in range(10):
    detections = {
        'boxes': torch.tensor([[100, 100, 120, 120]]),
        'scores': torch.tensor([0.8]),
        'labels': torch.tensor([1]),  # ball
    }
    tracked = tracker.update(detections, (1080, 1920))
    print(f"Frame {frame_idx}: {len(tracked)} tracks")
    if tracked:
        print(f"  Track ID: {tracked[0]['track_id']}")
```
**Expected Results**:
- ✅ Consistent track IDs across frames
- ✅ Ball tracks persist even with low-confidence detections
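ByteTrack's core trick, which the low-confidence persistence above relies on, is a two-pass association: high-confidence detections are matched to tracks first, then low-score ones (e.g. a motion-blurred ball) get a chance against whatever tracks remain unmatched. A greedy IoU sketch of that idea (not the wrapper's actual internals, which use Hungarian matching and a Kalman filter):

```python
def iou(a, b):
    """IoU of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0) * max(iy2 - iy1, 0)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if inter else 0.0

def associate(tracks, dets, scores, high=0.6, iou_thr=0.3):
    """Two-pass greedy matching: high-score detections first, then low-score."""
    matches, unmatched = [], set(range(len(tracks)))
    for pass_high in (True, False):
        for d, (box, s) in enumerate(zip(dets, scores)):
            if (s >= high) != pass_high:
                continue  # each detection participates in exactly one pass
            best = max(unmatched, key=lambda t: iou(tracks[t], box), default=None)
            if best is not None and iou(tracks[best], box) >= iou_thr:
                matches.append((best, d))
                unmatched.discard(best)
    return matches

tracks = [[100, 100, 120, 120]]
dets, scores = [[102, 102, 122, 122]], [0.3]  # low confidence, as for a blurred ball
print(associate(tracks, dets, scores))  # [(0, 0)] - the track survives
```

A plain tracker would have discarded the 0.3-score detection at the confidence threshold and dropped the track; the second pass is what keeps the ball's track ID alive.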
#### 3.4 Test Homography/GSR
**Goal**: Verify pixel-to-pitch coordinate transformation
```python
# Test homography estimation
from src.analysis.homography import HomographyEstimator
import numpy as np
from PIL import Image

estimator = HomographyEstimator(pitch_width=105.0, pitch_height=68.0)
img = np.array(Image.open('datasets/val/images/sample.jpg'))

# Estimate homography (auto or manual)
success = estimator.estimate(img)
if success:
    # Transform a point
    pixel_point = (960, 540)  # Center of 1920x1080 image
    pitch_point = estimator.transform(pixel_point)
    print(f"Pixel {pixel_point} -> Pitch {pitch_point}")
```
**Expected Results**:
- ✅ Homography matrix estimated successfully
- ✅ Points transform correctly to pitch coordinates
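Under the hood, transforming a point means multiplying it (in homogeneous coordinates) by the 3x3 homography and dividing by the resulting w component. A sketch with a toy matrix that merely rescales a 1920x1080 frame onto a 105 m x 68 m pitch; a real estimated homography also corrects perspective, so its bottom row would not be `[0, 0, 1]`:

```python
def apply_homography(H, pt):
    """Map an image point through a 3x3 homography with a perspective divide."""
    x, y = pt
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)

# Toy matrix: pure rescale, no perspective correction
H = [[105 / 1920, 0, 0], [0, 68 / 1080, 0], [0, 0, 1]]
print(tuple(round(v, 6) for v in apply_homography(H, (960, 540))))  # (52.5, 34.0)
```

The image centre landing on the pitch centre (52.5 m, 34 m) is a quick plausibility check to run against the estimator's output as well.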
### ✅ Phase 4: Data Quality Validation
#### 4.1 Test CLAHE Enhancement
**Goal**: Verify contrast improvement for synthetic fog
```python
# Visual test
from src.training.augmentation import CLAHEAugmentation
from PIL import Image
import torch

aug = CLAHEAugmentation(clip_limit=2.0, tile_grid_size=(8, 8))
img = Image.open('datasets/train/images/sample.jpg')
target = {'boxes': torch.tensor([]), 'labels': torch.tensor([])}
enhanced_img, _ = aug(img, target)
enhanced_img.save('enhanced_sample.jpg')
# Compare visually - should see better contrast
```
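The `clip_limit` parameter caps how much any histogram bin within a tile can contribute before equalization, which is what keeps CLAHE from amplifying noise in flat regions like fog. A sketch of the clipping step on a single tile's histogram (illustrative of the mechanism only, not OpenCV's exact redistribution, which also spreads the integer remainder):

```python
def clip_histogram(hist, clip_limit):
    """Clip bins at clip_limit and redistribute the excess evenly
    (remainder ignored in this sketch)."""
    excess = sum(max(h - clip_limit, 0) for h in hist)
    clipped = [min(h, clip_limit) for h in hist]
    bonus = excess // len(hist)
    return [h + bonus for h in clipped]

print(clip_histogram([10, 0, 0, 0], 4))  # [5, 1, 1, 1]
```

A lower `clip_limit` flattens contrast gains more aggressively; the 2.0 used above is a conservative default.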
#### 4.2 Test Motion Blur
**Goal**: Verify motion blur augmentation works
```python
# Test motion blur
from src.training.augmentation import MotionBlurAugmentation
from PIL import Image
import torch

aug = MotionBlurAugmentation(prob=1.0, max_kernel_size=15)
img = Image.open('datasets/train/images/sample.jpg')
target = {'boxes': torch.tensor([]), 'labels': torch.tensor([])}
blurred_img, _ = aug(img, target)
blurred_img.save('blurred_sample.jpg')
```
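A linear motion-blur kernel is just a normalized averaging line: convolving the image with it smears each pixel along the motion direction, and a longer kernel means stronger blur. A sketch of the horizontal case, assuming `max_kernel_size` above bounds the kernel length (real implementations also rotate the line for arbitrary motion angles):

```python
def horizontal_blur_kernel(size):
    """Normalized 1 x size averaging kernel; convolving with it smears
    pixels horizontally, mimicking camera or subject motion."""
    return [[1.0 / size] * size]

k = horizontal_blur_kernel(15)
print(abs(sum(k[0]) - 1.0) < 1e-9)  # True: kernel weights sum to 1
```

Normalization matters: a kernel that does not sum to 1 would brighten or darken the image in addition to blurring it.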
## Comprehensive Training Test
### Full Training Run with Monitoring
```bash
# 1. Install new dependencies
pip install -r requirements.txt
# 2. Start training with all improvements
python scripts/train_detr.py \
    --config configs/training.yaml \
    --train-dir datasets/train \
    --val-dir datasets/val \
    --output-dir models
# 3. Monitor in MLflow (recommended)
mlflow ui --backend-store-uri file:./mlruns
# Open http://localhost:5000
# 4. Or monitor in TensorBoard
tensorboard --logdir logs
# Open http://localhost:6006
```
### Key Metrics to Monitor
**Training Metrics** (should improve):
- Training loss: Should decrease smoothly
- Focal Loss component: Should focus on hard examples
- Learning rate: Should follow cosine schedule
**Validation Metrics** (critical improvements):
- **Player mAP@0.5**: Target > 0.85 (was 0.00%)
- **Player Recall@0.5**: Target > 0.95 (was 0.00%)
- **Ball mAP@0.5**: Target > 0.70 (was low)
- **Ball Precision@0.5**: Target > 0.70 (was 0.14%)
- **Ball Recall@0.5**: Target > 0.80 (was ~58%)
- **Ball Avg Predictions**: Should be ~1.0 per image (not excessive)
### Comparison: Before vs After
Create a comparison script:
```python
# scripts/compare_metrics.py
import json
# Load old metrics (from previous training)
with open('old_metrics.json') as f:
    old_metrics = json.load(f)

# Load new metrics (from current training)
with open('new_metrics.json') as f:
    new_metrics = json.load(f)
print("Metric Comparison:")
print(f"Player mAP: {old_metrics['player_map']:.4f} -> {new_metrics['player_map']:.4f}")
print(f"Ball Precision: {old_metrics['ball_precision']:.4f} -> {new_metrics['ball_precision']:.4f}")
print(f"Ball Recall: {old_metrics['ball_recall']:.4f} -> {new_metrics['ball_recall']:.4f}")
```
## Quick Diagnostic Script
Run this to verify all fixes are working:
```bash
# scripts/quick_validation.py
python -c "
from src.training.dataset import CocoDataset
from src.training.model import get_detr_model
import yaml

# Test 1: Dataset labels are 1-based
config = yaml.safe_load(open('configs/training.yaml'))
dataset = CocoDataset('datasets/train', transforms=None)
sample = dataset[0]
labels = sample[1]['labels']
print(f'✅ Dataset labels: {labels.unique().tolist()} (should be [1, 2] for 1-based)')

# Test 2: Model can be created
model = get_detr_model(config['model'], config['training'])
print('✅ Model created successfully')

# Test 3: Focal Loss config
focal_enabled = config['training']['focal_loss']['enabled']
print(f'✅ Focal Loss enabled: {focal_enabled}')

# Test 4: Class weights disabled
weights_enabled = config['training']['class_weights']['enabled']
print(f'✅ Class weights disabled: {not weights_enabled}')

print('\n🎉 All critical fixes verified!')
"
```
## Expected Timeline
- **Epoch 1-5**: Should see mAP > 0 immediately (confirms the indexing fix)
- **Epoch 10**: Ball precision should improve (Focal Loss working)
- **Epoch 20**: Copy-Paste should show improved ball recall
- **Epoch 50+**: Should approach target metrics
## Troubleshooting
If metrics don't improve:
1. **Still seeing 0% mAP?**
- Check dataset labels are 1-based: `dataset[0][1]['labels']`
- Verify model expects 1-based: Check `model.py` line 119
2. **Ball precision still low?**
- Verify Focal Loss is enabled in config
- Check Focal Loss is being applied (add debug prints)
3. **No improvement with Copy-Paste?**
- Verify ball patches are being extracted
- Check augmentation is enabled in config
4. **SAHI not working?**
- Verify image slicing is correct
- Check NMS is merging predictions properly
## Next Steps After Validation
Once improvements are confirmed:
1. **Fine-tune hyperparameters**: Adjust Focal Loss alpha/gamma
2. **Optimize augmentations**: Tune Copy-Paste probability
3. **Scale up training**: Increase epochs if metrics still improving
4. **Deploy improvements**: Use trained model for inference