# Threshold Justification Report

Auto-generated by `evaluation/justify_thresholds.py` using leave-one-participant-out (LOPO) cross-validation over 9 participants (~145k samples).

## 0. Latest random split checkpoints (15% test split)

From the latest training runs:

| Model | Accuracy | F1 | ROC-AUC |
|-------|----------|-----|---------|
| XGBoost | 95.87% | 0.9585 | 0.9908 |
| MLP | 92.92% | 0.9287 | 0.9714 |

## 1. ML Model Decision Thresholds

XGBoost config used for this report: `{'n_estimators': 600, 'max_depth': 8, 'learning_rate': 0.1489, 'subsample': 0.9625, 'colsample_bytree': 0.9013, 'reg_alpha': 1.1407, 'reg_lambda': 2.4181, 'eval_metric': 'logloss'}`.

Thresholds selected via **Youden's J statistic** (J = sensitivity + specificity - 1) on pooled LOPO held-out predictions.

| Model | LOPO AUC | Optimal Threshold (Youden's J) | F1 @ Optimal | F1 @ 0.50 |
|-------|----------|-------------------------------|--------------|-----------|
| MLP | 0.8624 | **0.228** | 0.8578 | 0.8149 |
| XGBoost | 0.8695 | **0.280** | 0.8549 | 0.8324 |
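
As an illustration of the selection rule, the following minimal sketch scans every distinct score as a candidate cut-off and keeps the one that maximises J. The function name `youden_threshold` and the toy labels and scores are hypothetical; this is not the code in `evaluation/justify_thresholds.py`.

```python
def youden_threshold(y_true, y_score):
    """Return (threshold, J) maximising J = sensitivity + specificity - 1.

    A sample is predicted positive when its score is >= threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both classes to compute Youden's J")

    best_threshold, best_j = None, float("-inf")
    # Every distinct score is a candidate cut-off; ties keep the lowest one.
    for threshold in sorted(set(y_score)):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy pooled held-out predictions (illustrative values only, not report data).
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.25, 0.40, 0.70, 0.90]
threshold, j = youden_threshold(labels, scores)
```

This quadratic scan is only meant to make the rule explicit. For a pool of ~145k predictions, a sorted sweep or `sklearn.metrics.roc_curve` (taking the threshold at the maximum of `tpr - fpr`) would be the more practical route.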

![MLP ROC](plots/roc_mlp.png)

![XGBoost ROC](plots/roc_xgboost.png)

## 2. Geometric Pipeline Weights (s_face vs s_eye)

Grid search over face weight alpha in {0.2, 0.3, ..., 0.8}, with eye weight = 1 - alpha. The decision threshold is chosen per fold via Youden's J.

| Face Weight (alpha) | Mean LOPO F1 |
|--------------------:|-------------:|
| 0.2 | 0.7926 |
| 0.3 | 0.8002 |
| 0.4 | 0.7719 |
| 0.5 | 0.7868 |
| 0.6 | 0.8184 |
| 0.7 | 0.8195 **<-- selected** |
| 0.8 | 0.8126 |

**Best:** alpha = 0.7 (face 70%, eye 30%)
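
The search can be sketched as follows, assuming the geometric score is the weighted sum `alpha * s_face + (1 - alpha) * s_eye`. The helper names, the toy samples, and the fixed 0.5 cut-off are assumptions made to keep the example short; the report itself picks a threshold per fold with Youden's J and averages F1 over the LOPO folds.

```python
def geo_score(s_face, s_eye, alpha):
    """Weighted combination of face and eye sub-scores (assumed linear form)."""
    return alpha * s_face + (1.0 - alpha) * s_eye


def f1_score(y_true, y_pred):
    """F1 for binary labels; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def grid_search_alpha(samples, labels, alphas, threshold=0.5):
    """Score every alpha on the grid and return (best_alpha, {alpha: f1})."""
    results = {}
    for alpha in alphas:
        preds = [int(geo_score(f, e, alpha) >= threshold) for f, e in samples]
        results[alpha] = f1_score(labels, preds)
    best = max(results, key=results.get)  # first maximum wins on ties
    return best, results


# Toy (s_face, s_eye) pairs with focus labels; illustrative values only.
samples = [(0.9, 0.2), (0.8, 0.9), (0.3, 0.9), (0.2, 0.1)]
labels = [1, 1, 0, 0]
alphas = [k / 10 for k in range(2, 9)]  # 0.2, 0.3, ..., 0.8
best_alpha, f1_by_alpha = grid_search_alpha(samples, labels, alphas)
```

On this toy data, larger face weights win because the only hard negative has open eyes but a turned head. That is a property of the made-up samples, not evidence for the reported result.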

![Geometric weight search](plots/geo_weight_search.png)

## 3. Hybrid Pipeline Weights (MLP vs Geometric)

Grid search over w_mlp in {0.3, 0.4, ..., 0.8}, with w_geo = 1 - w_mlp. The geometric sub-score uses the same weights as the geometric pipeline (face = 0.7, eye = 0.3). If you change the geometric weights, re-run this script, because the optimal w_mlp can shift.

| MLP Weight (w_mlp) | Mean LOPO F1 |
|-------------------:|-------------:|
| 0.3 | 0.8409 **<-- selected** |
| 0.4 | 0.8246 |
| 0.5 | 0.8164 |
| 0.6 | 0.8106 |
| 0.7 | 0.8039 |
| 0.8 | 0.8016 |

**Best:** w_mlp = 0.3 (MLP 30%, geometric 70%). This is the lower edge of the searched grid; values below 0.3 were not evaluated.
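
To see what the selected weights imply for each input, here is a small sketch that assumes the hybrid score is a convex combination of the MLP probability and the geometric sub-score. The function names are hypothetical and the linear form is an assumption based on w_geo = 1 - w_mlp.

```python
def hybrid_score(p_mlp, s_face, s_eye, w_mlp=0.3, alpha=0.7):
    """Blend the MLP probability with the geometric sub-score (assumed form)."""
    s_geo = alpha * s_face + (1.0 - alpha) * s_eye
    return w_mlp * p_mlp + (1.0 - w_mlp) * s_geo


def effective_weights(w_mlp=0.3, alpha=0.7):
    """Expand the nested weights into one weight per input signal."""
    w_geo = 1.0 - w_mlp
    return {
        "mlp": w_mlp,
        "face": w_geo * alpha,          # 0.7 * 0.7
        "eye": w_geo * (1.0 - alpha),   # 0.7 * 0.3
    }


weights = effective_weights()
example = hybrid_score(p_mlp=0.2, s_face=1.0, s_eye=1.0)
```

With the selected values, the face score carries roughly 49% of the final score, the MLP 30%, and the eye score 21%. This is why a change to the geometric weights can shift the optimal w_mlp.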

![Hybrid weight search](plots/hybrid_weight_search.png)

## 4. Eye and Mouth Aspect Ratio Thresholds

### EAR (Eye Aspect Ratio)

Reference: Soukupova & Cech, "Real-Time Eye Blink Detection Using Facial Landmarks" (2016) established EAR ~ 0.2 as a blink threshold.

Our thresholds define a linear interpolation zone around this established value:

| Constant | Value | Justification |
|----------|------:|---------------|
| `ear_closed` | 0.16 | Below this, eyes are treated as fully shut. 16.3% of samples fall below it. |
| `EAR_BLINK_THRESH` | 0.21 | Blink detection point; close to the 0.2 reference. 21.2% of samples fall below it. |
| `ear_open` | 0.30 | Above this, eyes are treated as fully open. 70.4% of samples fall above it. |

Between 0.16 and 0.30 the `_ear_score` function linearly interpolates from 0 to 1, providing a smooth transition rather than a hard binary cutoff.
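
A minimal sketch of such an interpolation, using the two constants from the table, is shown below. The name `ear_score_sketch` is hypothetical and the actual `_ear_score` implementation may differ in detail.

```python
EAR_CLOSED = 0.16  # at or below this the score is 0
EAR_OPEN = 0.30    # at or above this the score is 1


def ear_score_sketch(ear, closed=EAR_CLOSED, opened=EAR_OPEN):
    """Map an eye aspect ratio to [0, 1] by clamped linear interpolation."""
    t = (ear - closed) / (opened - closed)
    return min(1.0, max(0.0, t))


for ear in (0.10, 0.16, 0.21, 0.23, 0.30, 0.35):
    print(f"EAR={ear:.2f} -> score={ear_score_sketch(ear):.3f}")
```

Under this mapping, the blink threshold of 0.21 sits about 36% of the way up the ramp, so a blink lowers the score sharply but not to zero.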

![EAR distribution](plots/ear_distribution.png)

### MAR (Mouth Aspect Ratio)

| Constant | Value | Justification |
|----------|------:|---------------|
| `MAR_YAWN_THRESHOLD` | 0.55 | Only 1.7% of samples exceed this, so the threshold fires rarely, as expected for yawns. The distribution alone does not measure the false-positive rate. |

![MAR distribution](plots/mar_distribution.png)

## 5. Other Constants

| Constant | Value | Rationale |
|----------|------:|-----------|
| `gaze_max_offset` | 0.28 | Max iris displacement (normalised) before gaze score drops to zero. Corresponds to ~56% of the eye width; beyond this the iris is at the extreme edge. |
| `max_angle` | 22.0 deg | Head deviation beyond which face score = 0. Based on typical monitor-viewing cone: at 60 cm distance and a 24" monitor, the viewing angle is ~20-25 degrees. |
| `roll_weight` | 0.5 | Roll is less indicative of inattention than yaw/pitch (tilting head doesn't mean looking away), so it's down-weighted by 50%. |
| `EMA alpha` | 0.3 | Smoothing factor for focus score. Gives ~3-4 frame effective window; balances responsiveness vs flicker. |
| `grace_frames` | 15 | ~0.5 s at 30 fps before penalising no-face. Allows brief occlusions (e.g. hand gesture) without dropping score. |
| `PERCLOS_WINDOW` | 60 frames | 2 s at 30 fps; a short-window variant of PERCLOS (Dinges & Grace, 1998), which is conventionally computed over windows of a minute or longer. |
| `BLINK_WINDOW_SEC` | 30 s | Blink rate measured over 30 s; typical spontaneous blink rate is 15-20/min (Bentivoglio et al., 1997). |
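
The arithmetic behind several of these rationales can be checked with a short sketch. The helper names are hypothetical. The 16:9 aspect ratio is an assumption, as is the EMA form `alpha * new + (1 - alpha) * previous`, since the table does not state either.

```python
import math


def viewing_half_angle_deg(diagonal_in=24.0, distance_cm=60.0, aspect=(16, 9)):
    """Horizontal angle from screen centre to screen edge, in degrees."""
    w, h = aspect
    width_cm = diagonal_in * 2.54 * w / math.hypot(w, h)
    return math.degrees(math.atan2(width_cm / 2.0, distance_cm))


def frames_to_seconds(frames, fps=30.0):
    """Convert a frame count to seconds at the given frame rate."""
    return frames / fps


def ema_update(previous, new_value, alpha=0.3):
    """One exponential-moving-average step (assumed form)."""
    return alpha * new_value + (1.0 - alpha) * previous


half_angle = viewing_half_angle_deg()   # compare with max_angle = 22.0
grace_s = frames_to_seconds(15)         # grace_frames
perclos_s = frames_to_seconds(60)       # PERCLOS_WINDOW

# Step response: focus score jumps from 0 to 1 and the EMA follows.
smoothed = 0.0
for _ in range(4):
    smoothed = ema_update(smoothed, 1.0)
```

Under these assumptions the half-angle lands inside the quoted 20-25 degree range, and after four frames the smoothed score has covered about three quarters of a step change, which matches the stated ~3-4 frame effective window (roughly 1/alpha).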