# models/xgboost

Gradient-boosted tree ensemble for binary focus classification. Primary ML model in the FocusGuard pipeline. Uses the same 10 selected features as the MLP.

## Configuration

Final hyperparameters selected from a 40-trial Optuna sweep:

| Parameter | Value | Source |
|-----------|-------|--------|
| n_estimators | 600 | `xgboost.n_estimators` |
| max_depth | 8 | `xgboost.max_depth` |
| learning_rate | 0.1489 | `xgboost.learning_rate` |
| subsample | 0.9625 | `xgboost.subsample` |
| colsample_bytree | 0.9013 | `xgboost.colsample_bytree` |
| reg_alpha | 1.1407 | `xgboost.reg_alpha` |
| reg_lambda | 2.4181 | `xgboost.reg_lambda` |
| eval_metric | logloss | `xgboost.eval_metric` |

## Training

```bash
python -m models.xgboost.train
```

Reads all parameters from `config/default.yaml`. Uses `XGBClassifier` with early stopping on validation logloss.
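
As a rough illustration of how the tuned values and early stopping fit together, the sketch below builds an `XGBClassifier` from the Configuration table. It hard-codes the values for readability, whereas the real script reads them from `config/default.yaml`. The helper names `build_model` and `fit_with_early_stopping` and the `early_stopping_rounds=30` value are assumptions for illustration, not taken from `train.py`. Passing `eval_metric` and `early_stopping_rounds` to the constructor assumes xgboost 1.6 or later.

```python
# Tuned values copied from the Configuration table.
XGB_PARAMS = {
    "n_estimators": 600,
    "max_depth": 8,
    "learning_rate": 0.1489,
    "subsample": 0.9625,
    "colsample_bytree": 0.9013,
    "reg_alpha": 1.1407,
    "reg_lambda": 2.4181,
    "eval_metric": "logloss",
}


def build_model(early_stopping_rounds=30):
    """Return an unfitted classifier; the default of 30 rounds is hypothetical."""
    # Imported lazily so XGB_PARAMS can be inspected without xgboost installed.
    from xgboost import XGBClassifier

    return XGBClassifier(**XGB_PARAMS, early_stopping_rounds=early_stopping_rounds)


def fit_with_early_stopping(model, X_train, y_train, X_val, y_val):
    """Fit the model, stopping once validation logloss stops improving."""
    model.fit(X_train, y_train, eval_set=[(X_val, y_val)], verbose=False)
    return model
```

With early stopping, `n_estimators` acts as an upper bound on the number of boosting rounds rather than a fixed count.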

## Results

### Pooled random split (70/15/15)

| Accuracy | F1 | ROC-AUC |
|----------|-----|---------|
| 95.87% | 0.959 | 0.991 |

### LOPO cross-validation (9 participants)

| Metric | Value |
|--------|-------|
| LOPO AUC | 0.870 |
| Optimal threshold (Youden's J) | 0.280 |
| F1 at optimal threshold | 0.855 |
| F1 at default 0.50 | 0.832 |
| Improvement from threshold tuning | +2.3 pp |
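
Youden's J picks the threshold that maximises `J = TPR - FPR`. The sketch below shows one minimal way to compute it in plain Python; the function name `youden_threshold` and the toy labels and scores are made up for illustration and are not the project's data or code.

```python
def youden_threshold(y_true, y_score):
    """Return (threshold, J) maximising J = TPR - FPR over the observed scores."""
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")

    best_t, best_j = None, float("-inf")
    for t in sorted(set(y_score)):
        # A sample is predicted positive when its score is at or above t.
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= t)
        j = tp / positives - fp / negatives
        if j > best_j:  # strict comparison keeps the lowest threshold on ties
            best_t, best_j = t, j
    return best_t, best_j


# Toy example (invented values): four negatives, three positives.
labels = [0, 0, 0, 0, 1, 1, 1]
scores = [0.05, 0.10, 0.20, 0.35, 0.28, 0.60, 0.90]
threshold, j_stat = youden_threshold(labels, scores)
```

On this toy input the selected threshold is 0.28 with J = 0.75. A threshold below 0.50, as in the table above, trades some precision for recall.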

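The ~12 pp drop in ROC-AUC from the pooled split (0.991) to LOPO (0.870) reflects temporal data leakage: with a random split, temporally adjacent frames from the same participant can land in both the training and test sets. This underscores why person-independent evaluation matters.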
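
Person-independent evaluation means every fold holds out all samples from one participant. A minimal sketch of such a split is shown below; `lopo_splits` is a hypothetical helper, not the project's implementation, and scikit-learn's `LeaveOneGroupOut` offers equivalent behaviour.

```python
def lopo_splits(participant_ids):
    """Yield (held_out, train_idx, test_idx), holding out one participant per fold."""
    for held_out in sorted(set(participant_ids)):
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train_idx, test_idx


# Toy example: five samples from three participants (placeholder IDs).
sample_owner = ["A", "A", "B", "C", "C"]
folds = list(lopo_splits(sample_owner))
```

Because no participant appears on both sides of a fold, neighbouring frames from the same recording cannot leak into the test set.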

### Per-person LOPO (at t* = 0.280)

| Held-out | Acc | F1 | Prec | Rec |
|----------|-----|-----|------|-----|
| Abdelrahman | 0.864 | 0.900 | 0.904 | 0.896 |
| Jarek | 0.872 | 0.903 | 0.902 | 0.904 |
| Junhao | 0.890 | 0.901 | 0.841 | 0.971 |
| Kexin | 0.738 | 0.747 | 0.778 | 0.717 |
| Langyuan | 0.655 | 0.677 | 0.548 | 0.888 |
| Mohamed | 0.881 | 0.894 | 0.843 | 0.952 |
| Yingtao | 0.855 | 0.909 | 0.926 | 0.894 |
| Ayten | 0.841 | 0.905 | 0.861 | 0.954 |
| Saba | 0.923 | 0.925 | 0.956 | 0.896 |
| **Mean ± std** | **0.835 ± 0.080** | **0.862 ± 0.082** | **0.840 ± 0.115** | **0.897 ± 0.070** |

95% CI for mean F1: [0.799, 0.926]
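
The interval can be checked from the per-person F1 column with a Student-t interval on 8 degrees of freedom. The sketch below is a reconstruction, not the project's evaluation code. It assumes the population standard deviation (`ddof=0`), which is what appears to match the reported 0.082 and the interval above.

```python
import math
import statistics

# Per-person F1 values copied from the table above.
per_person_f1 = [0.900, 0.903, 0.901, 0.747, 0.677, 0.894, 0.909, 0.905, 0.925]

# Two-sided 95% Student-t critical value for 8 degrees of freedom.
T_CRIT_DF8 = 2.306


def mean_ci(values, t_crit):
    """Return (mean, low, high) for a t-based confidence interval of the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population std (ddof=0): an assumption
    margin = t_crit * std / math.sqrt(len(values))
    return mean, mean - margin, mean + margin


mean_f1, ci_low, ci_high = mean_ci(per_person_f1, T_CRIT_DF8)
print(f"mean F1 = {mean_f1:.3f}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

Using the sample standard deviation (`statistics.stdev`) instead would give a slightly wider interval.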

### Feature importance (XGBoost gain)

Top 5: `s_face` (10.27), `ear_right` (9.54), `head_deviation` (8.83), `ear_avg` (6.96), `perclos` (5.68)
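
A ranking like this can be produced from a gain dictionary, for example the one returned by `model.get_booster().get_score(importance_type="gain")` in xgboost. The helper below is a hypothetical sketch; it is fed only the five values listed above, since the remaining features' scores are not given here.

```python
def top_k_by_gain(gain_scores, k=5):
    """Return the k (feature, gain) pairs with the highest gain, largest first."""
    return sorted(gain_scores.items(), key=lambda item: item[1], reverse=True)[:k]


# The five gains reported above, deliberately out of order.
reported_gain = {
    "perclos": 5.68,
    "ear_right": 9.54,
    "s_face": 10.27,
    "ear_avg": 6.96,
    "head_deviation": 8.83,
}
ranking = top_k_by_gain(reported_gain, k=5)
```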

## ClearML integration

```bash
USE_CLEARML=1 python -m models.xgboost.train
```

Same enrichment as MLP: hyperparameters, per-round scalars, confusion matrices, ROC curves, model registration, dataset stats, and reproducibility artifacts.

## Sweeps

### ClearML HPO (remote)

```bash
USE_CLEARML=1 python -m models.xgboost.sweep
```

Launches a `HyperParameterOptimizer` controller on ClearML that clones the base training task and runs grid/random search across workers.

### Local Optuna sweep

```bash
python -m models.xgboost.sweep_local
```

Runs 40 trials with Optuna's TPE sampler, optimising LOPO F1. Search space: `n_estimators` [100, 1000], `max_depth` [3, 10], `learning_rate` [0.01, 0.3], `subsample` [0.6, 1.0], `colsample_bytree` [0.6, 1.0], `reg_alpha` and `reg_lambda` [0, 5].
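
The search space can be expressed with Optuna's `suggest_int` and `suggest_float` calls, roughly as sketched below. The helper names `suggest_params` and `run_study`, the `seed` argument, and the uniform (non-log) sampling of `learning_rate` are assumptions; `sweep_local.py` may differ. `score_fn` stands in for whatever function trains a model and returns LOPO F1.

```python
def suggest_params(trial):
    """Sample one hyperparameter set from the documented search space."""
    return {
        "n_estimators": trial.suggest_int("n_estimators", 100, 1000),
        "max_depth": trial.suggest_int("max_depth", 3, 10),
        "learning_rate": trial.suggest_float("learning_rate", 0.01, 0.3),
        "subsample": trial.suggest_float("subsample", 0.6, 1.0),
        "colsample_bytree": trial.suggest_float("colsample_bytree", 0.6, 1.0),
        "reg_alpha": trial.suggest_float("reg_alpha", 0.0, 5.0),
        "reg_lambda": trial.suggest_float("reg_lambda", 0.0, 5.0),
    }


def run_study(score_fn, n_trials=40, seed=0):
    """Maximise score_fn(params) with a TPE sampler and return the best params."""
    import optuna  # imported lazily so suggest_params works without optuna

    sampler = optuna.samplers.TPESampler(seed=seed)
    study = optuna.create_study(direction="maximize", sampler=sampler)
    study.optimize(lambda trial: score_fn(suggest_params(trial)), n_trials=n_trials)
    return study.best_params
```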

### Export local sweep results from ClearML

```bash
python -m models.xgboost.fetch_sweep_results
```

Writes `models/xgboost/sweep_results_all_40.csv` using the metrics logged by `sweep_local.py` (`val_loss`, `val_accuracy`, `val_f1`) plus each trial's hyperparameters.

Default export behavior is strict and deterministic:
- only tasks named like `XGBoost Sweep Trial #...`
- only the latest 40 matching tasks
- skips tasks with missing or zero `val_loss` / `val_accuracy` / `val_f1`
- ranks by `val_f1` (then `val_loss`, then `val_accuracy`)
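
The default rules above amount to a filter followed by a sort, which might look roughly like the sketch below. It works on plain dictionaries instead of ClearML task objects. The `created` field, the helper name `select_trials`, the reading of `limit=0` as "no limit", and the tie-break directions (lower `val_loss` first, higher `val_accuracy` first) are assumptions, not confirmed behaviour of `fetch_sweep_results`.

```python
METRICS = ("val_loss", "val_accuracy", "val_f1")


def select_trials(tasks, name_prefix="XGBoost Sweep Trial #", limit=40):
    """Filter and rank sweep trials following the documented defaults."""
    # 1. Keep only tasks whose name starts with the expected prefix.
    matching = [t for t in tasks if t["name"].startswith(name_prefix)]
    # 2. Keep the most recent `limit` tasks (0 is treated as "no limit").
    matching.sort(key=lambda t: t["created"], reverse=True)
    latest = matching[:limit] if limit else matching
    # 3. Drop tasks where any metric is missing or zero.
    complete = [t for t in latest if all(t.get(m) for m in METRICS)]
    # 4. Rank by val_f1 (high first), then val_loss (low first), then val_accuracy.
    return sorted(complete, key=lambda t: (-t["val_f1"], t["val_loss"], -t["val_accuracy"]))


# Toy tasks with invented metrics.
example_tasks = [
    {"name": "XGBoost Sweep Trial #1", "created": 1, "val_loss": 0.30, "val_accuracy": 0.90, "val_f1": 0.85},
    {"name": "XGBoost Sweep Trial #2", "created": 2, "val_loss": 0.25, "val_accuracy": 0.91, "val_f1": 0.88},
    {"name": "XGBoost Sweep Trial #3", "created": 3, "val_loss": 0.0, "val_accuracy": 0.0, "val_f1": 0.0},
    {"name": "MLP Sweep Trial #1", "created": 4, "val_loss": 0.20, "val_accuracy": 0.95, "val_f1": 0.95},
]
ranked = select_trials(example_tasks)
```

Here `ranked` contains trials #2 and #1, in that order: the MLP task fails the name filter and trial #3 is dropped for its zero metrics.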

Useful options:

```bash
python -m models.xgboost.fetch_sweep_results --limit 0 --keep-zero-metrics
python -m models.xgboost.fetch_sweep_results --name-prefix "XGBoost Sweep Trial #" --limit 40
python -m models.xgboost.fetch_sweep_results --compute-missing-val-accuracy --sort-by val_f1
```

## Outputs

| File | Location |
|------|----------|
| Best model | `checkpoints/xgboost_face_orientation_best.json` |
| Scaler | `checkpoints/scaler_xgboost.joblib` |
| Test predictions | `evaluation/logs/xgboost_test_predictions.csv` |
| Test metrics | `evaluation/logs/xgboost_test_metrics_summary.json` |
| Feature importance | `evaluation/logs/xgboost_feature_importance.json` |
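
For orientation, the saved model and scaler could be loaded for inference along the lines below. This is a sketch under assumptions: that features are scaled before prediction, that class 1 is the positive ("focused") class, and that the LOPO threshold of 0.280 is the one to apply. The helper names are hypothetical.

```python
MODEL_PATH = "checkpoints/xgboost_face_orientation_best.json"
SCALER_PATH = "checkpoints/scaler_xgboost.joblib"
LOPO_THRESHOLD = 0.280  # optimal threshold from the LOPO results


def load_artifacts(model_path=MODEL_PATH, scaler_path=SCALER_PATH):
    """Load the trained classifier and its feature scaler from disk."""
    # Imported lazily so the rest of the module works without these packages.
    import joblib
    from xgboost import XGBClassifier

    model = XGBClassifier()
    model.load_model(model_path)
    scaler = joblib.load(scaler_path)
    return model, scaler


def predict_focus(model, scaler, features, threshold=LOPO_THRESHOLD):
    """Return one boolean per row: True when P(class 1) reaches the threshold."""
    probabilities = model.predict_proba(scaler.transform(features))
    return [bool(row[1] >= threshold) for row in probabilities]
```

`features` is expected to hold the same 10 selected features, in the same order, as used during training.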