
Grouped vs pooled split benchmark

This benchmark compares the same XGBoost configuration under two evaluation protocols: a pooled random split and grouped LOPO cross-validation.

Config: `{'n_estimators': 600, 'max_depth': 8, 'learning_rate': 0.1489, 'subsample': 0.9625, 'colsample_bytree': 0.9013, 'reg_alpha': 1.1407, 'reg_lambda': 2.4181, 'eval_metric': 'logloss'}`

Quick mode: yes (n_estimators=200)

| Protocol | Accuracy | F1 (weighted) | ROC-AUC |
|---|---|---|---|
| Pooled random split (70/15/15) | 0.9510 | 0.9507 | 0.9869 |
| Grouped LOPO (9 folds) | 0.8303 | 0.8304 | 0.8801 |

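Use grouped LOPO as the primary generalisation metric when reporting model quality. A pooled random split can place samples from the same group in both the training and test sets, so its scores (0.9510 accuracy versus 0.8303 under LOPO) tend to overstate performance on unseen groups.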
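
The difference between the two protocols comes down to how sample indices are assigned to train and test sets. The following is a minimal, standard-library-only sketch of that splitting logic, not the pipeline's actual code: the function names (`pooled_split`, `leave_one_group_out`, `shared_groups`) and the toy data (9 groups of 20 samples) are assumptions made for illustration, and no model is trained.

```python
import random
from collections import defaultdict


def pooled_split(n_samples, percents=(70, 15, 15), seed=0):
    """Shuffle all sample indices together, then cut into train/val/test."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    # Integer arithmetic avoids float rounding surprises in the cut points.
    n_train = n_samples * percents[0] // 100
    n_val = n_samples * percents[1] // 100
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def leave_one_group_out(groups):
    """Yield (held_out_group, train_idx, test_idx), one fold per distinct group."""
    by_group = defaultdict(list)
    for i, g in enumerate(groups):
        by_group[g].append(i)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = [
            i
            for g, members in by_group.items()
            if g != held_out
            for i in members
        ]
        yield held_out, train_idx, test_idx


def shared_groups(groups, train_idx, test_idx):
    """Return the set of groups that appear on both sides of a split."""
    return {groups[i] for i in train_idx} & {groups[i] for i in test_idx}


# Hypothetical toy data: 9 groups with 20 samples each.
groups = [g for g in range(9) for _ in range(20)]

# Pooled protocol: groups are ignored, so the same group can end up
# in both the training and the test partition.
train, val, test = pooled_split(len(groups))
pooled_leak = shared_groups(groups, train, test)

# Grouped protocol: each fold holds out every sample of one group.
folds = list(leave_one_group_out(groups))
lopo_leaks = [shared_groups(groups, tr, te) for _, tr, te in folds]

print("pooled split shares groups between train and test:", bool(pooled_leak))
print("LOPO folds:", len(folds))
print("any LOPO fold shares groups:", any(lopo_leaks))
```

In a real run, a grouped splitter such as scikit-learn's `LeaveOneGroupOut` would typically take the place of the hand-written generator. The version here only makes the leakage check explicit.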