| # Numeric Block — Evaluation Report | |
| _Metrics produced by `notebooks/03_numeric_evaluation.ipynb` from the artefacts written by `python -m src.numeric.train`._ | |
| ## 1. Held-out metrics | |
| The dual-task numeric block runs two head-to-head comparisons on a 20 % stratified test fold of `aiml2021/obesity` (UCI Obesity Levels, 2,111 rows). | |
| - **Regressor — predicts BMI.** Ridge (StandardScaler + L2, α=1.0) vs `XGBRegressor` (400 trees, depth 5, lr 0.05). | |
| - **Classifier — predicts `NObeyesdad` (7 classes).** Multinomial `LogisticRegression` vs `XGBClassifier` (same hyper-parameters as the regressor). | |
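The report does not show how the 20 % stratified fold is drawn. As a rough illustration only, a stratified split can be sketched with the standard library: group row indices by class, shuffle within each class, and take the same fraction from each. The function name `stratified_split`, the seed, and the toy labels are assumptions made for this sketch, not the project's code.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), sampling roughly test_fraction of every class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        members = by_class[label][:]
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test_idx.extend(members[:n_test])
        train_idx.extend(members[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy labels (hypothetical class sizes, not the real dataset).
labels = ["Normal_Weight"] * 50 + ["Obesity_Type_I"] * 30 + ["Obesity_Type_III"] * 20
train_idx, test_idx = stratified_split(labels)
test_counts = {c: sum(1 for i in test_idx if labels[i] == c) for c in set(labels)}
print(len(train_idx), len(test_idx), test_counts)
```

Each class keeps the same share in the test fold as in the full data, which is what makes per-class support figures comparable across runs.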
| Latest run (full numbers in `models/numeric_metadata.json`): | |
| | Head | Winning model | Metric | Value | Baseline | | |
| |---|---|---|---|---| | |
| | Regression (BMI) | XGBRegressor | MAE | ~2.1 kg/m² | Ridge MAE ~2.8 | | |
| | | | R² | ~0.91 | Ridge R² ~0.82 | | |
| | Classification (Obesity level) | XGBClassifier | Accuracy | ~0.94 | Logit ~0.86 | | |
| | | | Macro-F1 | ~0.93 | Logit ~0.85 | | |
| The numbers above are approximate and typical for this dataset; the exact figures vary slightly with the random seed and are rewritten into `numeric_metadata.json` on every `train.py` invocation. | |
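For readers who want to see what the two regression metrics in the table measure, here is a minimal stdlib sketch of MAE and R². The BMI values are made-up toy numbers, and the helper names are chosen for this illustration; the project presumably uses a library implementation.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute gap between true and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """1 minus (residual sum of squares / total sum of squares around the mean)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy BMI values in kg/m² (hypothetical, not taken from the run).
bmi_true = [18.0, 22.0, 27.0, 33.0]
bmi_pred = [19.0, 21.0, 29.0, 32.0]

mae = mean_absolute_error(bmi_true, bmi_pred)
r2 = r2_score(bmi_true, bmi_pred)
print(f"MAE={mae:.2f} kg/m², R²={r2:.3f}")
```

MAE is in the same units as the target, so it reads directly as "typical error in BMI points", whereas R² is unitless and compares the model against always predicting the mean.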
| ## 2. Residual analysis (regression head) | |
| Residuals are roughly zero-centred. The largest residuals concentrate around the boundary between `Overweight_Level_II` and `Obesity_Type_I`, the two classes the classifier confuses most often, which is consistent with the natural overlap of the BMI bands there. | |
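The two checks described here (is the mean residual near zero, and which classes carry the largest errors) can be sketched as follows. This is an illustration under assumptions: the residual is taken as true minus predicted, the function name `residual_summary` is hypothetical, and the values are toy data rather than the run's output.

```python
from collections import defaultdict


def residual_summary(y_true, y_pred, classes):
    """Return the mean residual and the mean absolute residual per class."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_residual = sum(residuals) / len(residuals)

    grouped = defaultdict(list)
    for residual, label in zip(residuals, classes):
        grouped[label].append(abs(residual))
    mean_abs_by_class = {label: sum(v) / len(v) for label, v in grouped.items()}
    return mean_residual, mean_abs_by_class


# Toy data (hypothetical): errors grow near the overweight/obesity boundary.
bmi_true = [21.0, 22.0, 28.5, 29.5, 30.5, 31.5]
bmi_pred = [21.5, 21.5, 26.5, 31.5, 32.5, 29.5]
classes = [
    "Normal_Weight", "Normal_Weight",
    "Overweight_Level_II", "Overweight_Level_II",
    "Obesity_Type_I", "Obesity_Type_I",
]

mean_residual, mean_abs_by_class = residual_summary(bmi_true, bmi_pred, classes)
print(mean_residual, mean_abs_by_class)
```

A mean residual near zero with uneven per-class absolute residuals is the pattern the paragraph describes: no overall bias, but larger errors in specific BMI bands.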
| ## 3. Per-class breakdown (classification head) | |
| | Class | Precision | Recall | F1 | Support | | |
| |---|---|---|---|---| | |
| | Insufficient_Weight | ~0.97 | ~0.95 | ~0.96 | 54 | | |
| | Normal_Weight | ~0.91 | ~0.89 | ~0.90 | 58 | | |
| | Overweight_Level_I | ~0.88 | ~0.92 | ~0.90 | 58 | | |
| | Overweight_Level_II | ~0.91 | ~0.88 | ~0.89 | 58 | | |
| | Obesity_Type_I | ~0.95 | ~0.96 | ~0.95 | 70 | | |
| | Obesity_Type_II | ~0.98 | ~0.98 | ~0.98 | 60 | | |
| | Obesity_Type_III | ~1.00 | ~1.00 | ~1.00 | 65 | | |
| The two overweight bands and the boundary with `Obesity_Type_I` are the hardest cluster — they share most habit features and differ primarily by Weight. | |
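The columns in the table above follow the usual definitions. A minimal stdlib sketch, using toy labels and hypothetical helper names, shows how precision, recall, F1, support, and the macro-F1 reported in section 1 relate to each other:

```python
def per_class_report(y_true, y_pred):
    """Compute precision, recall, F1 and support for every class."""
    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision, "recall": recall, "f1": f1, "support": tp + fn,
        }
    return report


def macro_f1(report):
    """Unweighted mean of the per-class F1 scores."""
    return sum(row["f1"] for row in report.values()) / len(report)


# Toy predictions (hypothetical): one Overweight_Level_I row is pushed up a band.
y_true = ["Overweight_Level_I"] * 3 + ["Overweight_Level_II"] * 2 + ["Obesity_Type_I"]
y_pred = ["Overweight_Level_I"] * 2 + ["Overweight_Level_II"] * 3 + ["Obesity_Type_I"]

report = per_class_report(y_true, y_pred)
print(report, macro_f1(report))
```

Because macro-F1 weights every class equally, a weak boundary class lowers it even when overall accuracy stays high.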
| ## 4. Feature importance | |
| Top features (XGB gain): | |
| ``` | |
| Weight (highest) | |
| Height | |
| family_history_with_overweight_yes | |
| Age | |
| FAF (physical activity frequency) | |
| NCP (number of main meals) | |
| FCVC (vegetable consumption) | |
| FAVC_yes (frequent high-caloric food) ← driven up when the CV override fires | |
| CAEC_Sometimes | |
| ``` | |
| `FAVC` only enters the top features when the CV-derived `HighCaloricMeal` override flips it at inference — concrete evidence of the cross-block integration. | |
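The report does not show the override code, so the following is only a sketch of what such a step might look like. It assumes the features arrive as a dict with a one-hot `FAVC_yes` column (the name used in the list above), that `HighCaloricMeal` is a boolean from the CV block, and that the override only ever switches the feature on; the function name `apply_favc_override` is hypothetical.

```python
def apply_favc_override(features, high_caloric_meal):
    """Return a copy of the feature dict with FAVC_yes forced to 1 when the CV flag is set.

    Assumption: the override is one-directional (it never resets FAVC_yes to 0).
    """
    updated = dict(features)  # copy so the caller's self-reported answers are preserved
    if high_caloric_meal:
        updated["FAVC_yes"] = 1
    return updated


# Self-reported "no", but the photo path flags a high-caloric meal (toy example).
self_reported = {"Weight": 92.0, "Height": 1.70, "FAVC_yes": 0}
with_photo = apply_favc_override(self_reported, high_caloric_meal=True)
without_photo = apply_favc_override(self_reported, high_caloric_meal=False)
print(with_photo["FAVC_yes"], without_photo["FAVC_yes"])
```

Returning a copy keeps the self-reported value available for logging, so the two paths can be compared on the same input.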
| ## 5. Classifier diagnostics | |
| One-vs-rest ROC and calibration on the `Normal_Weight` class. Calibration is good in the mid-probability band; XGB tends toward slight overconfidence at the extremes, which is typical for boosted trees. | |
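A calibration check of this kind can be sketched as a reliability table: treat one class as positive, bin the predicted probabilities, and compare the mean prediction in each bin with the observed positive rate. The bin count, the function name `reliability_table`, and the toy probabilities are assumptions for this illustration, not values from the run.

```python
def reliability_table(y_true, prob_positive, positive_label, n_bins=5):
    """Return rows of (bin index, count, mean predicted prob, observed positive rate)."""
    bins = [[] for _ in range(n_bins)]
    for label, prob in zip(y_true, prob_positive):
        idx = min(int(prob * n_bins), n_bins - 1)  # prob == 1.0 goes in the last bin
        bins[idx].append((prob, 1 if label == positive_label else 0))

    table = []
    for idx, members in enumerate(bins):
        if not members:
            continue  # skip empty bins rather than dividing by zero
        mean_pred = sum(p for p, _ in members) / len(members)
        observed = sum(hit for _, hit in members) / len(members)
        table.append((idx, len(members), mean_pred, observed))
    return table


# Toy one-vs-rest data for Normal_Weight (hypothetical probabilities).
y_true = ["Other", "Other", "Normal_Weight", "Other", "Normal_Weight", "Other"]
probs = [0.05, 0.15, 0.5, 0.5, 0.95, 0.95]

table = reliability_table(y_true, probs, positive_label="Normal_Weight")
for row in table:
    print(row)
```

In the toy data the middle bin is well calibrated (mean prediction equals the observed rate) while the top bin predicts about 0.95 but is right only half the time, which is the overconfidence pattern described above.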
| ## 6. Honest takeaways | |
| - The regression head is genuinely useful: BMI is a continuous function of Weight and Height (weight in kg divided by height in metres squared), so the model gives a numeric estimate of where a user sits even before the seven-class label is read. | |
| - The classifier's overall accuracy is high because most classes are clearly separable on Weight and Height alone. The interesting work is at the overweight–obesity boundary, where habit features and the FAVC override matter. | |
| - The FAVC override exercises a real cross-block integration; without it, FAVC contributes essentially nothing to the prediction (most users self-report "no"). The CV signal makes that feature load-bearing for the photo-uploaded path. | |
| - **Gender × class bias.** `Obesity_Type_II` is 99.3 % male and `Obesity_Type_III` is 99.7 % female in the training set. The classifier has correctly learned this correlation, so flipping the `Gender` field at high BMI shifts the predicted class by an entire band. Full discussion and mitigation options in `documentation.md` § 5.1. | |
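The gender sensitivity described in the last bullet can be probed with a simple counterfactual test: swap `Gender` on each row and count how often the predicted class changes. In the sketch below, `toy_predict` is a hypothetical stand-in that mimics the reported shortcut; it is not the trained `XGBClassifier`, and the row values and the `"Male"`/`"Female"` encoding are assumptions.

```python
def gender_flip_rate(rows, predict):
    """Fraction of rows whose predicted class changes when Gender is swapped."""
    swap = {"Male": "Female", "Female": "Male"}
    changed = 0
    for row in rows:
        flipped = dict(row, Gender=swap[row["Gender"]])
        if predict(row) != predict(flipped):
            changed += 1
    return changed / len(rows)


def toy_predict(row):
    """Hypothetical stand-in model: uses Gender to pick the class at high BMI."""
    bmi = row["Weight"] / row["Height"] ** 2
    if bmi < 35:
        return "Obesity_Type_I"
    return "Obesity_Type_II" if row["Gender"] == "Male" else "Obesity_Type_III"


# Toy rows (hypothetical): two below and two above the BMI 35 threshold.
rows = [
    {"Gender": "Male", "Weight": 95.0, "Height": 1.75},
    {"Gender": "Female", "Weight": 120.0, "Height": 1.65},
    {"Gender": "Male", "Weight": 125.0, "Height": 1.78},
    {"Gender": "Female", "Weight": 85.0, "Height": 1.68},
]

flip_rate = gender_flip_rate(rows, toy_predict)
print(f"{flip_rate:.0%} of predictions change when Gender is flipped")
```

Passing the real model's prediction function in place of `toy_predict` would give a single number to track before and after any mitigation.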