---
license: apache-2.0
language: en
tags:
- nutrition
- healthcare
- elderly-care
- regression
- xgboost
- uganda
- africa
datasets:
- uganda-elderly-nutrition
- Shakiran/UgandanNutritionMealPlanning
- dongx1997/NutriBench
metrics:
- r2
- mae
- rmse
library_name: xgboost
pipeline_tag: tabular-regression
---

# XGBoost Model for Elderly Nutrition Planning in Uganda

## Model Description

This XGBoost regression model predicts daily caloric needs for elderly individuals (aged 60+) in Uganda based on nutritional content, health conditions, regional factors, and demographic information. The model is designed to support nutrition planning, meal preparation, and healthcare decision-making for elderly care in Uganda.

### Model Details

- **Model Type:** XGBoost Regressor (Gradient Boosting)
- **Task:** Tabular Regression
- **Version:** v1.0_optimized
- **Training Date:** November 3, 2025
- **Framework:** XGBoost 2.0+
- **Language:** Python
- **License:** Apache 2.0

### Developed By

- **Organization:** Graph-Enhanced LLMs for Locally-Sourced Elderly Nutrition Planning Project
- **Project Focus:** AI-driven nutrition planning for elderly populations in Uganda
- **Contact:** [shakirannannyombi@gmail.com]

---

## Intended Use

### Primary Use Cases

1. **Nutrition Planning:** Calculate appropriate caloric intake for elderly individuals based on their health profile
2. **Meal Planning:** Support caregivers and healthcare providers in designing meal plans
3. **Healthcare Decision Support:** Assist medical professionals in nutritional assessments
4. **Research:** Enable studies on nutrition needs for elderly populations in Uganda
5. **Policy Development:** Inform nutrition policies for elderly care facilities

### Intended Users

- Healthcare providers and nutritionists
- Elderly care facilities and nursing homes
- Family caregivers
- Public health researchers
- NGOs working in elderly nutrition

### Out-of-Scope Use

- ❌ Not for children or adults under 60 years
- ❌ Not for acute medical conditions requiring immediate intervention
- ❌ Not a replacement for professional medical advice
- ❌ Not validated for use outside Uganda without regional calibration

---

## Performance

### Overall Metrics

| Metric | Training Set | Test Set |
|--------|-------------|----------|
| **R² Score** | 0.9309 | **0.6710** |
| **MAE (kcal/day)** | 1.29 | **2.84** |
| **RMSE (kcal/day)** | 1.65 | **3.60** |
| **Training Time** | 25.0 seconds | - |

### Model Ranking

Ranked among the five models evaluated (HistGradient Boosting, XGBoost, LightGBM, MLP, GNN):

- **Overall Rank:** 🥇 #1 out of 5
- **R² Rank:** 🥇 #1 (0.6710)
- **MAE Rank:** 🥇 #1 (2.84 kcal/day)
- **RMSE Rank:** 🥇 #1 (3.60 kcal/day)

### Baseline Comparison

| Metric | Baseline Model | This Model | Improvement |
|--------|---------------|------------|-------------|
| Test R² | 0.6311 | 0.6710 | **+6.3%** |
| Test MAE | 2.998 kcal/day | 2.842 kcal/day | **-5.2%** |
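
The improvement column is a relative change against the baseline. A minimal check, using only the values from the table above (the helper name `relative_change` is illustrative):

```python
def relative_change(baseline: float, new: float) -> float:
    """Relative change of `new` versus `baseline`, in percent."""
    return (new - baseline) / baseline * 100.0


# Values copied from the baseline comparison table
r2_change = relative_change(0.6311, 0.6710)
mae_change = relative_change(2.998, 2.842)

print(f"Test R2 change:  {r2_change:+.1f}%")
print(f"Test MAE change: {mae_change:+.1f}%")
```

A positive change is better for R², while a negative change is better for MAE.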

### Performance Characteristics

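- **Moderate generalization:** Test R² = 0.67, so the model explains about two-thirds of the variance in held-out data
- **Low prediction error:** Test MAE of 2.84 kcal/day (training MAE: 1.29 kcal/day); clinical validation is still planned
- **Moderate overfitting:** Train-test R² gap of 0.26 (0.9309 vs. 0.6710); stronger regularization may narrow it
- **Consistent predictions:** Test RMSE/MAE ratio of about 1.27 is close to the roughly 1.25 expected for normally distributed errors, suggesting few large outliers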
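
The figures behind these characteristics can be re-derived from the reported metrics alone. The sketch below is a sanity check, not part of the training pipeline. It uses the identity R² = 1 − MSE / Var(y) to back out the spread of the test targets that the reported RMSE and R² imply.

```python
import math

# Reported metrics from the Overall Metrics table
train_r2, test_r2 = 0.9309, 0.6710
test_mae, test_rmse = 2.84, 3.60

# Train-test gap used as the overfitting indicator
r2_gap = train_r2 - test_r2

# RMSE/MAE ratio; for normally distributed errors it is sqrt(pi / 2)
rmse_mae_ratio = test_rmse / test_mae
gaussian_ratio = math.sqrt(math.pi / 2)

# R2 = 1 - MSE / Var(y)  =>  SD(y) = RMSE / sqrt(1 - R2)
implied_target_sd = test_rmse / math.sqrt(1 - test_r2)

print(f"R2 gap:            {r2_gap:.2f}")
print(f"RMSE / MAE:        {rmse_mae_ratio:.2f} (Gaussian: {gaussian_ratio:.2f})")
print(f"Implied target SD: {implied_target_sd:.2f} kcal/day")
```

The implied standard deviation of the test targets is about 6.3 kcal/day. That is narrow compared with the 1,400 - 2,500 kcal/day range listed for the target variable, so it may be worth confirming the target's scale against the dataset before interpreting the MAE.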

---

## Training Data

### Dataset Overview

- **Dataset Name:** Uganda Elderly Nutrition Dataset (Enriched)
- **Total Samples:** 1,000
- **Training Samples:** 700 (70%)
- **Test Samples:** 300 (30%)
- **Split Method:** Random stratified split (seed=42)
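
A split of this shape could be reproduced roughly as follows. This is a sketch on placeholder data: the model card does not say which variable the split was stratified on, so the `strata` array (eight evenly distributed condition codes) is an assumption made for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n_samples, n_features = 1000, 18

X = rng.normal(size=(n_samples, n_features))  # placeholder feature matrix
y = rng.normal(size=n_samples)                # placeholder target
strata = np.arange(n_samples) % 8             # hypothetical: 8 condition codes

# 70-30 split with the seed reported in the model card
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, random_state=42, stratify=strata
)

print(X_train.shape, X_test.shape)
```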

### Features (18 total)

#### Nutritional Content (12 features)
- `Energy_kcal_per_serving` - Energy content per serving
- `Protein_g_per_serving` - Protein content (grams)
- `Fat_g_per_serving` - Fat content (grams)
- `Carbohydrates_g_per_serving` - Carbohydrate content (grams)
- `Fiber_g_per_serving` - Dietary fiber (grams)
- `Calcium_mg_per_serving` - Calcium content (milligrams)
- `Iron_mg_per_serving` - Iron content (milligrams)
- `Zinc_mg_per_serving` - Zinc content (milligrams)
- `VitaminA_µg_per_serving` - Vitamin A content (micrograms)
- `VitaminC_mg_per_serving` - Vitamin C content (milligrams)
- `Potassium_mg_per_serving` - Potassium content (milligrams)
- `Magnesium_mg_per_serving` - Magnesium content (milligrams)

#### Categorical Features (4 features)
- `region_encoded` - Geographic region in Uganda (4 regions)
- `condition_encoded` - Health condition (8 conditions)
- `age_group_encoded` - Age group (3 groups: 60-70, 70-80, 80+)
- `season_encoded` - Seasonal availability

#### Other Features (2 features)
- `portion_size_g` - Portion size in grams
- `estimated_cost_ugx` - Estimated cost in Ugandan Shillings

### Geographic Coverage

**4 Regions of Uganda:**
1. Central Uganda (Buganda)
2. Western Uganda (Ankole, Tooro, Kigezi, Bunyoro)
3. Eastern Uganda (Busoga, Bugisu, Teso)
4. Northern Uganda (Acholi, Lango, Karamoja, West Nile)

### Health Conditions Covered

**8 Common Elderly Conditions:**
1. Hypertension
2. Undernutrition
3. Anemia
4. Frailty
5. Digestive issues
6. Arthritis
7. Osteoporosis
8. Diabetes

### Age Groups

- **60-70 years:** Early elderly
- **70-80 years:** Mid elderly
- **80+ years:** Advanced elderly
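
The listed ranges share their boundary ages (70 and 80), so any mapping from age to group has to pick a side. The helper below is a minimal sketch that assumes boundary ages fall into the older group. That convention, and the function name `age_group`, are assumptions and are not taken from the training code.

```python
def age_group(age: int) -> str:
    """Map an age in years to one of the three age-group labels.

    Assumption: boundary ages (70, 80) belong to the older group.
    """
    if age < 60:
        raise ValueError("The model is intended for ages 60 and above")
    if age < 70:
        return "60-70"
    if age < 80:
        return "70-80"
    return "80+"


print(age_group(65), age_group(70), age_group(83))
```

The integer code for each label should still be taken from the saved label encoders rather than assumed.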

### Target Variable

- **Name:** Daily Caloric Needs
- **Unit:** kcal/day
- **Range:** Typically 1,400 - 2,500 kcal/day
- **Distribution:** Approximately normal

---

## Training Details

### Hyperparameters (Optimized)

```python
{
    'n_estimators': 200,
    'max_depth': 4,
    'learning_rate': 0.05,
    'min_child_weight': 5,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'gamma': 0,
    'reg_alpha': 0,
    'reg_lambda': 1.5
}
```

### Training Configuration

- **Objective:** Regression (minimize squared error)
- **Evaluation Metric:** R² Score, MAE, RMSE
- **Validation Strategy:** 70-30 train-test split
- **Early Stopping:** Not used (200 trees)
- **Feature Scaling:** StandardScaler applied to numeric features
- **Encoding:** Label encoding for categorical features
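
The two preprocessing steps could look roughly like the sketch below. The raw region labels and the toy values are hypothetical. `LabelEncoder` assigns codes in sorted label order, so the integer for a category depends on the exact strings seen during training and is best read from the saved encoders rather than hard-coded.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Toy rows; the label strings are assumptions for illustration
raw = pd.DataFrame({
    "region": ["Central", "Western", "Eastern", "Northern"],
    "Energy_kcal_per_serving": [350.0, 420.0, 280.0, 310.0],
    "portion_size_g": [250.0, 300.0, 200.0, 220.0],
})

# Label-encode the categorical column (codes follow sorted label order)
region_encoder = LabelEncoder()
raw["region_encoded"] = region_encoder.fit_transform(raw["region"])

# Standardize numeric columns to zero mean and unit variance
numeric_cols = ["Energy_kcal_per_serving", "portion_size_g"]
scaler = StandardScaler()
raw[numeric_cols] = scaler.fit_transform(raw[numeric_cols])

# Recover the label-to-code mapping from the fitted encoder
mapping = {str(label): code for code, label in enumerate(region_encoder.classes_)}
print(mapping)
```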

### Training Environment

- **Hardware:** CPU-based training
- **Training Time:** 25 seconds
- **Memory Usage:** <1 GB
- **Reproducibility:** Random seed = 42

---

## How to Use

### Installation

```bash
pip install xgboost==2.0.0 pandas numpy scikit-learn
```

### Loading the Model

```python
import pickle
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler

# Load model files
with open('xgboost_nutrition_model_20251103.pkl', 'rb') as f:
    model = pickle.load(f)

with open('xgboost_scaler_20251103.pkl', 'rb') as f:
    scaler = pickle.load(f)

with open('xgboost_label_encoders_20251103.pkl', 'rb') as f:
    label_encoders = pickle.load(f)

with open('xgboost_feature_names_20251103.pkl', 'rb') as f:
    feature_names = pickle.load(f)
```

### Making Predictions

```python
# Example input data
input_data = {
    'Energy_kcal_per_serving': 350,
    'Protein_g_per_serving': 15,
    'Fat_g_per_serving': 10,
    'Carbohydrates_g_per_serving': 45,
    'Fiber_g_per_serving': 5,
    'Calcium_mg_per_serving': 200,
    'Iron_mg_per_serving': 3,
    'Zinc_mg_per_serving': 2,
    'VitaminA_µg_per_serving': 500,
    'VitaminC_mg_per_serving': 20,
    'Potassium_mg_per_serving': 400,
    'Magnesium_mg_per_serving': 50,
    'region_encoded': 0,  # Central Uganda
    'condition_encoded': 0,  # Hypertension
    'age_group_encoded': 1,  # 70-80
    'season_encoded': 0,
    'portion_size_g': 250,
    'estimated_cost_ugx': 5000
}

# Convert to DataFrame
df = pd.DataFrame([input_data])

# Ensure correct feature order
df = df[feature_names]

# Scale features only if the model was trained on scaled inputs.
# Check whether the scaler was fit on all features or just the numeric ones,
# then transform the matching columns before calling predict:
# df_scaled = scaler.transform(df)

# Make prediction
predicted_calories = model.predict(df)
print(f"Predicted daily caloric needs: {predicted_calories[0]:.2f} kcal/day")
```
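
Because missing or misnamed features degrade predictions, it can help to validate an input record before building the DataFrame. The helper below is a minimal sketch: `check_features` is a hypothetical name, and the feature list is copied from the Features section of this card. In practice the list loaded from `xgboost_feature_names_20251103.pkl` should take precedence.

```python
EXPECTED_FEATURES = [
    "Energy_kcal_per_serving", "Protein_g_per_serving", "Fat_g_per_serving",
    "Carbohydrates_g_per_serving", "Fiber_g_per_serving",
    "Calcium_mg_per_serving", "Iron_mg_per_serving", "Zinc_mg_per_serving",
    "VitaminA_µg_per_serving", "VitaminC_mg_per_serving",
    "Potassium_mg_per_serving", "Magnesium_mg_per_serving",
    "region_encoded", "condition_encoded", "age_group_encoded",
    "season_encoded", "portion_size_g", "estimated_cost_ugx",
]


def check_features(record: dict, expected: list[str]) -> None:
    """Raise ValueError if the record lacks expected features or has extras."""
    missing = [name for name in expected if name not in record]
    extra = [name for name in record if name not in expected]
    if missing or extra:
        raise ValueError(
            f"missing features: {missing}; unexpected features: {extra}"
        )


# Example: a record with every feature present passes silently
record = {name: 0 for name in EXPECTED_FEATURES}
check_features(record, EXPECTED_FEATURES)
```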

### Using with the API

```python
import requests

url = "http://your-api-endpoint/predict"
data = {
    "data": {
        "Energy_kcal_per_serving": 350,
        "Protein_g_per_serving": 15,
        # ... other features
    }
}

response = requests.post(url, json=data, timeout=10)
response.raise_for_status()  # fail early on HTTP errors
result = response.json()
print(f"Predicted calories: {result['prediction']['caloric_needs']:.2f} kcal/day")
```

---

## Limitations and Biases

### Known Limitations

1. **Sample Size:**
   - Only 1,000 samples in total (700 used for training) may not capture all population variability
   - Use caution when making predictions for rare scenarios

2. **Geographic Scope:**
   - Trained specifically on Ugandan population data
   - May not generalize well to other African countries or regions

3. **Moderate Overfitting:**
   - Train-test R² gap of 0.26 indicates some overfitting
   - Predictions should be validated against clinical guidelines

4. **Feature Dependencies:**
   - Requires accurate nutritional content data
   - Missing or incorrect features will degrade performance

5. **Temporal Validity:**
   - Trained on 2025 data
   - May need retraining as dietary patterns evolve

### Potential Biases

1. **Regional Representation:**
   - May have unequal representation across regions
   - Ensure validation across all 4 regions

2. **Health Condition Bias:**
   - Some conditions may be over/under-represented
   - Validate for less common conditions

3. **Socioeconomic Factors:**
   - Cost estimates may not reflect all economic situations
   - Consider local affordability in deployment

### Uncertainty Quantification

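- **Prediction Uncertainty:** 2.84 kcal/day average absolute error (test MAE)
- **Confidence Intervals:** Approximate 95% prediction interval of ±7.1 kcal/day (1.96 × test RMSE of 3.60), assuming roughly normal errors
- **Recommended Buffer:** Add 10% safety margin for meal planning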
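
As a rough sketch of how these figures could be applied to a single prediction, the helpers below build an approximate 95% interval from the test RMSE (1.96 × 3.60, a normal-theory approximation that assumes roughly normal, constant-variance errors) and apply the 10% planning buffer. The function names and the example prediction of 2,000 kcal/day are hypothetical.

```python
def prediction_interval(pred: float, rmse: float = 3.60, z: float = 1.96):
    """Approximate interval pred +/- z * rmse, assuming normal errors."""
    half_width = z * rmse
    return pred - half_width, pred + half_width


def with_buffer(pred: float, margin: float = 0.10) -> float:
    """Apply the recommended safety margin for meal planning."""
    return pred * (1 + margin)


predicted = 2000.0  # hypothetical model output in kcal/day
low, high = prediction_interval(predicted)
planned = with_buffer(predicted)

print(f"Approx. 95% interval: {low:.1f} to {high:.1f} kcal/day")
print(f"Planning target with buffer: {planned:.0f} kcal/day")
```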

---

## Ethical Considerations

### Fairness and Equity

- Model covers all major regions of Uganda
- Includes diverse health conditions
- Considers affordability factors
- ⚠️ Ensure equal access to technology for model deployment

### Privacy

- Model trained on aggregated data (no personal identifiers)
- Predictions do not require storage of sensitive health information
- ⚠️ Implement proper data handling in deployment

### Safety

- ⚠️ **Critical:** Model outputs should be reviewed by qualified healthcare professionals
- ⚠️ Not suitable for emergency nutritional interventions
- ⚠️ Should complement, not replace, clinical judgment

### Transparency

- Open methodology and evaluation metrics
- Feature importance available for interpretation
- Model architecture and hyperparameters disclosed

---

## Model Interpretability

### Feature Importance (Top 10)

Based on XGBoost's built-in feature importance:

1. **Energy_kcal_per_serving** - Highest importance
2. **Protein_g_per_serving** - High importance
3. **Carbohydrates_g_per_serving** - High importance
4. **age_group_encoded** - Moderate importance
5. **condition_encoded** - Moderate importance
6. **portion_size_g** - Moderate importance
7. **Calcium_mg_per_serving** - Moderate importance
8. **Fat_g_per_serving** - Low-moderate importance
9. **region_encoded** - Low-moderate importance
10. **Fiber_g_per_serving** - Low importance

*Full feature importance analysis available in model artifacts*

### Explainability

- **SHAP Values:** Can be computed for individual predictions
- **Partial Dependence Plots:** Available for key features
- **Decision Rules:** XGBoost trees can be exported for inspection

---

## Comparison with Other Models

| Model | Test R² | Test MAE | Training Time | Rank |
|-------|---------|----------|---------------|------|
| **XGBoost (This Model)** | **0.6710** | **2.84** | 25.0s | 🥇 #1 |
| LightGBM | 0.6649 | 2.88 | 0.93s | 🥈 #2 |
| HistGradient Boosting | 0.5116 | 3.42 | 0.14s | 🥉 #3 |
| GNN v2 | 0.5100 | 3.42 | 5.2s | #4 |
| MLP | -0.3035 | 5.66 | 4.5s | #5 |

**Recommendation:** Use XGBoost for best accuracy; consider LightGBM when training time matters, since it trained in 0.93 s versus 25.0 s with nearly the same test R² (0.6649 vs. 0.6710).

---

## Updates and Maintenance

### Version History

- **v1.0_optimized (2025-11-03):** Initial release
  - Trained on 700 of 1,000 samples (70-30 train-test split)
  - Hyperparameter optimization completed
  - Test R² = 0.6710

### Planned Improvements

1. **Data Collection:**
   - Expand dataset to 5,000+ samples
   - Include more seasonal variations
   - Add rural vs. urban distinctions

2. **Feature Engineering:**
   - Add BMI calculations
   - Include activity level metrics
   - Incorporate cultural food preferences

3. **Model Enhancements:**
   - Ensemble with LightGBM for improved accuracy
   - Implement SHAP-based explainability
   - Add prediction uncertainty intervals

4. **Validation:**
   - Clinical validation studies
   - Cross-regional performance assessment
   - Temporal validation (seasonal changes)

### Retraining Schedule

- **Recommended:** Every 6-12 months
- **Triggers:** New data availability, significant dietary changes, performance degradation
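
The performance-degradation trigger could be checked with a small monitoring helper like the one below. It is only a sketch: the function names and the tolerance factor of 1.5 are hypothetical choices, and the reference value is the test MAE reported in this card.

```python
def mean_absolute_error(y_true, y_pred) -> float:
    """Average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def needs_retraining(y_true, y_pred, reference_mae: float = 2.84,
                     tolerance: float = 1.5) -> bool:
    """Flag retraining when MAE on new data exceeds tolerance x reference MAE.

    The tolerance of 1.5 is a hypothetical threshold, not a project setting.
    """
    return mean_absolute_error(y_true, y_pred) > reference_mae * tolerance


# Example with made-up observations and predictions (kcal/day)
observed = [2000.0, 2100.0, 1900.0]
predicted = [2010.0, 2090.0, 1912.0]
print(needs_retraining(observed, predicted))
```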

---

## Citation

If you use this model in your research or application, please cite:

```bibtex
@misc{uganda_elderly_nutrition_xgboost_2025,
  title={XGBoost Model for Elderly Nutrition Planning in Uganda},
  author={[Your Name/Organization]},
  year={2025},
  month={November},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/[your-username]/xgboost-elderly-nutrition-uganda}
}
```

---

## Additional Resources

### Related Links

- **Project Repository:** [https://github.com/Shakiran-Nannyombi/Graph-Enhanced-LLMs-for-Locally-Sourced-Elderly-Nutrition-Planning-in-Uganda.git]
- **API Documentation:** [API Docs Link]
- **Research Paper:** [Paper Link if available]
- **Dataset:** [Shakiran/UgandanNutritionMealPlanning]

### Model Artifacts

- `xgboost_nutrition_model_20251103.pkl` - Main XGBoost model
- `xgboost_scaler_20251103.pkl` - Feature scaler (StandardScaler)
- `xgboost_label_encoders_20251103.pkl` - Categorical encoders
- `xgboost_feature_names_20251103.pkl` - Feature name list
- `xgboost_model_metadata_20251103.json` - Complete metadata

### Support

For questions, issues, or contributions:
- **Issues:** [https://github.com/Shakiran-Nannyombi/Graph-Enhanced-LLMs-for-Locally-Sourced-Elderly-Nutrition-Planning-in-Uganda.git]
- **Email:** [devkiran256@gmail.com]
---

## License

This model is released under the **Apache License 2.0**.

- Commercial use allowed
- Modification allowed
- Distribution allowed
- Patent use allowed
- ⚠️ Must include license and copyright notice
- ⚠️ Must state significant changes

**Disclaimer:** This model is provided "as is" without warranty. Users are responsible for validating the model's suitability for their specific use case and ensuring compliance with local healthcare regulations.

---

## Acknowledgments

### Data Sources and References

This model was developed using knowledge and data extracted from the following authoritative sources:

1. **Handbook_Eldernutr_FINAL.pdf**
   - Comprehensive handbook on elderly nutrition
   - Primary reference for nutritional requirements and guidelines

2. **WHO ICOPE Guidelines (icope.pdf)**
   - World Health Organization Integrated Care for Older People (ICOPE)
   - Framework for elderly healthcare and nutrition assessment

3. **Nutritional_Requirements_of_Older_People.pdf**
   - Detailed nutritional requirements for elderly populations
   - Evidence-based dietary recommendations

4. **TipSheet_21_HealthyEatingForOlderAdults.pdf**
   - Practical tips for healthy eating in older adults
   - Community-oriented nutrition guidance

5. **MSD Manual Professional Edition**
   - "Drug Categories of Concern in Older Adults - Geriatrics"
   - Clinical reference for medication-nutrition interactions

6. **MSD Manual Consumer Version**
   - "Aging and Medications - Older People's Health Issues"
   - Patient-friendly information on aging and health

7. **Uganda Nutrition Data (download.pdf)**
   - Uganda-specific nutritional data and food composition
   - Local context and dietary patterns

8. **Street Food Nutritional Analysis**
   - "Average energy and nutrient contents of typical street food dishes in Uganda (Kampala)"
   - Local food nutritional profiles for urban Uganda

### Institutional Support

- **Uganda Ministry of Health** - Nutrition guidelines and policy frameworks
- **World Health Organization (WHO)** - ICOPE framework and elderly care guidelines
- **MSD Manuals** - Clinical and consumer health information

### Technical Contributions

- **Open-source community:** XGBoost, scikit-learn, pandas, Python ecosystem
- **Healthcare professionals** who contributed domain expertise
- **Data scientists and researchers** in elderly nutrition and machine learning

### Regional Knowledge

- Local nutrition experts from Uganda's 4 major regions:
  - Central Uganda (Buganda)
  - Western Uganda (Ankole, Tooro, Kigezi, Bunyoro)
  - Eastern Uganda (Busoga, Bugisu, Teso)
  - Northern Uganda (Acholi, Lango, Karamoja, West Nile)

### Special Thanks

- Community health workers providing ground-level insights
- Elderly care facilities participating in data validation
- Nutrition researchers focusing on African elderly populations
- Open data initiatives promoting nutrition research in Uganda

---

**Last Updated:** November 4, 2025
**Model Version:** v1.0_optimized
**Status:** Production Ready