---
license: mit
tags:
- tabular-regression
- demand-forecasting
- retail
- xgboost
- sklearn
- gcc
- agentic-commerce
- ocg-dubai
- gulf-retail
- e-commerce
library_name: sklearn
pipeline_tag: tabular-regression
---
# GCC Retail Demand Forecasting Model v2
> Built by [OCG Dubai](https://ocg-dubai.ae): Agentic Commerce APIs for the GCC
An XGBoost regressor that predicts retail demand across 6 GCC countries and 12 product categories. It uses the Gregorian calendar with regional event flags and has no Hijri calendar dependency. Part of [OCG Dubai's](https://ocg-dubai.ae) Agentic Commerce APIs.
## Model Description
This model predicts a `demand_index` (0-100) for retail products in six GCC countries (UAE, KSA, Qatar, Kuwait, Bahrain, Oman) across 12 product categories modeled on GCC e-commerce revenue shares. It captures seasonal patterns including Ramadan, Eid, shopping festivals (DSF, Riyadh Season, White Friday), and country-specific events.
### Features Used
| Feature | Type | Description |
|---------|------|-------------|
| month | Integer (1-12) | Gregorian month |
| day_of_week | Integer (0-6) | Day of week (Monday=0) |
| is_weekend | Binary (0/1) | Friday/Saturday (GCC weekend) |
| country_encoded | Integer | Label-encoded country |
| category_encoded | Integer | Label-encoded product category |
| temperature | Float | Temperature in Celsius |
| is_ramadan | Binary (0/1) | Whether it is Ramadan |
| ramadan_week | Integer (0-4) | Week of Ramadan (0 if not Ramadan) |
| is_eid_fitr | Binary (0/1) | Eid al-Fitr period |
| is_eid_adha | Binary (0/1) | Eid al-Adha period |
| is_shopping_festival | Binary (0/1) | Dubai Shopping Festival, Riyadh Season, Shop Qatar, etc. |
| is_white_friday | Binary (0/1) | White Friday / Black Friday sales |
| is_national_day | Binary (0/1) | Country national day events |
| is_back_to_school | Binary (0/1) | Back-to-school season |
| year | Integer | Year (2018-2025) |
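As a minimal sketch of how the calendar-derived columns could be computed, the helper below derives `month`, `day_of_week`, `is_weekend`, and `year` from a Python `date` and assembles one row in the feature order used in the Usage section. The name `build_feature_row` and its arguments are hypothetical and not part of the released files. Event flags and temperature have to be supplied by the caller, since the rules used to generate them are not documented here.

```python
from datetime import date

# Column order as listed in the feature table and the Usage example.
FEATURE_ORDER = [
    "month", "day_of_week", "is_weekend", "country_encoded",
    "category_encoded", "temperature", "is_ramadan", "ramadan_week",
    "is_eid_fitr", "is_eid_adha", "is_shopping_festival",
    "is_white_friday", "is_national_day", "is_back_to_school", "year",
]


def build_feature_row(day, country_encoded, category_encoded, temperature, **flags):
    """Hypothetical helper: return one feature row as a list of 15 values."""
    dow = day.weekday()  # Monday=0 ... Sunday=6
    row = {name: 0 for name in FEATURE_ORDER}  # event flags default to 0
    row.update(
        month=day.month,
        day_of_week=dow,
        is_weekend=int(dow in (4, 5)),  # Friday/Saturday
        country_encoded=country_encoded,
        category_encoded=category_encoded,
        temperature=float(temperature),
        year=day.year,
    )
    for name, value in flags.items():
        if name not in row:
            raise KeyError(f"unknown feature: {name}")
        row[name] = value
    return [row[name] for name in FEATURE_ORDER]


# The encoded values 0 and 0 are placeholders, not real encoder output.
row = build_feature_row(date(2025, 1, 10), 0, 0, 22.0, is_shopping_festival=1)
print(row)
```

10 January 2025 is a Friday, so the row carries `day_of_week=4` and `is_weekend=1`. Wrapping the result in `np.array([row])` gives the 2-D input shape that `model.predict` expects.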
### Product Categories
12 categories based on real GCC e-commerce revenue shares:
- fashion_apparel (25-38% of revenue)
- electronics_media (19-34%)
- groceries_fmcg (15-30%)
- beauty_cosmetics (5-10%)
- home_furniture (3-8%)
- luxury_goods
- jewelry_watches
- health_wellness
- food_dining
- sports_outdoor
- toys_kids
- travel_entertainment
## Model Performance
| Metric | Value |
|--------|-------|
| R2 Score | 0.992 |
| RMSE | 2.94 |
| MAE | 2.28 |
Trained on 168,307 samples and tested on 42,077 samples (an 80/20 split of the 210,384 records).
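For reference, the three metrics in the table follow the standard regression definitions. The sketch below computes them in plain Python on made-up toy values, not on the model's actual predictions; the function name `regression_metrics` is illustrative only.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, rmse, mae) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                 # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    return r2, rmse, mae


# Toy values for illustration only.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
r2, rmse, mae = regression_metrics(y_true, y_pred)
print(f"R2={r2:.3f} RMSE={rmse:.2f} MAE={mae:.2f}")
```

Because `demand_index` ranges from 0 to 100, RMSE and MAE can be read directly as index points.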
### Feature Importance (Top 5)
1. **category_encoded** (80.1%): product category is the dominant predictor
2. **is_shopping_festival** (4.8%): DSF, Riyadh Season, etc.
3. **is_eid_fitr** (3.3%): Eid al-Fitr celebrations
4. **is_white_friday** (2.5%): White Friday sales events
5. **ramadan_week** (2.2%): week within Ramadan
## Usage
```python
import joblib
import numpy as np
# Load model and encoders
model = joblib.load("model.joblib")
encoders = joblib.load("encoders.joblib")
country_encoder = encoders['country_encoder']
category_encoder = encoders['category_encoder']
# Prepare features
country_encoded = country_encoder.transform(["UAE"])[0]
category_encoded = category_encoder.transform(["fashion_apparel"])[0]
# Feature order: month, day_of_week, is_weekend, country_encoded, category_encoded,
# temperature, is_ramadan, ramadan_week, is_eid_fitr, is_eid_adha,
# is_shopping_festival, is_white_friday, is_national_day, is_back_to_school, year
features = np.array([[
    1,                 # month (January)
    4,                 # day_of_week (Friday)
    1,                 # is_weekend
    country_encoded,
    category_encoded,
    22.0,              # temperature
    0,                 # is_ramadan
    0,                 # ramadan_week
    0,                 # is_eid_fitr
    0,                 # is_eid_adha
    1,                 # is_shopping_festival (DSF in January)
    0,                 # is_white_friday
    0,                 # is_national_day
    0,                 # is_back_to_school
    2025,              # year
]])
prediction = model.predict(features)
print(f"Predicted demand index: {prediction[0]:.2f}")
```
## Training Data
210,384 records across 6 GCC countries, 12 product categories, 2018-2025. Based on actual GCC retail market research:
- **UAE**: $114B market, tourism and luxury driven
- **KSA**: $161B market, largest by volume, Vision 2030 growth
- **Qatar**: $19.5B, high per-capita spend
- **Kuwait**: $22.6B, strong grocery/FMCG
- **Bahrain**: $8.5B, regional hub
- **Oman**: $12.0B, emerging e-commerce
Country-specific events: Dubai Shopping Festival, Riyadh Season, Shop Qatar, Hala February, Bahrain F1, Khareef Festival.
Dataset: [GencoDiv/gcc-ramadan-retail-patterns](https://huggingface.co/datasets/GencoDiv/gcc-ramadan-retail-patterns)
## Limitations
- Trained on synthetic data; fine-tune on real retail data before production use
- Predictions are most accurate within the feature ranges seen during training
- Country and category must be from the predefined lists
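One way to guard against the last two limitations is to check inputs before encoding them. The sketch below is a hypothetical pre-check, not part of the released files. The label strings are copied from this model card, and whether they match the strings stored in `encoders.joblib` is an assumption that should be confirmed against the encoders' `classes_` attributes.

```python
# Labels taken from this model card; the exact strings inside
# encoders.joblib are an assumption (compare with encoder.classes_).
COUNTRIES = {"UAE", "KSA", "Qatar", "Kuwait", "Bahrain", "Oman"}
CATEGORIES = {
    "fashion_apparel", "electronics_media", "groceries_fmcg",
    "beauty_cosmetics", "home_furniture", "luxury_goods",
    "jewelry_watches", "health_wellness", "food_dining",
    "sports_outdoor", "toys_kids", "travel_entertainment",
}
YEAR_RANGE = (2018, 2025)  # years covered by the training data


def check_inputs(country, category, year):
    """Hypothetical pre-check: return a list of problems (empty if none)."""
    problems = []
    if country not in COUNTRIES:
        problems.append(f"unknown country: {country!r}")
    if category not in CATEGORIES:
        problems.append(f"unknown category: {category!r}")
    if not YEAR_RANGE[0] <= year <= YEAR_RANGE[1]:
        problems.append(f"year {year} is outside the training range 2018-2025")
    return problems


print(check_inputs("UAE", "fashion_apparel", 2025))
print(check_inputs("Egypt", "cars", 2030))
```

Running a check like this first yields a readable message instead of the `ValueError` that scikit-learn's `LabelEncoder.transform` raises for unseen labels.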
## Files
- `model.joblib`: Trained XGBoost model (sklearn-compatible)
- `model.json`: XGBoost model in JSON format
- `encoders.joblib`: Label encoders for country and category
- `config.json`: Model configuration and metadata
- `feature_importances.json`: Feature importance scores
## About OCG Dubai
[OCG Dubai](https://ocg-dubai.ae) builds Agentic Commerce APIs for the GCC market: demand forecasting, halal compliance, smart baskets, and dynamic pricing calibrated for regional consumer behavior.
- Website: [ocg-dubai.ae](https://ocg-dubai.ae)
## License
MIT License