GencoDiv committed on
Commit 5231c91 · verified · 1 Parent(s): 6873db5

Update README for v2: 12 categories, R2=0.992, no Hijri

Files changed (1)
  1. README.md +81 -62
README.md CHANGED
@@ -4,70 +4,81 @@ tags:
4
  - tabular-regression
5
  - demand-forecasting
6
  - retail
7
- - islamic-calendar
8
- - ramadan
9
  - xgboost
10
  - sklearn
11
  - gcc
12
  - agentic-commerce
13
  - ocg-dubai
 
 
14
  library_name: sklearn
15
  pipeline_tag: tabular-regression
16
  ---
17
 
18
- # GCC Ramadan Retail Demand Forecasting Model
19
 
20
- > Built by [OCG Dubai](https://ocg-dubai.ae) - Agentic Commerce for the GCC
21
 
22
- An XGBoost Regressor model for predicting retail demand in GCC (Gulf Cooperation Council) countries based on Islamic calendar features. Part of [OCG Dubai's](https://ocg-dubai.ae) Agentic Commerce APIs for the Gulf region.
23
 
24
  ## Model Description
25
 
26
- This model predicts the `demand_index` for retail products in GCC countries (UAE, KSA, Qatar, Kuwait, Bahrain, Oman) across various product categories. The model leverages Islamic calendar features including Ramadan timing, Eid celebrations, and Hajj season to capture the unique seasonal patterns in GCC retail markets.
27
 
28
  ### Features Used
29
 
30
  | Feature | Type | Description |
31
  |---------|------|-------------|
32
- | is_ramadan | Binary (0/1) | Whether it is Ramadan |
33
- | ramadan_week | Integer (0-5) | Week of Ramadan (0 if not Ramadan) |
34
- | days_to_eid | Integer (-1 to 30) | Days until Eid al-Fitr (-1 if not applicable) |
35
- | is_eid_fitr | Binary (0/1) | Whether it is Eid al-Fitr |
36
- | is_eid_adha | Binary (0/1) | Whether it is Eid al-Adha |
37
- | is_hajj_season | Binary (0/1) | Whether it is Hajj season |
38
- | country | Categorical | Country (UAE, KSA, Qatar, Kuwait, Bahrain, Oman) |
39
- | category | Categorical | Product category |
40
- | temperature | Float | Temperature in Celsius |
41
- | day_of_week | Integer (0-6) | Day of week (Monday=0) |
42
  | month | Integer (1-12) | Gregorian month |
43
- | hijri_month | Integer (1-12) | Hijri month |
44
- | hijri_day | Integer (1-30) | Hijri day |
 
 
 
 
 
 
 
 
 
 
 
 
45
 
46
  ### Product Categories
47
 
48
- - dates_sweets
49
- - electronics
50
- - fashion_abayas
51
- - gifts
52
- - groceries
53
- - perfumes_oud
 
 
 
 
 
 
 
 
54
 
55
  ## Model Performance
56
 
57
  | Metric | Value |
58
  |--------|-------|
59
- | R2 Score | 0.91 |
60
- | RMSE | 5.13 |
61
- | MAE | 3.91 |
62
 
63
- ### Feature Importance
64
 
65
- The top 5 most important features:
66
- 1. **is_hajj_season** (60.7%) - Hajj season is the strongest predictor
67
- 2. **is_eid_adha** (9.9%) - Eid al-Adha celebrations
68
- 3. **days_to_eid** (8.2%) - Proximity to Eid al-Fitr
69
- 4. **ramadan_week** (8.0%) - Week of Ramadan
70
- 5. **category_encoded** (4.4%) - Product category effects
 
71
 
72
  ## Usage
73
 
@@ -83,26 +94,28 @@ category_encoder = encoders['category_encoder']
83
 
84
  # Prepare features
85
  country_encoded = country_encoder.transform(["UAE"])[0]
86
- category_encoded = category_encoder.transform(["groceries"])[0]
87
 
88
- # Feature order: is_ramadan, ramadan_week, days_to_eid, is_eid_fitr, is_eid_adha,
89
- # is_hajj_season, country_encoded, category_encoded, temperature,
90
- # day_of_week, month, hijri_month, hijri_day
91
 
92
  features = np.array([[
93
- 1, # is_ramadan
94
- 2, # ramadan_week
95
- 15, # days_to_eid
96
- 0, # is_eid_fitr
97
- 0, # is_eid_adha
98
- 0, # is_hajj_season
99
  country_encoded,
100
  category_encoded,
101
- 30.0, # temperature
102
- 4, # day_of_week (Friday)
103
- 4, # month
104
- 9, # hijri_month (Ramadan)
105
- 15 # hijri_day
 
 
 
 
 
106
  ]])
107
 
108
  prediction = model.predict(features)
@@ -111,32 +124,38 @@ print(f"Predicted demand index: {prediction[0]:.2f}")
111
 
112
  ## Training Data
113
 
114
- The model was trained on a synthetic dataset representing 15+ years of retail demand data from 6 GCC countries across 6 product categories. The dataset captures seasonal patterns associated with:
115
- - Ramadan shopping behavior
116
- - Eid al-Fitr celebrations
117
- - Eid al-Adha celebrations
118
- - Hajj season
 
 
 
 
 
 
 
119
 
120
  ## Limitations
121
 
122
- - Model is trained on synthetic data and should be fine-tuned on real retail data before production use
123
  - Predictions are most accurate within the feature ranges seen during training
124
  - Country and category must be from the predefined lists
125
 
126
  ## Files
127
 
128
- - `model.joblib` - Trained XGBoost model (sklearn-compatible)
129
- - `model.json` - XGBoost model in JSON format
130
- - `encoders.joblib` - Label encoders for country and category
131
- - `config.json` - Model configuration and metadata
132
- - `feature_importances.json` - Feature importance scores
133
 
134
  ## About OCG Dubai
135
 
136
- [OCG Dubai](https://ocg-dubai.ae) builds Agentic Commerce APIs for the GCC market: demand forecasting, halal compliance, and dynamic pricing calibrated for Islamic calendar seasonality and regional consumer behavior.
137
 
138
  - Website: [ocg-dubai.ae](https://ocg-dubai.ae)
139
- - Dataset: [GencoDiv/gcc-ramadan-retail-patterns](https://huggingface.co/datasets/GencoDiv/gcc-ramadan-retail-patterns)
140
 
141
  ## License
142
 
 
4
  - tabular-regression
5
  - demand-forecasting
6
  - retail
 
 
7
  - xgboost
8
  - sklearn
9
  - gcc
10
  - agentic-commerce
11
  - ocg-dubai
12
+ - gulf-retail
13
+ - e-commerce
14
  library_name: sklearn
15
  pipeline_tag: tabular-regression
16
  ---
17
 
18
+ # GCC Retail Demand Forecasting Model v2
19
 
20
+ > Built by [OCG Dubai](https://ocg-dubai.ae) - Agentic Commerce APIs for the GCC
21
 
22
+ An XGBoost Regressor model for predicting retail demand across 6 GCC countries and 12 product categories. Uses the Gregorian calendar with regional event flags and has no Hijri calendar dependency. Part of [OCG Dubai's](https://ocg-dubai.ae) Agentic Commerce APIs.
23
 
24
  ## Model Description
25
 
26
+ This model predicts the `demand_index` (0-100) for retail products in GCC countries (UAE, KSA, Qatar, Kuwait, Bahrain, Oman) across 12 realistic product categories based on actual e-commerce revenue data. It captures seasonal patterns including Ramadan, Eid, shopping festivals (DSF, Riyadh Season, White Friday), and country-specific events.
27
 
28
  ### Features Used
29
 
30
  | Feature | Type | Description |
31
  |---------|------|-------------|
 
 
 
 
 
 
 
 
 
 
32
  | month | Integer (1-12) | Gregorian month |
33
+ | day_of_week | Integer (0-6) | Day of week (Monday=0) |
34
+ | is_weekend | Binary (0/1) | Friday/Saturday (GCC weekend) |
35
+ | country_encoded | Integer | Label-encoded country |
36
+ | category_encoded | Integer | Label-encoded product category |
37
+ | temperature | Float | Temperature in Celsius |
38
+ | is_ramadan | Binary (0/1) | Whether it is Ramadan |
39
+ | ramadan_week | Integer (0-4) | Week of Ramadan (0 if not Ramadan) |
40
+ | is_eid_fitr | Binary (0/1) | Eid al-Fitr period |
41
+ | is_eid_adha | Binary (0/1) | Eid al-Adha period |
42
+ | is_shopping_festival | Binary (0/1) | Dubai Shopping Festival, Riyadh Season, Shop Qatar, etc. |
43
+ | is_white_friday | Binary (0/1) | White Friday / Black Friday sales |
44
+ | is_national_day | Binary (0/1) | Country national day events |
45
+ | is_back_to_school | Binary (0/1) | Back-to-school season |
46
+ | year | Integer | Year (2018-2025) |
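The 15 inputs in this table are passed to the model positionally, so a small helper that derives the calendar fields from a date and emits every value in a fixed order can reduce ordering mistakes. The following is a minimal sketch, not part of the published model files: `FEATURE_ORDER`, `calendar_features`, and `build_feature_row` are hypothetical names, the order is copied from the "Feature order" comment in the README usage example, and defaulting unspecified event flags to 0 is an assumption.

```python
from datetime import date

# Order copied from the "Feature order" comment in the README usage example.
FEATURE_ORDER = [
    "month", "day_of_week", "is_weekend", "country_encoded", "category_encoded",
    "temperature", "is_ramadan", "ramadan_week", "is_eid_fitr", "is_eid_adha",
    "is_shopping_festival", "is_white_friday", "is_national_day",
    "is_back_to_school", "year",
]


def calendar_features(d: date) -> dict:
    """Derive the Gregorian calendar fields from a date."""
    dow = d.weekday()  # Monday=0 ... Sunday=6, matching the feature table
    return {
        "month": d.month,
        "day_of_week": dow,
        "is_weekend": int(dow in (4, 5)),  # Friday/Saturday (GCC weekend)
        "year": d.year,
    }


def build_feature_row(d, country_encoded, category_encoded, temperature, **flags):
    """Return the 15 feature values as a list in FEATURE_ORDER.

    Event flags not passed explicitly default to 0 (an assumption).
    """
    unknown = set(flags) - set(FEATURE_ORDER)
    if unknown:
        raise ValueError(f"Unknown feature names: {sorted(unknown)}")
    values = {name: 0 for name in FEATURE_ORDER}
    values.update(calendar_features(d))
    values["country_encoded"] = country_encoded
    values["category_encoded"] = category_encoded
    values["temperature"] = temperature
    values.update(flags)
    return [values[name] for name in FEATURE_ORDER]


# Placeholder encoded values (0, 0); real ones come from the label encoders.
row = build_feature_row(date(2025, 1, 3), 0, 0, 22.0, is_shopping_festival=1)
print(row)
```

Wrapping the returned list in `np.array([row])` gives the two-dimensional shape that the usage example passes to `model.predict`.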
47
 
48
  ### Product Categories
49
 
50
+ 12 categories based on real GCC e-commerce revenue shares:
51
+
52
+ - fashion_apparel (25-38% of revenue)
53
+ - electronics_media (19-34%)
54
+ - groceries_fmcg (15-30%)
55
+ - beauty_cosmetics (5-10%)
56
+ - home_furniture (3-8%)
57
+ - luxury_goods
58
+ - jewelry_watches
59
+ - health_wellness
60
+ - food_dining
61
+ - sports_outdoor
62
+ - toys_kids
63
+ - travel_entertainment
64
 
65
  ## Model Performance
66
 
67
  | Metric | Value |
68
  |--------|-------|
69
+ | R2 Score | 0.992 |
70
+ | RMSE | 2.94 |
71
+ | MAE | 2.28 |
72
 
73
+ Trained on 168,307 samples and tested on 42,077 samples (an 80/20 split of the 210,384 records).
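The model card reports R2, RMSE, and MAE without stating how they were computed. Assuming the standard definitions apply here, the dependency-free sketch below evaluates all three on toy numbers; `regression_metrics` is a hypothetical helper and the sample values are illustrative, not model output.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute R2, RMSE, and MAE using their standard definitions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "rmse": rmse, "mae": mae}


# Toy values only; the real evaluation would use the held-out test set.
metrics = regression_metrics([1, 2, 3, 4], [1, 2, 3, 5])
print(metrics)
```

On a `demand_index` scale of 0-100, the reported RMSE of 2.94 and MAE of 2.28 correspond to typical errors of roughly two to three index points.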
74
 
75
+ ### Feature Importance (Top 5)
76
+
77
+ 1. **category_encoded** (80.1%) - Product category is the dominant predictor
78
+ 2. **is_shopping_festival** (4.8%) - DSF, Riyadh Season, etc.
79
+ 3. **is_eid_fitr** (3.3%) - Eid al-Fitr celebrations
80
+ 4. **is_white_friday** (2.5%) - White Friday sales events
81
+ 5. **ramadan_week** (2.2%) - Week within Ramadan
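The five shares listed above do not add up to 100%. The short calculation below uses only the percentages quoted in the list to show how much importance is left for the other ten features; `top5` is an illustrative name and nothing here is read from `feature_importances.json`.

```python
# Percentages copied from the top-5 list in the README.
top5 = {
    "category_encoded": 80.1,
    "is_shopping_festival": 4.8,
    "is_eid_fitr": 3.3,
    "is_white_friday": 2.5,
    "ramadan_week": 2.2,
}

top5_total = round(sum(top5.values()), 1)
remaining = round(100.0 - top5_total, 1)  # shared by the other 10 features

print(f"Top 5 total: {top5_total:.1f}%")        # Top 5 total: 92.9%
print(f"Remaining features: {remaining:.1f}%")  # Remaining features: 7.1%
```

With about 80% of the importance on `category_encoded`, most of the explained variance reflects differences between categories rather than calendar effects.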
82
 
83
  ## Usage
84
 
 
94
 
95
  # Prepare features
96
  country_encoded = country_encoder.transform(["UAE"])[0]
97
+ category_encoded = category_encoder.transform(["fashion_apparel"])[0]
98
 
99
+ # Feature order: month, day_of_week, is_weekend, country_encoded, category_encoded,
100
+ # temperature, is_ramadan, ramadan_week, is_eid_fitr, is_eid_adha,
101
+ # is_shopping_festival, is_white_friday, is_national_day, is_back_to_school, year
102
 
103
  features = np.array([[
104
+ 1, # month (January)
105
+ 4, # day_of_week (Friday)
106
+ 1, # is_weekend
 
 
 
107
  country_encoded,
108
  category_encoded,
109
+ 22.0, # temperature
110
+ 0, # is_ramadan
111
+ 0, # ramadan_week
112
+ 0, # is_eid_fitr
113
+ 0, # is_eid_adha
114
+ 1, # is_shopping_festival (DSF in January)
115
+ 0, # is_white_friday
116
+ 0, # is_national_day
117
+ 0, # is_back_to_school
118
+ 2025 # year
119
  ]])
120
 
121
  prediction = model.predict(features)
 
124
 
125
  ## Training Data
126
 
127
+ 210,384 records across 6 GCC countries, 12 product categories, 2018-2025. Based on actual GCC retail market research:
128
+
129
+ - **UAE**: $114B market, tourism & luxury driven
130
+ - **KSA**: $161B market, largest by volume, Vision 2030 growth
131
+ - **Qatar**: $19.5B, high per-capita spend
132
+ - **Kuwait**: $22.6B, strong grocery/FMCG
133
+ - **Bahrain**: $8.5B, regional hub
134
+ - **Oman**: $12.0B, emerging e-commerce
135
+
136
+ Country-specific events: Dubai Shopping Festival, Riyadh Season, Shop Qatar, Hala February, Bahrain F1, Khareef Festival.
137
+
138
+ Dataset: [GencoDiv/gcc-ramadan-retail-patterns](https://huggingface.co/datasets/GencoDiv/gcc-ramadan-retail-patterns)
139
 
140
  ## Limitations
141
 
142
+ - Trained on synthetic data; fine-tune on real retail data before production use
143
  - Predictions are most accurate within the feature ranges seen during training
144
  - Country and category must be from the predefined lists
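Because country and category must come from the predefined lists, checking inputs before encoding can give a clearer message than an error raised inside the encoder. The following is a minimal sketch under stated assumptions: `COUNTRIES`, `CATEGORIES`, and `validate_inputs` are hypothetical names, the lists are copied from this README rather than read from `encoders.joblib`, and the year bounds come from the feature table. Temperature is not checked because the training range is not stated.

```python
# Values copied from the README; the authoritative lists live in the encoders.
COUNTRIES = {"UAE", "KSA", "Qatar", "Kuwait", "Bahrain", "Oman"}
CATEGORIES = {
    "fashion_apparel", "electronics_media", "groceries_fmcg", "beauty_cosmetics",
    "home_furniture", "luxury_goods", "jewelry_watches", "health_wellness",
    "food_dining", "sports_outdoor", "toys_kids", "travel_entertainment",
}
YEAR_RANGE = (2018, 2025)  # range given in the feature table


def validate_inputs(country: str, category: str, year: int) -> list:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if country not in COUNTRIES:
        problems.append(f"Unknown country: {country!r}")
    if category not in CATEGORIES:
        problems.append(f"Unknown category: {category!r}")
    if not YEAR_RANGE[0] <= year <= YEAR_RANGE[1]:
        problems.append(f"Year {year} is outside the training range {YEAR_RANGE}")
    return problems


print(validate_inputs("UAE", "fashion_apparel", 2025))
print(validate_inputs("Egypt", "dates_sweets", 2030))
```

Note that `dates_sweets` was a v1 category and is rejected here, which is the kind of mistake a check like this is meant to surface when migrating callers to v2.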
145
 
146
  ## Files
147
 
148
+ - `model.joblib` - Trained XGBoost model (sklearn-compatible)
149
+ - `model.json` - XGBoost model in JSON format
150
+ - `encoders.joblib` - Label encoders for country and category
151
+ - `config.json` - Model configuration and metadata
152
+ - `feature_importances.json` - Feature importance scores
153
 
154
  ## About OCG Dubai
155
 
156
+ [OCG Dubai](https://ocg-dubai.ae) builds Agentic Commerce APIs for the GCC market: demand forecasting, halal compliance, smart baskets, and dynamic pricing calibrated for regional consumer behavior.
157
 
158
  - Website: [ocg-dubai.ae](https://ocg-dubai.ae)
 
159
 
160
  ## License
161