GencoDiv committed on
Commit
fcf905c
·
verified ·
1 Parent(s): 7e26c47

Upload folder using huggingface_hub

Files changed (8)
  1. README.md +132 -0
  2. config.json +57 -0
  3. encoders.joblib +3 -0
  4. feature_importances.json +15 -0
  5. inference.py +227 -0
  6. model.joblib +3 -0
  7. model.json +0 -0
  8. requirements.txt +3 -0
README.md ADDED
@@ -0,0 +1,132 @@
+ ---
+ license: mit
+ tags:
+ - tabular-regression
+ - demand-forecasting
+ - retail
+ - islamic-calendar
+ - ramadan
+ - xgboost
+ - sklearn
+ - gcc
+ library_name: sklearn
+ pipeline_tag: tabular-regression
+ ---
+
+ # GCC Ramadan Retail Demand Forecasting Model
+
+ An XGBoost Regressor model for predicting retail demand in GCC (Gulf Cooperation Council) countries based on Islamic calendar features.
+
+ ## Model Description
+
+ This model predicts the `demand_index` for retail products in GCC countries (UAE, KSA, Qatar, Kuwait, Bahrain, Oman) across various product categories. The model leverages Islamic calendar features including Ramadan timing, Eid celebrations, and Hajj season to capture the unique seasonal patterns in GCC retail markets.
+
+ ### Features Used
+
+ | Feature | Type | Description |
+ |---------|------|-------------|
+ | is_ramadan | Binary (0/1) | Whether it is Ramadan |
+ | ramadan_week | Integer (0-5) | Week of Ramadan (0 if not Ramadan) |
+ | days_to_eid | Integer (-1 to 30) | Days until Eid al-Fitr (-1 if not applicable) |
+ | is_eid_fitr | Binary (0/1) | Whether it is Eid al-Fitr |
+ | is_eid_adha | Binary (0/1) | Whether it is Eid al-Adha |
+ | is_hajj_season | Binary (0/1) | Whether it is Hajj season |
+ | country | Categorical | Country (UAE, KSA, Qatar, Kuwait, Bahrain, Oman) |
+ | category | Categorical | Product category |
+ | temperature | Float | Temperature in Celsius |
+ | day_of_week | Integer (0-6) | Day of week (Monday=0) |
+ | month | Integer (1-12) | Gregorian month |
+ | hijri_month | Integer (1-12) | Hijri month |
+ | hijri_day | Integer (1-30) | Hijri day |
+
+ ### Product Categories
+
+ - dates_sweets
+ - electronics
+ - fashion_abayas
+ - gifts
+ - groceries
+ - perfumes_oud
+
+ ## Model Performance
+
+ | Metric | Value |
+ |--------|-------|
+ | R2 Score | 0.91 |
+ | RMSE | 5.13 |
+ | MAE | 3.91 |
+
+ ### Feature Importance
+
+ The top 5 most important features:
+ 1. **is_hajj_season** (60.7%) - Hajj season is the strongest predictor
+ 2. **is_eid_adha** (9.9%) - Eid al-Adha celebrations
+ 3. **days_to_eid** (8.2%) - Proximity to Eid al-Fitr
+ 4. **ramadan_week** (8.0%) - Week of Ramadan
+ 5. **category_encoded** (4.4%) - Product category effects
+
+ ## Usage
+
+ ```python
+ import joblib
+ import numpy as np
+
+ # Load model and encoders
+ model = joblib.load("model.joblib")
+ encoders = joblib.load("encoders.joblib")
+ country_encoder = encoders['country_encoder']
+ category_encoder = encoders['category_encoder']
+
+ # Prepare features
+ country_encoded = country_encoder.transform(["UAE"])[0]
+ category_encoded = category_encoder.transform(["groceries"])[0]
+
+ # Feature order: is_ramadan, ramadan_week, days_to_eid, is_eid_fitr, is_eid_adha,
+ #                is_hajj_season, country_encoded, category_encoded, temperature,
+ #                day_of_week, month, hijri_month, hijri_day
+
+ features = np.array([[
+     1,     # is_ramadan
+     2,     # ramadan_week
+     15,    # days_to_eid
+     0,     # is_eid_fitr
+     0,     # is_eid_adha
+     0,     # is_hajj_season
+     country_encoded,
+     category_encoded,
+     30.0,  # temperature
+     4,     # day_of_week (Friday)
+     4,     # month
+     9,     # hijri_month (Ramadan)
+     15     # hijri_day
+ ]])
+
+ prediction = model.predict(features)
+ print(f"Predicted demand index: {prediction[0]:.2f}")
+ ```
+
+ ## Training Data
+
+ The model was trained on a synthetic dataset representing 15+ years of retail demand data from 6 GCC countries across 6 product categories. The dataset captures seasonal patterns associated with:
+ - Ramadan shopping behavior
+ - Eid al-Fitr celebrations
+ - Eid al-Adha celebrations
+ - Hajj season
+
+ ## Limitations
+
+ - Model is trained on synthetic data and should be fine-tuned on real retail data before production use
+ - Predictions are most accurate within the feature ranges seen during training
+ - Country and category must be from the predefined lists
+
+ ## Files
+
+ - `model.joblib` - Trained XGBoost model (sklearn-compatible)
+ - `model.json` - XGBoost model in JSON format
+ - `encoders.joblib` - Label encoders for country and category
+ - `config.json` - Model configuration and metadata
+ - `feature_importances.json` - Feature importance scores
+
+ ## License
+
+ MIT License
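The feature table in the README gives a value range for each numeric input, and the Limitations section notes that predictions are most reliable inside those ranges. A minimal sketch of a pre-flight range check follows. `FEATURE_RANGES` and `check_features` are hypothetical names that do not exist in the repository; the bounds are copied from the README table, and `temperature` is left out because the table gives no range for it.

```python
# Hypothetical helper (not part of the repository): checks numeric inputs
# against the ranges listed in the README feature table.
FEATURE_RANGES = {
    "is_ramadan": (0, 1),
    "ramadan_week": (0, 5),
    "days_to_eid": (-1, 30),
    "is_eid_fitr": (0, 1),
    "is_eid_adha": (0, 1),
    "is_hajj_season": (0, 1),
    "day_of_week": (0, 6),
    "month": (1, 12),
    "hijri_month": (1, 12),
    "hijri_day": (1, 30),
}


def check_features(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name, (low, high) in FEATURE_RANGES.items():
        if name not in record:
            problems.append(f"missing feature: {name}")
        elif not low <= record[name] <= high:
            problems.append(f"{name}={record[name]} outside [{low}, {high}]")
    return problems


# Same values as the README usage snippet (categorical inputs omitted).
example = {
    "is_ramadan": 1, "ramadan_week": 2, "days_to_eid": 15,
    "is_eid_fitr": 0, "is_eid_adha": 0, "is_hajj_season": 0,
    "day_of_week": 4, "month": 4, "hijri_month": 9, "hijri_day": 15,
}
print(check_features(example))
print(check_features({**example, "hijri_day": 31}))
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range field at once; whether that suits a given pipeline is a design choice, not something the repository prescribes.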
config.json ADDED
@@ -0,0 +1,57 @@
+ {
+   "model_type": "GradientBoostingRegressor",
+   "feature_columns": [
+     "is_ramadan",
+     "ramadan_week",
+     "days_to_eid",
+     "is_eid_fitr",
+     "is_eid_adha",
+     "is_hajj_season",
+     "country_encoded",
+     "category_encoded",
+     "temperature",
+     "day_of_week",
+     "month",
+     "hijri_month",
+     "hijri_day"
+   ],
+   "countries": [
+     "Bahrain",
+     "KSA",
+     "Kuwait",
+     "Oman",
+     "Qatar",
+     "UAE"
+   ],
+   "categories": [
+     "dates_sweets",
+     "electronics",
+     "fashion_abayas",
+     "gifts",
+     "groceries",
+     "perfumes_oud"
+   ],
+   "metrics": {
+     "mse": 24.44782802085175,
+     "rmse": 4.944474493902436,
+     "mae": 3.8228789470187383,
+     "r2_score": 0.905699037541998
+   },
+   "description": "GCC Ramadan Retail Demand Forecasting Model",
+   "input_features": {
+     "is_ramadan": "Boolean (0/1) - Whether it is Ramadan",
+     "ramadan_week": "Integer (0-5) - Week of Ramadan (0 if not Ramadan)",
+     "days_to_eid": "Integer (-1 to 30) - Days until Eid al-Fitr (-1 if not applicable)",
+     "is_eid_fitr": "Boolean (0/1) - Whether it is Eid al-Fitr",
+     "is_eid_adha": "Boolean (0/1) - Whether it is Eid al-Adha",
+     "is_hajj_season": "Boolean (0/1) - Whether it is Hajj season",
+     "country": "String - One of: UAE, KSA, Qatar, Kuwait, Bahrain, Oman",
+     "category": "String - Product category",
+     "temperature": "Float - Temperature in Celsius",
+     "day_of_week": "Integer (0-6) - Day of week (Monday=0)",
+     "month": "Integer (1-12) - Gregorian month",
+     "hijri_month": "Integer (1-12) - Hijri month",
+     "hijri_day": "Integer (1-30) - Hijri day"
+   },
+   "output": "demand_index - Predicted retail demand index"
+ }
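The `metrics` block in config.json stores both `mse` and `rmse`, and the README reports rounded values for some of the same metrics. The sketch below checks that the two stay in step. The helper names `rmse_matches_mse` and `rounded_mismatches` are hypothetical, and the numbers are copied by hand from config.json and the README "Model Performance" table rather than read from the files.

```python
import math

# Values copied from the "metrics" block of config.json.
config_metrics = {
    "mse": 24.44782802085175,
    "rmse": 4.944474493902436,
    "mae": 3.8228789470187383,
    "r2_score": 0.905699037541998,
}
# Values copied from the README "Model Performance" table.
readme_metrics = {"r2_score": 0.91, "rmse": 5.13, "mae": 3.91}


def rmse_matches_mse(metrics, rel_tol=1e-9):
    """RMSE is by definition the square root of MSE; verify the stored pair agrees."""
    return math.isclose(math.sqrt(metrics["mse"]), metrics["rmse"], rel_tol=rel_tol)


def rounded_mismatches(precise, rounded, places=2):
    """Names of metrics whose precise value does not round to the reported value."""
    return sorted(
        name for name, value in rounded.items()
        if round(precise[name], places) != value
    )


print(rmse_matches_mse(config_metrics))
print(rounded_mismatches(config_metrics, readme_metrics))
```

With these values the stored RMSE agrees with the stored MSE, while the README figures for RMSE and MAE do not round-trip from config.json. That may mean the two files were written from different training runs; the commit itself does not say.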
encoders.joblib ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:da84e6211fe94a186197f3850a18bf6c3aae1f58360000a75f8590b8c5a2cc0a
+ size 850
feature_importances.json ADDED
@@ -0,0 +1,15 @@
+ {
+   "is_ramadan": 3.69275253834204e-05,
+   "ramadan_week": 0.003525943796247472,
+   "days_to_eid": 0.29290416306240935,
+   "is_eid_fitr": 0.0010144082359060686,
+   "is_eid_adha": 0.0012917867230512092,
+   "is_hajj_season": 0.10161264627734466,
+   "country_encoded": 0.11934693744055763,
+   "category_encoded": 0.3920205132356218,
+   "temperature": 0.022141510058809637,
+   "day_of_week": 0.0418962978935297,
+   "month": 0.02049986667570955,
+   "hijri_month": 0.00037238780289487874,
+   "hijri_day": 0.0033366112725345945
+ }
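The scores in feature_importances.json can be ranked directly to obtain a top-N list like the one in the README. A minimal sketch follows, with the values copied by hand from the JSON above. `top_features` is a hypothetical helper, and expressing each score as a percentage assumes the scores are normalized shares, which their sum of approximately 1 suggests.

```python
# Values copied from feature_importances.json.
importances = {
    "is_ramadan": 3.69275253834204e-05,
    "ramadan_week": 0.003525943796247472,
    "days_to_eid": 0.29290416306240935,
    "is_eid_fitr": 0.0010144082359060686,
    "is_eid_adha": 0.0012917867230512092,
    "is_hajj_season": 0.10161264627734466,
    "country_encoded": 0.11934693744055763,
    "category_encoded": 0.3920205132356218,
    "temperature": 0.022141510058809637,
    "day_of_week": 0.0418962978935297,
    "month": 0.02049986667570955,
    "hijri_month": 0.00037238780289487874,
    "hijri_day": 0.0033366112725345945,
}


def top_features(scores, k=5):
    """Return the k highest-scoring features as (name, percent) pairs."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(name, round(100 * score, 1)) for name, score in ranked[:k]]


for name, pct in top_features(importances):
    print(f"{name}: {pct}%")
```

Ranked this way, the JSON puts `category_encoded` and `days_to_eid` first, which differs from the ordering listed in the README; the commit does not indicate which of the two reflects the uploaded model.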
inference.py ADDED
@@ -0,0 +1,227 @@
+ """
+ GCC Ramadan Retail Demand Forecasting - Inference Script
+
+ This script demonstrates how to use the trained demand forecasting model.
+ When downloaded from Hugging Face, this script loads model.joblib, encoders.joblib, and config.json from its own directory.
+
+ Usage:
+     python inference.py
+
+ Or import and use programmatically:
+     from inference import DemandForecaster
+     forecaster = DemandForecaster()
+     prediction = forecaster.predict(...)
+ """
+
+ import joblib
+ import json
+ import numpy as np
+ import os
+
+ # Get model directory (same directory as this script)
+ MODEL_DIR = os.path.dirname(os.path.abspath(__file__))
+
+
+ class DemandForecaster:
+     """Class for loading and using the GCC Ramadan demand forecasting model."""
+
+     def __init__(self, model_dir=None):
+         """
+         Initialize the forecaster by loading the model and encoders.
+
+         Args:
+             model_dir: Path to directory containing model files.
+                        Defaults to same directory as this script.
+         """
+         self.model_dir = model_dir or MODEL_DIR
+         self._load_model()
+
+     def _load_model(self):
+         """Load the trained model, encoders, and configuration."""
+         # Load model
+         model_path = os.path.join(self.model_dir, "model.joblib")
+         self.model = joblib.load(model_path)
+
+         # Load encoders
+         encoders_path = os.path.join(self.model_dir, "encoders.joblib")
+         encoders = joblib.load(encoders_path)
+         self.country_encoder = encoders['country_encoder']
+         self.category_encoder = encoders['category_encoder']
+
+         # Load config
+         config_path = os.path.join(self.model_dir, "config.json")
+         with open(config_path, 'r', encoding='utf-8') as f:
+             self.config = json.load(f)
+
+         self.countries = self.config['countries']
+         self.categories = self.config['categories']
+
+     def predict(self,
+                 is_ramadan: int,
+                 ramadan_week: int,
+                 days_to_eid: int,
+                 is_eid_fitr: int,
+                 is_eid_adha: int,
+                 is_hajj_season: int,
+                 country: str,
+                 category: str,
+                 temperature: float,
+                 day_of_week: int,
+                 month: int,
+                 hijri_month: int,
+                 hijri_day: int) -> float:
+         """
+         Predict demand index for given features.
+
+         Args:
+             is_ramadan: 1 if Ramadan, 0 otherwise
+             ramadan_week: Week of Ramadan (1-5), 0 if not Ramadan
+             days_to_eid: Days until Eid al-Fitr (-1 if not applicable)
+             is_eid_fitr: 1 if Eid al-Fitr, 0 otherwise
+             is_eid_adha: 1 if Eid al-Adha, 0 otherwise
+             is_hajj_season: 1 if Hajj season, 0 otherwise
+             country: One of UAE, KSA, Qatar, Kuwait, Bahrain, Oman
+             category: Product category (dates_sweets, electronics, fashion_abayas,
+                       gifts, groceries, perfumes_oud)
+             temperature: Temperature in Celsius
+             day_of_week: Day of week (0-6, Monday=0)
+             month: Gregorian month (1-12)
+             hijri_month: Hijri month (1-12)
+             hijri_day: Hijri day (1-30)
+
+         Returns:
+             Predicted demand index (typically 30-200 range)
+         """
+         # Validate inputs
+         if country not in self.countries:
+             raise ValueError(f"Invalid country: {country}. Must be one of {self.countries}")
+         if category not in self.categories:
+             raise ValueError(f"Invalid category: {category}. Must be one of {self.categories}")
+
+         # Encode categorical features
+         country_encoded = self.country_encoder.transform([country])[0]
+         category_encoded = self.category_encoder.transform([category])[0]
+
+         # Create feature vector
+         features = np.array([[
+             is_ramadan,
+             ramadan_week,
+             days_to_eid,
+             is_eid_fitr,
+             is_eid_adha,
+             is_hajj_season,
+             country_encoded,
+             category_encoded,
+             temperature,
+             day_of_week,
+             month,
+             hijri_month,
+             hijri_day
+         ]])
+
+         # Make prediction
+         prediction = self.model.predict(features)[0]
+         return float(prediction)  # convert the NumPy scalar to match the annotated return type
+
+     def predict_dict(self, data: dict) -> float:
+         """
+         Predict demand index from a dictionary of features.
+
+         Args:
+             data: Dictionary with keys matching the predict() parameters
+
+         Returns:
+             Predicted demand index
+         """
+         return self.predict(**data)
+
+     def predict_batch(self, data_list: list) -> list:
+         """
+         Predict demand index for multiple records.
+
+         Args:
+             data_list: List of dictionaries with feature values
+
+         Returns:
+             List of predicted demand indices
+         """
+         return [self.predict(**record) for record in data_list]
+
+
+ def demo():
+     """Demonstrate the model with example predictions."""
+     print("=" * 60)
+     print("GCC Ramadan Retail Demand Forecasting - Demo")
+     print("=" * 60)
+
+     # Initialize forecaster
+     forecaster = DemandForecaster()
+
+     print(f"\nAvailable countries: {forecaster.countries}")
+     print(f"Available categories: {forecaster.categories}")
+     print(f"\nModel metrics: R2={forecaster.config['metrics']['r2_score']:.3f}, "
+           f"RMSE={forecaster.config['metrics']['rmse']:.2f}")
+
+     print("\n" + "-" * 60)
+     print("Example Predictions:")
+     print("-" * 60)
+
+     examples = [
+         {
+             "name": "Normal day in UAE (groceries)",
+             "params": {
+                 "is_ramadan": 0, "ramadan_week": 0, "days_to_eid": -1,
+                 "is_eid_fitr": 0, "is_eid_adha": 0, "is_hajj_season": 0,
+                 "country": "UAE", "category": "groceries", "temperature": 25.0,
+                 "day_of_week": 5, "month": 6, "hijri_month": 11, "hijri_day": 15
+             }
+         },
+         {
+             "name": "Ramadan Week 2 in KSA (dates_sweets)",
+             "params": {
+                 "is_ramadan": 1, "ramadan_week": 2, "days_to_eid": 15,
+                 "is_eid_fitr": 0, "is_eid_adha": 0, "is_hajj_season": 0,
+                 "country": "KSA", "category": "dates_sweets", "temperature": 30.0,
+                 "day_of_week": 4, "month": 4, "hijri_month": 9, "hijri_day": 15
+             }
+         },
+         {
+             "name": "Eid al-Fitr in Qatar (gifts)",
+             "params": {
+                 "is_ramadan": 0, "ramadan_week": 0, "days_to_eid": 0,
+                 "is_eid_fitr": 1, "is_eid_adha": 0, "is_hajj_season": 0,
+                 "country": "Qatar", "category": "gifts", "temperature": 35.0,
+                 "day_of_week": 0, "month": 5, "hijri_month": 10, "hijri_day": 1
+             }
+         },
+         {
+             "name": "Hajj season in KSA (perfumes_oud)",
+             "params": {
+                 "is_ramadan": 0, "ramadan_week": 0, "days_to_eid": -1,
+                 "is_eid_fitr": 0, "is_eid_adha": 0, "is_hajj_season": 1,
+                 "country": "KSA", "category": "perfumes_oud", "temperature": 40.0,
+                 "day_of_week": 3, "month": 7, "hijri_month": 12, "hijri_day": 8
+             }
+         },
+         {
+             "name": "Eid al-Adha in Kuwait (fashion_abayas)",
+             "params": {
+                 "is_ramadan": 0, "ramadan_week": 0, "days_to_eid": -1,
+                 "is_eid_fitr": 0, "is_eid_adha": 1, "is_hajj_season": 1,
+                 "country": "Kuwait", "category": "fashion_abayas", "temperature": 42.0,
+                 "day_of_week": 5, "month": 7, "hijri_month": 12, "hijri_day": 10
+             }
+         }
+     ]
+
+     for i, example in enumerate(examples, 1):
+         pred = forecaster.predict(**example["params"])
+         print(f"\n{i}. {example['name']}: {pred:.2f}")
+
+     print("\n" + "=" * 60)
+     print("Demo complete!")
+     print("=" * 60)
+
+
+ if __name__ == "__main__":
+     demo()
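`predict_batch` in inference.py calls `predict` once per record, so each record builds its own one-row array and triggers its own model call. A common alternative is to assemble all rows into one matrix and call `predict` a single time. The sketch below shows only the matrix-building step and makes no claim about how the repository should do it. `build_matrix` and `StubModel` are hypothetical, and the dictionary-based encoders stand in for the saved `LabelEncoder` objects on the assumption that those were fit on exactly the lists in config.json (scikit-learn's `LabelEncoder` assigns codes in sorted class order).

```python
# Column order copied from "feature_columns" in config.json.
FEATURE_ORDER = [
    "is_ramadan", "ramadan_week", "days_to_eid", "is_eid_fitr", "is_eid_adha",
    "is_hajj_season", "country_encoded", "category_encoded", "temperature",
    "day_of_week", "month", "hijri_month", "hijri_day",
]
COUNTRIES = ["Bahrain", "KSA", "Kuwait", "Oman", "Qatar", "UAE"]
CATEGORIES = ["dates_sweets", "electronics", "fashion_abayas",
              "gifts", "groceries", "perfumes_oud"]

# Stand-ins for the saved encoders (assumption: codes follow sorted class order).
COUNTRY_CODES = {name: code for code, name in enumerate(sorted(COUNTRIES))}
CATEGORY_CODES = {name: code for code, name in enumerate(sorted(CATEGORIES))}


def build_matrix(records):
    """Encode every record and return one row per record in FEATURE_ORDER."""
    rows = []
    for record in records:
        encoded = dict(record)
        # A KeyError here means the country or category is not in the known lists.
        encoded["country_encoded"] = COUNTRY_CODES[record["country"]]
        encoded["category_encoded"] = CATEGORY_CODES[record["category"]]
        rows.append([encoded[name] for name in FEATURE_ORDER])
    return rows


class StubModel:
    """Hypothetical placeholder that mimics the predict(X) call shape only."""

    def predict(self, matrix):
        return [float(len(row)) for row in matrix]


# Two parameter sets taken from the demo() examples.
records = [
    {"is_ramadan": 0, "ramadan_week": 0, "days_to_eid": -1,
     "is_eid_fitr": 0, "is_eid_adha": 0, "is_hajj_season": 0,
     "country": "UAE", "category": "groceries", "temperature": 25.0,
     "day_of_week": 5, "month": 6, "hijri_month": 11, "hijri_day": 15},
    {"is_ramadan": 1, "ramadan_week": 2, "days_to_eid": 15,
     "is_eid_fitr": 0, "is_eid_adha": 0, "is_hajj_season": 0,
     "country": "KSA", "category": "dates_sweets", "temperature": 30.0,
     "day_of_week": 4, "month": 4, "hijri_month": 9, "hijri_day": 15},
]
matrix = build_matrix(records)
predictions = StubModel().predict(matrix)  # one call for the whole batch
print(len(matrix), len(matrix[0]), predictions)
```

With the real artifacts, the stub would be replaced by the loaded model, the dictionaries by the loaded encoders' `transform` method, and the matrix would typically be wrapped in `np.array` before the single `predict` call.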
model.joblib ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:69bef2bf5c0f86a0819e3a66afcf5a196fa3335ba8e17d87e3b6e31a7c69401a
+ size 1760473
model.json ADDED
The diff for this file is too large to render. See raw diff
 
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ scikit-learn>=1.3.0
+ joblib>=1.3.0
+ numpy>=1.24.0