---
license: apache-2.0
library_name: scikit-learn
tags:
- tabular-regression
- sales-forecast
- gradient-boosting
- cross-sectional
datasets:
- dev02chandan/sales-forecast-dataset
metrics:
- rmse
- mae
- mape
- smape
---
# Sales Forecast Model (GBR)

**Task:** Predict `Product_Store_Sales_Total` from product and store attributes.

**Data:** dev02chandan/sales-forecast-dataset (`raw/SuperKart.csv`, with processed train/test splits under `processed/`).

**Model:** `GradientBoostingRegressor`, selected via `GroupKFold` cross-validation grouped on `Store_Id`.
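
The grouped cross-validation mentioned above can be sketched as follows. This is a minimal illustration on synthetic data, not the training script behind this card: the feature matrix, the four stand-in stores, `n_splits=4`, and the default `GradientBoostingRegressor` hyperparameters are all assumptions. The point it shows is that `GroupKFold` keeps every row of a given store on one side of each split, so the CV score reflects performance on stores the model has not seen.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GroupKFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): 200 rows, 3 numeric features, 4 stores.
n_rows = 200
X = rng.normal(size=(n_rows, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=n_rows)
store_id = np.arange(n_rows) % 4  # plays the role of Store_Id

cv = GroupKFold(n_splits=4)

# Each store's rows land entirely in train or entirely in validation.
for train_idx, val_idx in cv.split(X, y, groups=store_id):
    assert set(store_id[train_idx]).isdisjoint(store_id[val_idx])

# scikit-learn reports negated RMSE so that higher is better; flip the sign.
scores = cross_val_score(
    GradientBoostingRegressor(random_state=0),
    X,
    y,
    groups=store_id,
    cv=cv,
    scoring="neg_root_mean_squared_error",
)
cv_rmse = -scores.mean()
print(f"CV RMSE: {cv_rmse:.4f}")
```

With real data, `store_id` would be the `Store_Id` column and `X` the remaining feature columns.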
## Test Metrics

- CV RMSE (cross-validation score, for comparison): 1157.13
- RMSE: 1600.06
- MAE: 1405.57
- MAPE: 27.07
- sMAPE: 32.25
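
For reference, the error metrics listed above can be computed as in the sketch below. The formulas are the common textbook definitions. The card does not state which sMAPE variant or percentage scaling was used, so the `100.0 *` factor and the `(|y_true| + |y_pred|) / 2` denominator are assumptions, and the sample values are made up.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE (%) and sMAPE (%) for two equal-length sequences."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    # MAPE is undefined when a true value is zero; this sketch assumes none are.
    mape = 100.0 * sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    # Assumed sMAPE variant: mean of |error| / ((|y_true| + |y_pred|) / 2).
    smape = 100.0 * sum(
        abs(e) / ((abs(t) + abs(p)) / 2.0)
        for e, t, p in zip(errors, y_true, y_pred)
    ) / n
    return {"rmse": rmse, "mae": mae, "mape": mape, "smape": smape}


# Made-up values, for illustration only.
y_true = [100.0, 200.0, 400.0]
y_pred = [110.0, 180.0, 400.0]
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

RMSE is never smaller than MAE on the same predictions, which matches the ordering of the two values reported above.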
## Usage

```python
from huggingface_hub import hf_hub_download
import joblib
import pandas as pd

# Download the serialized pipeline from the Hugging Face Hub
pkl_path = hf_hub_download(
    repo_id="dev02chandan/sales-forecast-model",
    filename="model.pkl",
    repo_type="model",
)
model = joblib.load(pkl_path)

# X must contain the same columns used in training (one-hot encoding is inside the pipeline)
# Example:
# X = pd.DataFrame([...])
# y_pred = model.predict(X)
```