---
title: Rossmann Store Sales Forecasting
emoji: 📈
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# Rossmann Store Sales Forecasting
This repository is a small end-to-end machine learning and MLOps learning project built around the Rossmann Store Sales dataset. The goal is to predict daily store sales from tabular retail data, evaluate the model with time-aware validation, and expose the trained model through a lightweight API and demo interface.
Live demo: Hugging Face Space
## Overview
This project focuses on a compact but complete forecasting workflow:
- merge historical sales with store metadata
- engineer calendar, holiday, and store-level features
- train an XGBoost regressor on `log1p(Sales)`
- evaluate with a strict time-based holdout split and rolling backtests
- track runs locally with MLflow
- serve predictions through FastAPI and a small browser demo
I kept the project intentionally small. The emphasis is not on building a large platform, but on showing a coherent ML workflow with a thin deployment layer.
## Demo Snapshot
## Workflow

```mermaid
flowchart LR
    A["Raw sales data<br/>train.csv + store.csv"] --> B["Cleaning and feature engineering"]
    B --> C["XGBoost training"]
    C --> D["Time-based holdout + rolling backtest"]
    C --> E["Saved model artifact"]
    E --> F["FastAPI prediction service"]
    F --> G["Browser demo / API requests"]
```
## Method
The prediction target is daily store sales, `Sales`. The dataset comes from the Rossmann Kaggle competition and uses `train.csv` plus `store.csv`, with 1,017,209 raw rows and 844,338 rows after removing closed stores and zero-sales records.
The pipeline fills missing competition and promo fields, encodes store and holiday categories, and trains on `log1p(Sales)`. The final feature set has 28 columns built from:

- calendar features such as `DayOfWeek`, `Month`, and `IsWeekend`
- promotion and holiday indicators such as `Promo`, `StateHoliday`, and `SchoolHoliday`
- store metadata such as `StoreType`, `Assortment`, and `CompetitionDistance`
- engineered signals such as `LogCompetitionDistance`, Easter features, and Fourier seasonality terms
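As an illustration, calendar and Fourier features of this kind can be derived from the date alone. The function below is a hypothetical sketch, not the project's actual code in `src/training/`; the yearly Fourier terms assume a 365.25-day period.

```python
import math
from datetime import date

def calendar_features(d: date, k: int = 2) -> dict:
    """Sketch of calendar + Fourier feature engineering for one date.

    Returns day-of-week/month/weekend flags plus k pairs of yearly
    Fourier terms that encode smooth annual seasonality.
    """
    feats = {
        "DayOfWeek": d.isoweekday(),           # 1 = Monday ... 7 = Sunday
        "Month": d.month,
        "IsWeekend": int(d.isoweekday() >= 6),
    }
    day_of_year = d.timetuple().tm_yday
    for i in range(1, k + 1):
        angle = 2 * math.pi * i * day_of_year / 365.25
        feats[f"FourierSin{i}"] = math.sin(angle)
        feats[f"FourierCos{i}"] = math.cos(angle)
    return feats

features = calendar_features(date(2015, 7, 31))
```

Low-order Fourier pairs give the model a smooth notion of "time of year" without one-hot encoding every week or month.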
The model is an XGBoost regressor. This keeps the project compact while still fitting the tabular structure of the problem well.
Current training parameters:

```yaml
n_estimators: 500
learning_rate: 0.05
max_depth: 10
subsample: 0.8
colsample_bytree: 0.8
objective: reg:squarederror
random_state: 42
```
Validation uses the last 42 days as a holdout window (2015-06-20 to 2015-07-31) plus 3 rolling backtest windows. The main evaluation metric is RMSPE.
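The 42-day holdout amounts to a simple date cutoff rather than a random split. The snippet below is a minimal stdlib sketch, assuming each row carries a `date` field; the repository's own split helpers in `src/training/` may differ in detail.

```python
from datetime import date, timedelta

def time_based_split(rows, holdout_days=42):
    """Split rows into train/validation by date: the last
    `holdout_days` days form the validation window, everything
    earlier forms the training set. No shuffling, so the model
    never sees the future during training."""
    last_day = max(r["date"] for r in rows)
    cutoff = last_day - timedelta(days=holdout_days - 1)
    train = [r for r in rows if r["date"] < cutoff]
    valid = [r for r in rows if r["date"] >= cutoff]
    return train, valid

# Synthetic daily rows covering 2015-01-01 .. 2015-07-31.
rows = [{"date": date(2015, 1, 1) + timedelta(days=i), "sales": i}
        for i in range(212)]
train, valid = time_based_split(rows)
```

With data ending on 2015-07-31, the cutoff lands on 2015-06-20, matching the holdout window stated above. Rolling backtests repeat the same split at several earlier cutoffs.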
## Results
Performance is evaluated with RMSPE, which is a useful relative error metric for store sales forecasting. The project uses a strict 42-day time-based holdout split instead of a random train/validation split, and also runs 3 rolling backtests to check whether gains remain stable across multiple forecast windows. Model performance is always compared against a simple historical baseline built from store and day-of-week averages.
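RMSPE is the root mean squared *percentage* error: the square root of the mean of `((y - ŷ) / y)²`, reported here on a 0-100 scale. A minimal reference implementation (zero-sales rows are skipped, consistent with the filtering described in the Method section):

```python
import math

def rmspe(y_true, y_pred):
    """Root mean squared percentage error, in percentage points.
    Rows with zero actual sales are skipped, since the percentage
    error is undefined there (the pipeline drops zero-sales
    records before training anyway)."""
    terms = [((t - p) / t) ** 2
             for t, p in zip(y_true, y_pred) if t != 0]
    return 100 * math.sqrt(sum(terms) / len(terms))
```

Because the error is relative, a store selling 500 a day and a store selling 20,000 a day contribute on the same scale, which is why it suits cross-store sales forecasting.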
### Holdout Results
| Method | Train RMSPE | Validation RMSPE | Notes |
|---|---|---|---|
| Baseline | - | 23.5604 | Store and day-of-week historical mean |
| Pre-tuning XGBoost | 18.3328 | 16.0662 | Initial configuration |
| Tuned XGBoost | 11.3646 | 12.8545 | Final selected model |
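The baseline row is described as a store and day-of-week historical mean. A hypothetical sketch of such a baseline follows; the exact construction in this repository may differ (for example in its fallback rule for unseen store/day pairs).

```python
from collections import defaultdict

def fit_baseline(rows):
    """Average historical sales per (store, day-of-week) pair,
    falling back to the global mean for unseen combinations."""
    sums, counts = defaultdict(float), defaultdict(int)
    for r in rows:
        key = (r["store"], r["day_of_week"])
        sums[key] += r["sales"]
        counts[key] += 1
    global_mean = sum(sums.values()) / sum(counts.values())
    means = {k: sums[k] / counts[k] for k in sums}
    return lambda store, dow: means.get((store, dow), global_mean)

predict = fit_baseline([
    {"store": 1, "day_of_week": 1, "sales": 5000},
    {"store": 1, "day_of_week": 1, "sales": 7000},
    {"store": 2, "day_of_week": 6, "sales": 3000},
])
```

A lookup table like this is a strong sanity check: any model worth deploying must beat it on the same time-based split.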
### Rolling Backtest Summary
| Metric | Value |
|---|---|
| Average tuned RMSPE | 13.2412 |
| Average baseline RMSPE | 22.9997 |
| Average improvement vs baseline | 9.7585 |
### What these results mean
- The tuned model improves over the simple baseline by about `10.71` RMSPE points on the final holdout window.
- Across `3` rolling windows, the tuned model remains consistently better than the baseline.
- The weakest backtest window is `2015-05-09` to `2015-06-19`, which suggests the model is more sensitive in some seasonal or promotion periods than others.
## What This Project Demonstrates
- tabular forecasting with explicit feature engineering
- time-aware evaluation through holdout and rolling backtests
- local MLflow tracking, saved model artifacts, and model metadata
- FastAPI serving, Dockerized local inference, CI checks, and offline drift checks
## Project Structure

```text
src/training/   data loading, feature engineering, split helpers, model training
src/serving/    FastAPI prediction service and inference logging
src/shared/     config, MLflow helper, and API schemas
scripts/        evaluation, drift check, and test runner
web/            minimal browser demo
metrics/        saved training and evaluation outputs
tests/          unit tests for pipeline, serving, and split logic
Dockerfile      minimal container image for inference
```
## How To Run
Install dependencies:
```bash
pip install -r requirements.txt
```
Train the model:
```bash
make train
```
Run evaluation:
```bash
make evaluate
```
This writes:

- `models/rossmann_model.json`
- `models/model_metadata.json`
- `metrics/training_summary.json`
- `metrics/model_evaluation.json`

If MLflow is installed, training and evaluation runs are also logged locally under `mlruns/`.
Run the API demo:
```bash
make run
```
Then open http://localhost:7860.
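Beyond the browser demo, the service can be called directly over HTTP. The snippet below is a hypothetical example: the `/predict` path and the payload field names are assumptions for illustration and may not match the actual request schema defined in `src/shared/`.

```python
import json
from urllib import request

# Hypothetical payload; the real schema lives in src/shared/.
payload = {
    "Store": 1,
    "DayOfWeek": 5,
    "Date": "2015-07-31",
    "Promo": 1,
    "StateHoliday": "0",
    "SchoolHoliday": 1,
}

def predict(body: dict, url: str = "http://localhost:7860/predict") -> dict:
    """POST the JSON payload to the prediction endpoint and
    return the decoded JSON response."""
    req = request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())

# predict(payload) would return the service's JSON response,
# e.g. a predicted daily sales value for the given store and date.
```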
Run tests:
```bash
make test
```
Build the Docker image for local inference:
```bash
make docker-build
make docker-run
```
Generate an offline drift report from logged inference requests:
```bash
make drift-check
```
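Offline drift checks on tabular features are commonly based on a population stability index (PSI) comparing the training distribution of a feature against the logged inference requests. The function below is an illustrative sketch of that idea, not the script `make drift-check` actually runs.

```python
import math

def psi(expected, actual, bins=10):
    """Population stability index between two numeric samples.
    Values are bucketed on the range of the expected sample;
    PSI < 0.1 is usually read as no meaningful drift, and
    PSI > 0.25 as significant drift."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0

    def frac(sample):
        counts = [0] * bins
        for x in sample:
            idx = min(int((x - lo) / width), bins - 1)
            counts[max(idx, 0)] += 1
        # Small floor avoids log(0) for empty buckets.
        return [max(c / len(sample), 1e-6) for c in counts]

    e, a = frac(expected), frac(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Running a check like this per feature over the logged requests yields a simple drift report without needing labels for the new data.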
## Limitations
- This is a compact forecasting and deployment demo, not a production system.
- Feature engineering is intentionally simple and mostly manual.
- The explanation output is a model contribution view, not a causal interpretation.
