Model Card: Walk-Forward AutoGluon Model (By Week)
Overview
This model performs walk-forward training and evaluation for predicting NFL wide receiver (WR) receiving yards on a week-by-week basis using AutoGluon's TabularPredictor. It leverages historical player embeddings, pregame contextual features, and weather/game metadata to iteratively train and test within each NFL season (2016–2025).
Model Details
- Model Type: Walk-forward regression (AutoGluon TabularPredictor)
- Framework: AutoGluon Tabular
- Author: Sebastian Andreu
- License: MIT
- Primary Use: Predicting receiving_yards for each wide receiver before a game is played.
Key Idea
Instead of training one global model, this script retrains weekly within each season, always using all prior weeks of that season as training data and the next week as the test set. Because every test week is strictly later than its training weeks, the evaluation reflects forward-looking performance and keeps future games out of the training data.
Data
Source Datasets
The model loads and concatenates season datasets from:
SebastianAndreu/24679_NFL_WR_Dataset_<YEAR>
for 2016 ≤ YEAR ≤ 2025.
Each dataset includes pregame features such as weather, team matchup, and Vegas lines.
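The loading step above can be sketched roughly as follows. This is a minimal illustration, not the author's script: `season_repo_ids` and `load_all_seasons` are hypothetical helper names, and the actual download is delegated to a caller-supplied `load_one` function because the card does not say how each dataset is fetched.

```python
from typing import Callable, Iterable

import pandas as pd

# Repo naming pattern taken from the card; <YEAR> becomes {year}.
REPO_TEMPLATE = "SebastianAndreu/24679_NFL_WR_Dataset_{year}"


def season_repo_ids(first_year: int = 2016, last_year: int = 2025) -> list[str]:
    """Return one dataset repo ID per season, inclusive on both ends."""
    return [REPO_TEMPLATE.format(year=y) for y in range(first_year, last_year + 1)]


def load_all_seasons(load_one: Callable[[str], pd.DataFrame],
                     repo_ids: Iterable[str]) -> pd.DataFrame:
    """Load each season with the supplied loader and stack the rows.

    `load_one` is an assumption: any function mapping a repo ID to a DataFrame.
    """
    frames = [load_one(repo_id) for repo_id in repo_ids]
    return pd.concat(frames, ignore_index=True)
```

One possible `load_one` would wrap `datasets.load_dataset(repo_id, split="train").to_pandas()`; the split name there is an assumption, since the card does not list the available splits.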
Features Used
Pregame input variables:
- defteam
- posteam
- surface
- is_dome
- is_rain
- is_snow
- is_clear
- temp_f
- humidity_pct
- wind_mph
- home_team
- away_team
- pregame_spread
- pregame_total
- passer_player_id
- receiver_player_id
The dataset is also merged with player_historical_embeddings.csv, which provides dense numerical representations of player histories.
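A merge of this kind might look like the sketch below. The join column is an assumption: the card does not name it, so `receiver_player_id` is used here only because it appears in the feature list. The sketch also assumes one embedding row per player; `attach_embeddings` is a hypothetical helper name.

```python
import pandas as pd


def attach_embeddings(games: pd.DataFrame, embeddings: pd.DataFrame,
                      key: str = "receiver_player_id") -> pd.DataFrame:
    """Left-join embedding columns onto the game rows.

    ASSUMPTION: `key` is the join column and each player has a single
    embedding row; neither detail is stated in the card.
    """
    # validate="many_to_one" raises if a player has more than one embedding
    # row, which would otherwise silently duplicate game rows.
    return games.merge(embeddings, on=key, how="left", validate="many_to_one")
```

A left join keeps every game row, so players without an embedding end up with missing values in the embedding columns rather than being dropped.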
Target Variable
receiving_yards: the number of receiving yards gained by the WR in the upcoming game.
Training Procedure
Walk-Forward Logic
For each season:
- Extract the total number of weeks in that season.
- For each week W starting from 2:
  - Train on data from weeks < W.
  - Test on data from week W.
  - Train a new AutoGluon model from scratch (10-minute time limit).
- Collect predictions and evaluation metrics.
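The split logic above can be sketched as a small generator. This is an illustration under stated assumptions, not the author's code: the input is assumed to have `season` and `week` columns (names borrowed from the output file described in this card), and `walk_forward_splits` is a hypothetical helper name.

```python
from typing import Iterator, Tuple

import pandas as pd


def walk_forward_splits(
    df: pd.DataFrame,
) -> Iterator[Tuple[int, int, pd.DataFrame, pd.DataFrame]]:
    """Yield (season, week, train, test) with train strictly earlier than test."""
    for season, season_df in df.groupby("season", sort=True):
        weeks = sorted(season_df["week"].unique())
        # Skip the first available week: it has no prior weeks to train on.
        # When weeks are numbered from 1, this is the same as starting at W = 2.
        for week in weeks[1:]:
            train = season_df[season_df["week"] < week]   # all prior weeks
            test = season_df[season_df["week"] == week]   # the held-out week
            yield int(season), int(week), train, test
```

Splitting inside the per-season group keeps training data from crossing season boundaries, matching the "within each season" behavior described above.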
AutoGluon Configuration
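    from autogluon.tabular import TabularPredictor

    # Fit a fresh predictor for the current test week on all prior weeks.
    predictor = TabularPredictor(
        label="receiving_yards",
        path=model_dir,      # output directory for this week's model
        verbosity=0,         # minimal logging
    ).fit(
        train_data=train[features + [target]],
        time_limit=600,      # seconds
        presets="medium_quality_faster_train",
    )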
Time Limit: 600 seconds per week
Preset: medium_quality_faster_train
Verbosity: 0 (minimal logging)
Evaluation
Metric
The model computes Mean Absolute Error (MAE) over all weekly predictions.
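For reference, the standard definition of MAE over $N$ predictions is given below. The symbols are notation introduced here, not taken from the script: $y_i$ is the true receiving yards and $\hat{y}_i$ the predicted value for the $i$-th player-game.

```latex
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| y_i - \hat{y}_i \right|
```

For example, predicting 40 yards for a receiver who gains 52 contributes an absolute error of 12 to the sum.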
Output
After all walk-forward runs:
walkforward_predictions.csv: contains true vs. predicted values per week. Columns:
- season
- week
- true
- pred
- error = |true - pred|
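Given a file with these columns, the overall MAE and a per-week breakdown could be recomputed along the following lines. This is a minimal sketch using only the standard library; `summarize_errors` is a hypothetical helper name, and weeks are pooled across seasons here purely for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import fmean


def summarize_errors(csv_file):
    """Return (overall MAE, {week: MAE}) from the `true` and `pred` columns."""
    per_week = defaultdict(list)
    for row in csv.DictReader(csv_file):
        error = abs(float(row["true"]) - float(row["pred"]))
        per_week[int(row["week"])].append(error)
    # Overall MAE averages every prediction, not the weekly means.
    all_errors = [e for errs in per_week.values() for e in errs]
    overall = fmean(all_errors)
    by_week = {week: fmean(errs) for week, errs in sorted(per_week.items())}
    return overall, by_week


# Tiny in-memory example with made-up rows.
sample = io.StringIO(
    "season,week,true,pred,error\n"
    "2016,2,50,40,10\n"
    "2016,2,30,36,6\n"
    "2016,3,80,76,4\n"
)
overall, by_week = summarize_errors(sample)
```

On a real run, the same function could be called with `open("walkforward_predictions.csv", newline="")`. The per-week view is one way to inspect how error differs between early and late weeks.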
Example final output:
    Walk-forward complete!
    Total predictions: 3,200
    Mean Absolute Error: 12.47
    Saved: walkforward_predictions.csv
Artifacts
| Artifact | Description |
|---|---|
| player_historical_embeddings.csv | Precomputed player embeddings |
| autogluon_walkforward/ | Directory of trained weekly models |
| walkforward_predictions.csv | Aggregated results of predictions |
| SebastianAndreu/24679_NFL_WR_Dataset_<YEAR> | Input datasets (2016–2025) |
Intended Use
- Goal: Predict individual WR performance before each NFL game.
- Primary Users: Sports analytics researchers, fantasy football data scientists, and betting modelers.
- Not intended for: Real-time in-game prediction or commercial wagering advice.
Limitations
- Training each week from scratch is computationally expensive.
- Does not include injury or roster change data.
- Embeddings rely on prior model quality (player_historical_embeddings.csv).
- Accuracy varies between early and late season because early weeks have few prior weeks of training data.
Future Improvements
- Incorporate transfer learning between seasons.
- Add injury & snap count features.
- Experiment with AutoGluon ensemble distillation to reduce retraining cost.
- Combine with Model 1 embeddings pipeline for joint optimization.