🧠 Model Card: Walk-Forward AutoGluon Model (By Week)

📘 Overview

This model performs walk-forward training and evaluation for predicting NFL wide receiver (WR) receiving yards on a week-by-week basis using AutoGluon’s TabularPredictor. It leverages historical player embeddings, pregame contextual features, and weather/game metadata to iteratively train and test within each NFL season (2016–2025).

🧩 Model Details

Model Type: Walk-forward regression (AutoGluon TabularPredictor)
Framework: AutoGluon Tabular
Author: Sebastian Andreu
License: MIT
Primary Use: Predicting receiving_yards for each wide receiver before a game is played.

Key Idea

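Instead of training one global model, this script retrains every week within each season, always using all prior weeks of that season as training data and the next week as the test set. Because every model is evaluated only on games played after the ones it was trained on, the reported error reflects forward-looking performance and avoids temporal leakage from future weeks.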

⚙️ Data

Source Datasets

The model loads and concatenates season datasets from:

SebastianAndreu/24679_NFL_WR_Dataset_<YEAR>

for 2016 ≤ YEAR ≤ 2025.
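The naming pattern above expands to ten repository IDs, one per season. A minimal sketch of that expansion follows. The function name `season_repo_ids` is hypothetical, and the download step itself is not shown in this card, so it is left out here.

```python
def season_repo_ids(first_year=2016, last_year=2025):
    """Return one dataset repo ID per season, inclusive of both end years."""
    template = "SebastianAndreu/24679_NFL_WR_Dataset_{year}"
    return [template.format(year=year) for year in range(first_year, last_year + 1)]


repo_ids = season_repo_ids()
print(len(repo_ids))  # number of seasons covered
print(repo_ids[0])    # first repo ID in the range
```

Each ID would then be loaded and the resulting tables concatenated into a single frame before the walk-forward loop runs.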

Each dataset includes pregame features such as weather, team matchup, and Vegas lines.

Features Used

Pregame input variables:

defteam
posteam
surface
is_dome
is_rain
is_snow
is_clear
temp_f
humidity_pct
wind_mph
home_team
away_team
pregame_spread
pregame_total
passer_player_id
receiver_player_id

The dataset is also merged with player_historical_embeddings.csv, which provides dense numerical representations of player histories.
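The card does not state which column the merge uses. The sketch below assumes a left join on `receiver_player_id` and uses invented embedding column names (`emb_0`, `emb_1`). It is written with plain dictionaries to stay self-contained; the same logic maps onto a left merge in a dataframe library.

```python
def attach_embeddings(rows, embeddings, key="receiver_player_id"):
    """Left-join embedding columns onto game rows by player ID.

    rows:       list of dicts, one per player-game
    embeddings: dict mapping player ID -> dict of embedding columns
    Players without an embedding keep their row and gain no extra columns.
    """
    merged = []
    for row in rows:
        extra = embeddings.get(row[key], {})
        merged.append({**row, **extra})
    return merged


# Toy data: the IDs and the emb_0 / emb_1 columns are invented for illustration.
games = [
    {"receiver_player_id": "WR_A", "week": 3},
    {"receiver_player_id": "WR_B", "week": 3},
]
player_embeddings = {"WR_A": {"emb_0": 0.12, "emb_1": -0.40}}

merged_rows = attach_embeddings(games, player_embeddings)
```

A left join keeps receivers who have no embedding yet (rookies, for example), which matters because dropping them would silently shrink the test set.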

Target Variable

receiving_yards — the number of receiving yards gained by the WR in the upcoming game.

🧮 Training Procedure

Walk-Forward Logic

For each season:

Extract the total number of weeks in that season.
For each week W starting from 2 (week 1 has no prior weeks to train on):
- Train a new AutoGluon model from scratch (10-minute time limit) on data from weeks < W.
- Test on data from week W.
Collect predictions and evaluation metrics.
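The splitting logic above can be sketched independently of AutoGluon. The generator below is an illustration, not the author's script: the name `walk_forward_splits` and the list-of-dicts row format are assumptions. It skips the first observed week of each season, which is equivalent to starting from week 2 when weeks are numbered from 1.

```python
from collections import defaultdict


def walk_forward_splits(rows):
    """Yield (season, week, train_rows, test_rows) for every week after the first.

    rows: list of dicts that each carry integer "season" and "week" keys.
    Training data is restricted to earlier weeks of the same season.
    """
    by_season = defaultdict(list)
    for row in rows:
        by_season[row["season"]].append(row)

    for season in sorted(by_season):
        season_rows = by_season[season]
        weeks = sorted({row["week"] for row in season_rows})
        for week in weeks[1:]:  # the first week has nothing to train on
            train = [r for r in season_rows if r["week"] < week]
            test = [r for r in season_rows if r["week"] == week]
            yield season, week, train, test


# Toy rows, invented for illustration only.
toy = [{"season": 2024, "week": w, "receiving_yards": 10 * w} for w in (1, 2, 3)]
splits = list(walk_forward_splits(toy))
for season, week, train, test in splits:
    print(season, week, len(train), len(test))
```

In the real loop, each `train` set would be passed to a freshly constructed predictor and each `test` set scored with it.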

AutoGluon Configuration

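from autogluon.tabular import TabularPredictor

# Fit a fresh predictor for the current week; verbosity=0 keeps logging minimal.
predictor = TabularPredictor(
    label="receiving_yards",
    path=model_dir,
    verbosity=0
).fit(
    train_data=train[features + [target]],  # pregame features plus the target column
    time_limit=600,  # seconds (10 minutes)
    presets="medium_quality_faster_train"
)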

Time Limit: 600 seconds per week
Preset: medium_quality_faster_train
Verbosity: 0 (minimal logging)

📊 Evaluation

Metric

The model computes Mean Absolute Error (MAE) over all weekly predictions.
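For reference, MAE is the average of the absolute differences between true and predicted yards. A minimal sketch follows; the function name and the toy numbers are illustrative and are not results from this model.

```python
def mean_absolute_error(true_values, predicted_values):
    """Average of |true - pred| over paired observations."""
    if len(true_values) != len(predicted_values):
        raise ValueError("true and predicted sequences must have the same length")
    if not true_values:
        raise ValueError("at least one observation is required")
    total = sum(abs(t - p) for t, p in zip(true_values, predicted_values))
    return total / len(true_values)


# Toy values: absolute errors are 10, 10 and 15 yards.
mae = mean_absolute_error([50, 80, 20], [60, 70, 35])
print(round(mae, 2))
```

Because MAE is expressed in yards, it can be read directly as the typical size of a miss per player-game.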

Output

After all walk-forward runs:

walkforward_predictions.csv — contains true vs. predicted values per week.
Columns:
- season
- week
- true
- pred
- error = |true - pred|
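A sketch of how rows with these columns could be written, assuming predictions are collected as `(season, week, true, pred)` tuples. The helper name `write_predictions` is hypothetical, and an in-memory buffer stands in for the real file so the example has no side effects.

```python
import csv
import io

COLUMNS = ["season", "week", "true", "pred", "error"]


def write_predictions(records, handle):
    """Write (season, week, true, pred) tuples as CSV, adding the absolute error."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for season, week, true, pred in records:
        writer.writerow({
            "season": season,
            "week": week,
            "true": true,
            "pred": pred,
            "error": abs(true - pred),
        })


# The buffer stands in for open("walkforward_predictions.csv", "w", newline="").
buffer = io.StringIO()
write_predictions([(2024, 2, 64.0, 51.5)], buffer)  # toy record, not a real result
csv_text = buffer.getvalue()
print(csv_text)
```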

Example final output:

✅ Walk-forward complete!
Total predictions: 3,200
Mean Absolute Error: 12.47
📥 Saved: walkforward_predictions.csv

📢 Artifacts

Artifact	Description
`player_historical_embeddings.csv`	Precomputed player embeddings
`autogluon_walkforward/`	Directory of trained weekly models
`walkforward_predictions.csv`	Aggregated results of predictions
`SebastianAndreu/24679_NFL_WR_Dataset_<YEAR>`	Input datasets (2016–2025)

🧠 Intended Use

Goal: Predict individual WR performance before each NFL game.
Primary Users: Sports analytics researchers, fantasy football data scientists, and betting modelers.
Not intended for: Real-time in-game prediction or commercial wagering advice.

⚠️ Limitations

Training each week from scratch is computationally expensive.
Does not include injury or roster change data.
Embeddings rely on prior model quality (player_historical_embeddings.csv).
Accuracy varies between early and late season: models for early weeks are trained on only a few prior weeks of data from the same season.
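One way to inspect the early-season versus late-season difference is to break MAE down by week using the rows of walkforward_predictions.csv. The sketch below assumes each row is a dict with `week`, `true`, and `pred` keys; the function name and the toy values are invented and say nothing about the real model.

```python
from collections import defaultdict


def mae_by_week(prediction_rows):
    """Return {week: mean absolute error} with weeks in ascending order."""
    errors = defaultdict(list)
    for row in prediction_rows:
        errors[row["week"]].append(abs(row["true"] - row["pred"]))
    return {week: sum(vals) / len(vals) for week, vals in sorted(errors.items())}


# Toy rows for illustration only.
toy_rows = [
    {"week": 2, "true": 40.0, "pred": 60.0},
    {"week": 2, "true": 90.0, "pred": 70.0},
    {"week": 10, "true": 55.0, "pred": 50.0},
]
weekly_mae = mae_by_week(toy_rows)
print(weekly_mae)
```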

🧩 Future Improvements

Incorporate transfer learning between seasons.
Add injury & snap count features.
Experiment with AutoGluon ensemble distillation to reduce retraining cost.
Combine with Model 1 embeddings pipeline for joint optimization.
