# Ringside Analytics: Match Winner Predictor
Predicts the probability that a given pro wrestler wins a match, from pre-match state about both wrestlers and the match context.
XGBoost is the primary model (`xgboost.joblib`). A logistic-regression baseline is included for reference (`logistic_regression.joblib`). Both share the `StandardScaler` in `scaler.joblib` and the 35 features listed in `feature_columns.json`.
## ⚠️ Important framing: the kayfabe problem
Pro wrestling outcomes are scripted. The training label records the booked outcome (who the writers decided wins), not athletic ability. This model therefore learns booking patterns, not skill. It is not, and cannot be, useful for betting.
The companion paper at https://tedrubin80.github.io/wrastlingfirst/paper.html walks through what this means for ML practice, particularly why the validation-to-test AUC gap of 25 points (0.952 → 0.718) is structural rather than a methodological error.
## Performance (test set: the honest numbers)
| Model | Accuracy | AUC-ROC | Log loss |
|---|---|---|---|
| XGBoost | 0.662 | 0.718 | 0.636 |
| Logistic Regression | 0.643 | 0.698 | 0.646 |
Coin-flip baseline: 0.500 AUC. "Favored wrestler always wins" baseline: ~0.62 AUC.
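The "favored wrestler always wins" baseline can be sanity-checked in miniature. The snippet below uses purely synthetic labels (nothing here comes from the actual dataset): a binary flag that agrees with the booked outcome about 62% of the time scores roughly 0.62 AUC, because a binary predictor's AUC is the mean of its true-positive and true-negative rates.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic stand-in: 1,000 match labels and a binary "favored wrestler"
# flag that agrees with the booked outcome ~62% of the time.
y = rng.integers(0, 2, 1000)
favored = np.where(rng.random(1000) < 0.62, y, 1 - y)

auc = roc_auc_score(y, favored)  # lands near the ~0.62 quoted above
```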
## Top feature importances (XGBoost)
- `current_win_streak` (0.31)
- `current_loss_streak` (0.22)
- `days_since_last_match` (0.13)
- `h2h_win_rate` (0.09)
- `is_royal_rumble` (0.03)
Booking momentum (the streak features) carries over half the model's signal. Removing the streak family in ablation drops test AUC to 0.541, barely above coin-flip.
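The ablation logic is simple to illustrate on synthetic data; this is a sketch of the technique, not the project's ablation code, and every array below is made up. Two columns play the role of the streak family, and retraining without them collapses test AUC toward chance:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 5))

# Columns 0-1 stand in for the streak family and carry most of the signal.
logits = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 0.2 * X[:, 2]
y = (logits + rng.normal(size=n) > 0).astype(int)

X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

full = LogisticRegression().fit(X_train, y_train)
ablated = LogisticRegression().fit(X_train[:, 2:], y_train)  # drop "streaks"

auc_full = roc_auc_score(y_test, full.predict_proba(X_test)[:, 1])
auc_ablated = roc_auc_score(y_test, ablated.predict_proba(X_test[:, 2:])[:, 1])
```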
## Quickstart
```python
import joblib
from huggingface_hub import hf_hub_download

# Download artifacts
xgb_path = hf_hub_download(repo_id="datamatters24/ringside-match-winner", filename="xgboost.joblib")
scaler_path = hf_hub_download(repo_id="datamatters24/ringside-match-winner", filename="scaler.joblib")

xgb = joblib.load(xgb_path)
scaler = joblib.load(scaler_path)

# X must be a DataFrame with exactly the 35 feature columns from feature_columns.json.
# To reproduce the feature engineering, see:
#   https://github.com/tedrubin80/wrastlingfirst/blob/main/ml/features.py
# or use the prebuilt feature_matrix.parquet from the dataset:
#   https://huggingface.co/datasets/datamatters24/ringside-analytics
X_scaled = scaler.transform(X)
proba = xgb.predict_proba(X_scaled)[:, 1]  # P(win)
```
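One common pitfall here is column order: the scaler and the model both assume the training-time feature ordering. A minimal defensive check is sketched below; the three feature names are stand-ins for the full 35-entry list you would load from `feature_columns.json`.

```python
import pandas as pd

# Stand-in for json.load(open("feature_columns.json")) -- only 3 of 35 shown.
feature_columns = ["current_win_streak", "current_loss_streak", "days_since_last_match"]

# A caller-built frame whose columns arrived in the wrong order.
X = pd.DataFrame(
    [[14, 3, 0]],
    columns=["days_since_last_match", "current_win_streak", "current_loss_streak"],
)

missing = set(feature_columns) - set(X.columns)
if missing:
    raise ValueError(f"missing features: {sorted(missing)}")

X = X[feature_columns]  # reorder to the training-time order before scaling
```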
## How to reproduce predictions exactly
The dataset bundles `feature_matrix.parquet`, the exact 35-feature snapshot used at training time. Loading that file and running the model gives predictions identical to the served version.
```python
import pandas as pd
from huggingface_hub import hf_hub_download

fm_path = hf_hub_download(
    repo_id="datamatters24/ringside-analytics",
    repo_type="dataset",
    filename="feature_matrix.parquet",
)
fm = pd.read_parquet(fm_path)

# (Optional) honest temporal split
fm["event_date"] = pd.to_datetime(fm["event_date"])
test = fm[fm["event_date"] >= "2025-01-01"]
```
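With predictions in hand, checking them against the table above is standard scikit-learn. The probabilities below are synthetic stand-ins for `xgb.predict_proba(...)[:, 1]` on the held-out rows, so only the metric calls, not the numbers, carry over:

```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss, roc_auc_score

rng = np.random.default_rng(1)

y_true = rng.integers(0, 2, 500)
# Stand-in for the model's predicted P(win) on the test rows.
proba = np.clip(0.3 * y_true + 0.7 * rng.random(500), 0.01, 0.99)

accuracy = accuracy_score(y_true, proba >= 0.5)
auc = roc_auc_score(y_true, proba)
loss = log_loss(y_true, proba)
```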
## Limitations
- Selection bias toward televised matches. House-show / indie data is sparse.
- Kayfabe is not athletic skill. The model learns booking, not ability.
- Era drift. Booking philosophy has shifted over 40+ years; the model averages across eras.
- Gender imbalance. Women's-division sample is smaller; expect wider error bars.
- Single predictions are weakly informative. Test AUC 0.718 is a meaningful lift over a coin flip, but nowhere near betting-grade reliability (which kayfabe rules out anyway).
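On that last point, a reliability check is cheap once you have probabilities. The sketch below runs scikit-learn's `calibration_curve` on synthetic probabilities that are perfectly calibrated by construction, so the per-bin gap is near zero; the real model's gaps would be larger.

```python
import numpy as np
from sklearn.calibration import calibration_curve

rng = np.random.default_rng(7)

proba = rng.random(2000)                    # stand-in for the model's P(win)
y = (rng.random(2000) < proba).astype(int)  # calibrated by construction

# frac_pos: observed win rate per bin; mean_pred: mean predicted P(win) per bin.
frac_pos, mean_pred = calibration_curve(y, proba, n_bins=10)
gap = np.max(np.abs(frac_pos - mean_pred))  # near 0 here; larger = miscalibrated
```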
## Companion resources
- Dataset (HF): datamatters24/ringside-analytics
- Dataset (Kaggle): theodorerubin/ringside-wrestling-archive
- Kaggle Model: theodorerubin/ringside-analytics-match-winner (mirror)
- Paper / portfolio: tedrubin80.github.io/wrastlingfirst
- Source code: github.com/tedrubin80/wrastlingfirst
## Citation
```bibtex
@misc{rubin2026ringside,
  author = {Rubin, Theodore},
  title  = {Ringside Analytics: Match Winner Predictor},
  year   = {2026},
  url    = {https://huggingface.co/datamatters24/ringside-match-winner}
}
```
## License
Apache 2.0 (model weights). Training data: CC0 (see linked dataset).