
Video: https://youtu.be/wvdXkOgies4

Gladiator Winning Model

This repository contains a trained Gradient Boosting classifier used to predict gladiator fight outcomes.

Dataset

The full dataset is available on Kaggle:
https://www.kaggle.com/datasets/anthonytherrien/gladiator-combat-records-and-profiles-dataset

A representative sample may be included in this repository for demonstration purposes.

Model Performance

  • F1-score: 0.910
  • Accuracy: ~90%
  • ROC-AUC: 0.970

The Gradient Boosting model performed the best out of all tested classifiers.
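The metrics above can be computed with scikit-learn's standard scoring functions. A minimal sketch with hypothetical labels and predictions (the real held-out test set is not included here):

```python
# Sketch: computing F1, accuracy, and ROC-AUC with scikit-learn.
# y_test, y_pred, and y_prob are hypothetical placeholders standing in
# for the real held-out labels, predictions, and predicted probabilities.
from sklearn.metrics import f1_score, accuracy_score, roc_auc_score

y_test = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]  # hypothetical true labels
y_pred = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical predictions
y_prob = [0.9, 0.2, 0.8, 0.7, 0.3, 0.95, 0.6, 0.85, 0.1, 0.75]  # P(Win)

print("F1:      ", f1_score(y_test, y_pred))
print("Accuracy:", accuracy_score(y_test, y_pred))
print("ROC-AUC: ", roc_auc_score(y_test, y_prob))
```

ROC-AUC uses the predicted probabilities (`predict_proba`), not the hard class labels, which is why it is reported separately.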

Model Comparison

Several models were tested, including Logistic Regression, Random Forest, and Gradient Boosting. In both the regression-style and classification tasks, Gradient Boosting consistently performed the best, showing higher accuracy and a better balance between precision and recall. This makes it the most reliable model for predicting gladiator outcomes.
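The comparison described above can be sketched with cross-validation on all three model families. This uses a synthetic dataset, since the exact preprocessing pipeline and hyperparameters used for the real Kaggle data are not specified here:

```python
# Sketch: comparing Logistic Regression, Random Forest, and Gradient
# Boosting by cross-validated F1, on synthetic stand-in data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="f1")
    print(f"{name}: mean F1 = {scores.mean():.3f}")
```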

Exploratory Data Analysis (EDA)

Before modeling, the dataset was explored to understand the distribution of gladiator attributes and fighting outcomes. Outliers were handled carefully to ensure they did not distort model training.

Key Findings

  • The dataset contains detailed gladiator profiles, including age, height, weight, fighting style, armor type, victory count, and more.
  • Many features show clear relationships with the final battle outcome (Win/Loss), most notably the numerical features Battle Experience and Public Favor.
  • Categorical features such as Gladiator Type, Weapon Type, Fighting Style, Crowd Appeal Techniques and Previous Occupation showed meaningful differences between winners and non-winners.
  • Correlation analysis indicated that experience-based features (e.g., previous wins) have stronger predictive power than purely physical attributes.
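The correlation analysis in the last bullet can be reproduced with a simple `DataFrame.corr()` call. This sketch uses synthetic data and hypothetical column names chosen to mirror the findings above (experience-based features correlating with Wins more strongly than physical attributes):

```python
# Sketch: correlation of features with the Wins column, on synthetic
# data constructed so experience-based features dominate.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 200
battle_experience = rng.normal(50, 10, n)
df = pd.DataFrame({
    "Battle Experience": battle_experience,
    "Public Favor": battle_experience * 0.5 + rng.normal(0, 5, n),
    "Height": rng.normal(180, 7, n),  # purely physical attribute
    "Wins": battle_experience * 0.8 + rng.normal(0, 3, n),
})
corr = df.corr()["Wins"].sort_values(ascending=False)
print(corr)
```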

Heatmap: strong correlation between Wins and both Public Favor and Battle Experience.

Figure: categorical features strongly correlated with the target variable Wins.
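The kind of categorical comparison described above can be checked with a simple win-rate `groupby`. A minimal sketch with hypothetical rows (the real dataset's values and encoding may differ):

```python
# Sketch: comparing win rates across a categorical feature.
import pandas as pd

df = pd.DataFrame({
    "Gladiator Type": ["Murmillo", "Retiarius", "Murmillo",
                       "Thraex", "Retiarius", "Thraex"],
    "Win": [1, 0, 1, 0, 1, 0],
})
# Mean of a 0/1 column per category = win rate per category
win_rates = df.groupby("Gladiator Type")["Win"].mean()
print(win_rates)
```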

Data Cleaning Steps

  • Missing values were imputed or removed depending on relevance.
  • Boolean and categorical features were encoded into numeric form.
  • New engineered features such as BMI, interaction terms, and readiness scores were added to improve predictive performance.
  • The target variable was converted into a binary class using a median split (Win vs. Not Win).
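The steps above can be sketched in pandas. Column names, the BMI formula inputs, and the small example frame are assumptions for illustration; the real pipeline and engineered features are not included in this repository:

```python
# Sketch of the cleaning steps: imputation, encoding, feature
# engineering, and the median split on the target.
import pandas as pd

df = pd.DataFrame({
    "Height": [180, None, 175, 190],   # cm
    "Weight": [80, 85, None, 95],      # kg
    "Armor Type": ["Heavy", "Light", "Heavy", "None"],
    "Wins": [12, 3, 7, 20],
})

# Impute missing numeric values with the column median
for col in ["Height", "Weight"]:
    df[col] = df[col].fillna(df[col].median())

# Encode categorical features into numeric form
df = pd.get_dummies(df, columns=["Armor Type"])

# Engineered feature: BMI from height (cm) and weight (kg)
df["BMI"] = df["Weight"] / (df["Height"] / 100) ** 2

# Binary target via a median split (Win vs. Not Win)
df["Win"] = (df["Wins"] > df["Wins"].median()).astype(int)
print(df[["BMI", "Win"]])
```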

Feature Insights

During EDA, battle experience emerged as the strongest predictor, showing a very high correlation (~0.95) with the target outcome. However, because this feature was essentially a direct indicator of the final result, it created data leakage. To ensure a fair and realistic model, this feature was removed from training, which made the remaining predictors more meaningful and prevented overly optimistic performance.
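Removing the leaky feature is a one-line drop before training. A minimal sketch with a hypothetical feature frame:

```python
# Sketch: dropping the leaky feature from the training matrix.
import pandas as pd

X = pd.DataFrame({
    "Battle Experience": [40, 10, 30],  # ~0.95 correlated with the outcome -> leakage
    "Public Favor": [0.8, 0.2, 0.6],
    "Height": [180, 175, 185],
})
X_clean = X.drop(columns=["Battle Experience"])
print(X_clean.columns.tolist())
```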

This EDA process provided important insights that shaped feature engineering and model selection.

Usage

import pickle

# Load the trained Gradient Boosting classifier
with open("gladiator_gradient_boosting_classifier.pkl", "rb") as f:
    model = pickle.load(f)
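A runnable end-to-end sketch of the pickle round-trip plus a prediction. A small model is trained on synthetic data here purely so the example is self-contained; with the real `.pkl` file from this repository, only the load-and-predict part is needed:

```python
# Sketch: save, reload, and query a Gradient Boosting classifier.
# The synthetic data stands in for the real feature matrix, whose exact
# columns depend on the training pipeline.
import pickle
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=100, n_features=5, random_state=0)
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

with open("gladiator_gradient_boosting_classifier.pkl", "wb") as f:
    pickle.dump(clf, f)

with open("gladiator_gradient_boosting_classifier.pkl", "rb") as f:
    model = pickle.load(f)

pred = model.predict(X[:1])  # 1 = Win, 0 = Not Win
print(pred)
```

Note that `pickle.load` should only be used on files from a trusted source.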