# Gladiator Winning Model

Video: https://youtu.be/wvdXkOgies4

This repository contains a trained **Gradient Boosting classifier** used to predict gladiator fight outcomes.
## Dataset

The full dataset is available on Kaggle:
https://www.kaggle.com/datasets/anthonytherrien/gladiator-combat-records-and-profiles-dataset

A representative sample may be included in this repository for demonstration purposes.
## Model Performance

- **F1-score:** 0.910
- **Accuracy:** ~90%
- **ROC-AUC:** 0.970

The Gradient Boosting model performed the best out of all tested classifiers.
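For reference, the reported scores can be related back to confusion-matrix counts. This is an illustrative calculation only, using hypothetical counts rather than the actual test set:

```python
# Hypothetical confusion-matrix counts (not the actual test results)
tp, fp, fn, tn = 90, 8, 10, 92

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy: {accuracy:.3f}, F1: {f1:.3f}")  # Accuracy: 0.910, F1: 0.909
```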
## Model Comparison

Several models were tested, including Logistic Regression, Random Forest, and Gradient Boosting.
In both the regression-style and classification tasks, **Gradient Boosting consistently performed the best**, showing higher accuracy and a better balance between precision and recall.
This makes it the most reliable model for predicting gladiator outcomes.
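A minimal sketch of such a comparison using scikit-learn is shown below. It uses a synthetic stand-in dataset, not the actual Kaggle data, so the scores are illustrative only:

```python
# Sketch: compare the three classifiers via cross-validated F1
# on a synthetic dataset standing in for the real gladiator data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=10, random_state=42)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(random_state=42),
    "Gradient Boosting": GradientBoostingClassifier(random_state=42),
}

results = {}
for name, clf in models.items():
    # 5-fold cross-validated F1 balances precision and recall
    scores = cross_val_score(clf, X, y, cv=5, scoring="f1")
    results[name] = scores.mean()
    print(f"{name}: mean F1 = {results[name]:.3f}")
```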
## Exploratory Data Analysis (EDA)

Before modeling, the dataset was explored to understand the distribution of gladiator attributes and fighting outcomes.
Outliers were handled carefully to ensure they did not distort model training.
### Key Findings

- The dataset contains **detailed gladiator profiles**, including age, height, weight, fighting style, armor type, victory count, and more.
- Several features show **clear relationships with the final battle outcome** (Win/Loss), most notably the numerical features Battle Experience and Public Favor.
- Categorical features such as **Gladiator Type, Weapon Type, Fighting Style, Crowd Appeal Techniques, and Previous Occupation** showed meaningful differences between winners and non-winners.
- Correlation analysis indicated that **experience-based features** (e.g., previous wins) have stronger predictive power than purely physical attributes.
Strong correlation between Wins and Public Favor & Battle Experience:

![Correlation between numeric features and wins](images/wins_numeric_correlation.png)

Categorical features with strong correlation with the target variable Wins:

![Correlation between categorical features and wins](images/wins_categorical_correlation.png)
### Data Cleaning Steps

- Missing values were imputed or removed depending on relevance.
- Boolean and categorical features were encoded into numeric form.
- New engineered features such as **BMI**, **interaction terms**, and **readiness scores** were added to improve predictive performance.
- The target variable was converted into a binary class using a median split (`Win` vs. `Not Win`).
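The steps above can be sketched as follows. Column names and units (Height in cm, Weight in kg) are assumptions for illustration, not the repo's exact code:

```python
# Sketch of the cleaning / feature-engineering pipeline
# (hypothetical column names and toy data).
import pandas as pd

df = pd.DataFrame({
    "Height": [180, 170, 175, 190],   # assumed cm
    "Weight": [85, 70, 78, 95],       # assumed kg
    "Wins": [20, 3, 9, 14],
    "Armor Type": ["Heavy", "Light", "Medium", "Heavy"],
})

# Engineered feature: BMI = weight (kg) / height (m)^2
df["BMI"] = df["Weight"] / (df["Height"] / 100) ** 2

# One-hot encode the categorical feature into numeric columns
df = pd.get_dummies(df, columns=["Armor Type"])

# Binary target via a median split on Wins (Win vs. Not Win)
df["Win"] = (df["Wins"] > df["Wins"].median()).astype(int)
```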
### Feature Insights

During EDA, battle experience emerged as the strongest predictor, showing a very high correlation (~0.95) with the target outcome.
However, because this feature was essentially a direct indicator of the final result, it created data leakage.
To ensure a fair and realistic model, this feature was removed from training, which made the remaining predictors more meaningful and prevented overly optimistic performance.

This EDA process provided important insights that shaped feature engineering and model selection.
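Dropping the leaky feature before training amounts to something like the sketch below (column names assumed from the description above):

```python
# Sketch: remove the near-deterministic predictor to prevent leakage
# (hypothetical column names and toy data).
import pandas as pd

df = pd.DataFrame({
    "Battle Experience": [10, 2, 22],
    "Public Favor": [0.8, 0.2, 0.95],
    "Win": [1, 0, 1],
})

# Exclude the leaky feature and the target from the training matrix
X = df.drop(columns=["Battle Experience", "Win"])
y = df["Win"]
```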
## Usage

```python
import pickle

# Load the trained Gradient Boosting classifier from disk
with open("gladiator_gradient_boosting_classifier.pkl", "rb") as f:
    model = pickle.load(f)

# The loaded model can then be used for prediction, e.g. model.predict(X)
```