Karish1 commited on
Commit
2fafce8
·
verified ·
1 Parent(s): d682544

Create README.md

  1. README.md +62 -0
README.md ADDED
@@ -0,0 +1,62 @@
1
+ # Gladiator Winning Model
2
+
3
+ This repository contains a trained **Gradient Boosting classifier** used to predict gladiator fight outcomes.
4
+
5
+ ## Dataset
6
+ The full dataset is available on Kaggle:
7
+ https://www.kaggle.com/datasets/anthonytherrien/gladiator-combat-records-and-profiles-dataset
8
+
9
+ A representative sample may be included in this repository for demonstration purposes.
10
+
11
+ ## Model Performance
12
+ - **F1-score:** 0.910
13
+ - **Accuracy:** ~90%
14
+ - **ROC-AUC:** 0.970
15
+ The Gradient Boosting model performed best of all the classifiers tested.
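As a rough illustration of what the three metrics above measure, here is a minimal pure-Python sketch. The README does not say how the reported scores were computed, so the toy labels, the scores, the 0.5 decision threshold, and the helper names (`confusion_counts`, `accuracy`, `f1`, `roc_auc`) are all assumptions made for this example. The numbers it produces are unrelated to the model's reported results.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives, treating label 1 as 'Win'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def accuracy(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return (tp + tn) / len(y_true)


def f1(y_true, y_pred):
    # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    tp, fp, fn, _ = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, y_score):
    # Probability that a random positive outscores a random negative
    # (ties count as half). Quadratic in sample size, fine for a toy example.
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: true outcomes and hypothetical predicted win probabilities.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.8, 0.4, 0.6, 0.1, 0.7, 0.3]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

print(accuracy(y_true, y_pred), f1(y_true, y_pred), roc_auc(y_true, y_score))
```

F1 and accuracy depend on the chosen threshold, while ROC-AUC is computed from the raw scores, which is one reason the three figures can differ.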
16
+
17
+ ## Model Comparison
18
+
19
+ Several models were tested, including Logistic Regression, Random Forest, and Gradient Boosting.
20
+ In both the regression-style and classification tasks, **Gradient Boosting consistently performed best**,
21
+ showing higher accuracy and a better balance between precision and recall.
22
+ This makes it the most reliable model for predicting gladiator outcomes.
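A comparison like the one described above could be set up along these lines with scikit-learn. This is only a sketch: it runs on synthetic data from `make_classification` rather than the gladiator dataset, and the hyperparameters, the 5-fold cross-validation, and the choice of F1 as the scoring metric are assumptions, not the settings used for this model.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the real features and Win / Not Win target.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# The three model families named in this README, with assumed settings.
candidates = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(n_estimators=50, random_state=0),
}

# Mean F1 across 5 cross-validation folds for each candidate.
scores = {
    name: cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
    for name, clf in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean F1 = {score:.3f}")
```

Scoring every candidate on the same folds keeps the comparison fair. Which model ranks first on synthetic data says nothing about the ranking on the real dataset.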
23
+
24
+
25
+ ## Usage
26
+ ```python
27
+ import pickle
28
+
29
+ with open("gladiator_gradient_boosting_classifier.pkl", "rb") as f:
30
+     model = pickle.load(f)
31
+ ```
32
+
33
+
34
+
35
+ ## Exploratory Data Analysis (EDA)
36
+ Before modeling, the dataset was explored to understand the distribution of gladiator attributes and fighting outcomes.
37
+ Outliers were handled carefully to ensure they did not distort model training.
38
+
39
+ ### Key Findings
40
+ - The dataset contains **detailed gladiator profiles**, including age, height, weight, fighting style, armor type, victory count, and more.
41
+ - Several features show **clear relationships with the final battle outcome** (Win/Loss), notably the numerical features Battle Experience and Public Favor.
42
+ - Categorical features such as **Gladiator Type, Weapon Type, Fighting Style, Crowd Appeal Techniques, and Previous Occupation** show meaningful differences between winners and non-winners.
43
+ - Correlation analysis indicated that **experience-based features** (e.g., previous wins) have stronger predictive power than purely physical attributes.
44
+
45
+ Wins correlates strongly with Public Favor and Battle Experience:
46
+ ![heat-map](https://cdn-uploads.huggingface.co/production/uploads/691478b16066c85edf046dd8/QM8tepRS9JD56F8kNCWDM.png)
47
+
48
+ Categorical features strongly associated with the target variable Wins:
49
+ ![Screenshot 2025-12-11 231821](https://cdn-uploads.huggingface.co/production/uploads/691478b16066c85edf046dd8/kSEAkE7ad8l5AqnEThm8H.png)
50
+
51
+ ### Data Cleaning Steps
52
+ - Missing values were imputed or removed depending on relevance.
53
+ - Boolean and categorical features were encoded into numeric form.
54
+ - Engineered features such as **BMI**, **interaction terms**, and **readiness scores** were added to improve predictive performance.
55
+ - The target variable was converted into a binary class using a median split (`Win` vs. `Not Win`).
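The steps listed above can be sketched in plain Python. The field names (`height_cm`, `weight_kg`, `wins`, `fighting_style`) and the records are hypothetical, and labelling a gladiator as `Win` when the win count is strictly above the median is an assumption, since the README does not say how ties at the median were handled.

```python
from statistics import median

# Invented records with hypothetical field names.
records = [
    {"height_cm": 180.0, "weight_kg": 81.0, "wins": 12, "fighting_style": "Aggressive"},
    {"height_cm": 170.0, "weight_kg": 72.25, "wins": 3, "fighting_style": "Defensive"},
    {"height_cm": 200.0, "weight_kg": 100.0, "wins": 7, "fighting_style": "Balanced"},
    {"height_cm": 160.0, "weight_kg": 64.0, "wins": 9, "fighting_style": "Aggressive"},
]

# Median split: the threshold is the median win count across all records.
wins_median = median(r["wins"] for r in records)

# Fixed, sorted category list so every row gets the same one-hot columns.
styles = sorted({r["fighting_style"] for r in records})

rows = []
for r in records:
    height_m = r["height_cm"] / 100
    row = {
        # Engineered feature: BMI = weight (kg) / height (m) squared.
        "bmi": r["weight_kg"] / height_m ** 2,
        # Binary target: 1 = Win, 0 = Not Win (strictly above the median, assumed).
        "win": int(r["wins"] > wins_median),
    }
    # One-hot encode the categorical feature into 0/1 columns.
    for style in styles:
        row[f"style_{style}"] = int(r["fighting_style"] == style)
    rows.append(row)

for row in rows:
    print(row)
```

In a real pipeline the median and the category list would be computed on the training split only and then reused on held-out data.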
56
+
57
+ ### Feature Insights
58
+ During EDA, battle experience emerged as the strongest predictor, showing a very high correlation (~0.95) with the target outcome.
59
+ However, because this feature was essentially a direct indicator of the final result, it created data leakage.
60
+ To ensure a fair and realistic model, this feature was removed from training, which made the remaining predictors more meaningful and prevented overly optimistic performance estimates.
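One simple way to screen for this kind of leakage is to flag any feature whose correlation with the target is suspiciously high and drop it before training. The sketch below assumes a Pearson correlation and a cutoff of 0.9. Both are illustrative choices, as are the feature names and values, and the helpers `pearson` and `leaky_features` are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def leaky_features(features, target, threshold=0.9):
    """Names of features whose absolute correlation with the target meets the threshold."""
    return [
        name
        for name, values in features.items()
        if abs(pearson(values, target)) >= threshold
    ]


# Invented data: battle_experience tracks the target almost by construction.
features = {
    "battle_experience": [1, 2, 3, 4, 5, 6],
    "age": [30, 22, 35, 24, 28, 31],
}
wins = [2, 4, 6, 8, 10, 12]

flagged = leaky_features(features, wins)
# Keep only the features that were not flagged.
training_features = {k: v for k, v in features.items() if k not in flagged}
print(flagged, list(training_features))
```

A high correlation alone does not prove leakage. It is a prompt to check whether the feature would actually be known at prediction time.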
61
+
62
+ This EDA process provided important insights that shaped feature engineering and model selection.