Titanic Survival Prediction Model

Binary classification model to predict passenger survival on the Titanic.

Best Model: GradientBoosting (highest CV AUC-ROC, test accuracy, and test F1 of the four models compared)

Metric      CV Score   Test Score
Accuracy    0.8104     0.8101
F1 Score    0.7450     0.7463
AUC-ROC     0.8913     0.8422
Precision   -          0.7692
Recall      -          0.7246
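
As a consistency check on the test-set figures above, the sketch below recomputes accuracy, precision, recall, and F1 from a confusion matrix. The counts used (TP=50, FP=15, FN=19, TN=95, i.e. 179 test passengers) are an assumption: this card does not report them, and they were inferred only because they reproduce the four reported scores to four decimals. AUC-ROC depends on the ranking of predicted probabilities and cannot be recovered from counts alone.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for binary confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        # F1 is the harmonic mean of precision and recall.
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, inferred from the reported scores (not given in the card).
metrics = classification_metrics(tp=50, fp=15, fn=19, tn=95)
rounded = {name: round(value, 4) for name, value in metrics.items()}
print(rounded)
```

With these counts the rounded values come out as accuracy 0.8101, precision 0.7692, recall 0.7246, and F1 0.7463, which agrees with the Test Score column.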

All Models Compared

Model                 CV Accuracy   CV F1    CV AUC   Test Accuracy   Test F1   Test AUC
LogisticRegression    0.8020        0.7402   0.8590   0.8045          0.7368    0.8594
RandomForest          0.8231        0.7569   0.8826   0.7877          0.7077    0.8473
GradientBoosting 🏆   0.8104        0.7450   0.8913   0.8101          0.7463    0.8422
XGBoost               0.8146        0.7475   0.8892   0.8045          0.7328    0.8428

Features Used

Raw and engineered features derived from the Titanic dataset:

  • Passenger class (Pclass)
  • Age (imputed by title median)
  • Family features: SibSp, Parch, FamilySize, IsAlone
  • Fare & FarePerPerson
  • Cabin info: HasCabin, Deck
  • Title (extracted from Name: Mr, Mrs, Miss, Master, Rare)
  • Interactions: Age × Class
  • Embarked port
  • Age bins: Child, Teen, Adult, Middle, Senior
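
The feature list above can be sketched roughly as follows. This is an illustration in plain Python, not the author's pipeline: the age-bin edges, the title spelling aliases, the "U" code for an unknown deck, and the formulas FamilySize = SibSp + Parch + 1 and FarePerPerson = Fare / FamilySize are all assumptions, since the card names the features without defining them. The output column names AgeClass and AgeBin are likewise hypothetical.

```python
import re
from statistics import median

COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}
# Assumed normalisation of spelling variants; the card does not list them.
TITLE_ALIASES = {"Mlle": "Miss", "Ms": "Miss", "Mme": "Mrs"}
# Assumed bin edges as (upper bound, label); the card gives only the labels.
AGE_BINS = [(12, "Child"), (19, "Teen"), (39, "Adult"), (59, "Middle")]


def extract_title(name):
    """Pull the honorific out of names like 'Braund, Mr. Owen Harris'."""
    match = re.search(r",\s*([^.]+)\.", name)
    title = match.group(1).strip() if match else ""
    title = TITLE_ALIASES.get(title, title)
    return title if title in COMMON_TITLES else "Rare"


def age_bin(age):
    """Map a numeric age to one of the five bin labels."""
    for upper, label in AGE_BINS:
        if age <= upper:
            return label
    return "Senior"


def engineer(passengers):
    """Return copies of the passenger dicts with derived columns added."""
    rows = [dict(p, Title=extract_title(p["Name"])) for p in passengers]
    # Median age per title, computed from passengers whose age is known.
    known = {}
    for r in rows:
        if r["Age"] is not None:
            known.setdefault(r["Title"], []).append(r["Age"])
    title_median = {t: median(ages) for t, ages in known.items()}
    overall = median([a for ages in known.values() for a in ages])
    for r in rows:
        if r["Age"] is None:
            # Fall back to the overall median for titles with no known ages.
            r["Age"] = title_median.get(r["Title"], overall)
        r["FamilySize"] = r["SibSp"] + r["Parch"] + 1
        r["IsAlone"] = int(r["FamilySize"] == 1)
        r["FarePerPerson"] = r["Fare"] / r["FamilySize"]
        r["HasCabin"] = int(bool(r["Cabin"]))
        r["Deck"] = r["Cabin"][0] if r["Cabin"] else "U"
        r["AgeClass"] = r["Age"] * r["Pclass"]
        r["AgeBin"] = age_bin(r["Age"])
    return rows


# Illustrative rows only; the third has a missing age to exercise imputation.
sample = [
    {"Name": "Braund, Mr. Owen Harris", "Pclass": 3, "Age": 22.0,
     "SibSp": 1, "Parch": 0, "Fare": 7.25, "Cabin": None},
    {"Name": "Cumings, Mrs. John Bradley", "Pclass": 1, "Age": 38.0,
     "SibSp": 1, "Parch": 0, "Fare": 71.2833, "Cabin": "C85"},
    {"Name": "Moran, Mr. James", "Pclass": 3, "Age": None,
     "SibSp": 0, "Parch": 0, "Fare": 8.4583, "Cabin": None},
]
features = engineer(sample)
```

Here the third passenger has no recorded age, so it receives the median age of the other "Mr" rows. The same logic maps directly onto column operations in a dataframe library if one is available.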

Usage

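import joblib
import numpy as np

# Load the fitted estimator and the preprocessing artifacts saved alongside it
model = joblib.load("model.joblib")
scaler = joblib.load("scaler.joblib")
label_encoders = joblib.load("label_encoders.joblib")

# Build model inputs with columns in the order given by feature_columns in config.json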
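
One way the loaded artifacts might be combined for a single prediction is sketched below. Several things here are assumptions rather than facts from this card: that label_encoders is a dict keyed by column name, that the scaler was fitted on all columns in feature_columns order, and that class 1 means "survived". The helper names build_feature_row and predict_survival are hypothetical.

```python
def build_feature_row(record, feature_columns, label_encoders):
    """Order one passenger's features and encode the categorical ones."""
    row = []
    for column in feature_columns:
        value = record[column]
        encoder = label_encoders.get(column)
        if encoder is not None:
            # An encoder's transform takes a sequence and returns a sequence.
            value = encoder.transform([value])[0]
        row.append(float(value))
    return row


def predict_survival(record, model, scaler, label_encoders, feature_columns):
    """Return the predicted probability of survival for one passenger."""
    row = build_feature_row(record, feature_columns, label_encoders)
    scaled = scaler.transform([row])
    # Assumes column 1 of predict_proba corresponds to the "survived" class.
    return float(model.predict_proba(scaled)[0][1])
```

The feature_columns argument would come from config.json, for example by reading the file with the json module and taking its feature_columns entry, and record would be a dict holding one passenger's already-engineered features. It is worth checking model.classes_ and the encoder keys against the saved artifacts before relying on this.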

Dataset

Trained on phihung/titanic (891 passengers, 80/20 train/test split, stratified).
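
To illustrate what a stratified 80/20 split means here, the following dependency-free sketch divides row indices so that each class keeps roughly its share in the test set. The card does not say which tool produced the split; scikit-learn's train_test_split with its stratify argument is a common choice. Rounding the test size up, the largest-remainder allocation, and the class balance of 549 non-survivors to 342 survivors are assumptions made for this example.

```python
import math
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly its share in the test set."""
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    n_test = math.ceil(test_fraction * len(labels))
    # Ideal fractional test count per class, then largest-remainder rounding.
    ideal = {c: len(idx) * n_test / len(labels) for c, idx in by_class.items()}
    quota = {c: math.floor(v) for c, v in ideal.items()}
    leftover = n_test - sum(quota.values())
    by_remainder = sorted(ideal, key=lambda c: ideal[c] - quota[c], reverse=True)
    for c in by_remainder[:leftover]:
        quota[c] += 1

    rng = random.Random(seed)
    train, test = [], []
    for c, idx in by_class.items():
        rng.shuffle(idx)  # shuffle within the class before slicing
        test.extend(idx[:quota[c]])
        train.extend(idx[quota[c]:])
    return sorted(train), sorted(test)


# Assumed class balance of the 891-row file: 549 died (0), 342 survived (1).
labels = [0] * 549 + [1] * 342
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx), sum(labels[i] for i in test_idx))
```

Under these assumptions the split yields 712 training rows and 179 test rows, 69 of them survivors, so the survival rate is close to 38% in both parts.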
