| ---
|
| language: en
|
| license: mit
|
| tags:
|
| - classification
|
| - tabular
|
| - titanic
|
| - survival-prediction
|
| - pytorch
|
| datasets:
|
| - titanic
|
| metrics:
|
| - accuracy
|
| - f1
|
| - precision
|
| - recall
|
| model-index:
|
| - name: Fine_Tuning_Dataset
|
|   results:
|
|   - task:
|
|       type: tabular-classification
|
|     dataset:
|
|       name: Titanic
|
|       type: titanic
|
|     metrics:
|
|     - type: accuracy
|
|       value: 0.6111
|
|     - type: f1
|
|       value: 0.0
|
|     - type: precision
|
|       value: 0.0
|
|     - type: recall
|
|       value: 0.0
|
| ---
|
|
|
| # 🚢 Titanic Survival Classifier
|
|
|
| A lightweight MLP classifier wrapped in the Hugging Face `PreTrainedModel` interface,
|
| trained to predict passenger survival on the Titanic dataset.
|
|
|
| ## Model description
|
|
|
| | Component | Detail |
|
| |-----------|--------|
|
| | Architecture | 4-layer MLP with BatchNorm, GELU, Dropout |
|
| | Hidden dim | 128 |
|
| | Input features | 13 engineered tabular features |
|
| | Output | Binary (survived / not survived) |
|
| | Parameters | ~12,578 |
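
The parameter count can be sanity-checked by hand. The sketch below is a minimal illustration, not the published architecture: the layer widths (13 → 128 → 64 → 32 → 2) and the placement of BatchNorm after the first two linear layers only are assumptions, chosen because they happen to add up to the figure in the table. The card itself states only the hidden dimension of 128 and the number of layers.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(n_features):
    """Learnable scale and shift of a 1-D batch-norm layer."""
    return 2 * n_features


# Hypothetical layer widths: 13 inputs, three hidden layers, 2 output logits.
widths = [13, 128, 64, 32, 2]
# Assumed: batch norm follows the first two linear layers only.
bn_after = [True, True, False, False]

total = 0
for (n_in, n_out), has_bn in zip(zip(widths, widths[1:]), bn_after):
    total += linear_params(n_in, n_out)
    if has_bn:
        total += batchnorm_params(n_out)

print(total)  # 12578
```

Other layouts could give a similar total, so treat this as a plausibility check rather than a description of the checkpoint.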
|
|
|
| ## Training details
|
|
|
| | Setting | Value |
|
| |---------|-------|
|
| | Optimizer | AdamW |
|
| | Learning rate | 0.001 |
|
| | Scheduler | Cosine annealing |
|
| | Epochs | 30 |
|
| | Batch size | 32 |
|
| | Train / Val / Test split | 80 / 10 / 10 % |
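
To make the scheduler row concrete, here is a minimal sketch of the standard cosine annealing formula applied to the learning rate and epoch count from the table. It assumes the schedule is stepped once per epoch, spans all 30 epochs, and decays to a minimum of 0; none of these three details is stated in this card, and `cosine_lr` is a hypothetical helper name.

```python
import math

BASE_LR = 0.001  # learning rate from the table
EPOCHS = 30      # epochs from the table
ETA_MIN = 0.0    # assumption: the card does not state a minimum learning rate


def cosine_lr(epoch, base_lr=BASE_LR, t_max=EPOCHS, eta_min=ETA_MIN):
    """Learning rate after `epoch` scheduler steps of cosine annealing."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)


# One value per epoch boundary, from the start of training to the end.
schedule = [cosine_lr(epoch) for epoch in range(EPOCHS + 1)]

for epoch in (0, 15, 30):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

Under these assumptions the rate starts at 0.001, passes through half that value at the midpoint, and reaches the minimum at the final epoch.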
|
|
|
| ## Feature engineering
|
|
|
| Features used: `Pclass, Sex, Age, SibSp, Parch, Fare, Embarked, HasCabin, FamilySize, IsAlone, AgeBand, FareBand, Title`
|
|
|
| Key transformations applied:
|
| - **Title extraction** from passenger names (Mr, Mrs, Miss, Master, Rare)
|
| - **Age imputation** using median per title group
|
| - **FamilySize** = SibSp + Parch + 1; **IsAlone** flag
|
| - **HasCabin** binary flag
|
| - **AgeBand** and **FareBand** discretisation
|
| - StandardScaler normalisation (params saved in `scaler_params.json`)
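
Some of the transformations above can be sketched in plain Python. This is an illustrative example, not the training pipeline: the regular expression, the rule that any title outside Mr/Mrs/Miss/Master maps to `Rare`, and the definition of `IsAlone` as `FamilySize == 1` are assumptions, and `extract_title` and `engineer` are hypothetical helper names. Age imputation and the band edges for `AgeBand` and `FareBand` are left out because the card does not give them.

```python
import re

# Assumption: every title outside this set is grouped as "Rare".
COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}


def extract_title(name):
    """Return the honorific between the comma and the first period."""
    match = re.search(r",\s*([^.]+)\.", name)
    title = match.group(1).strip() if match else ""
    return title if title in COMMON_TITLES else "Rare"


def engineer(row):
    """Derive Title, FamilySize, IsAlone and HasCabin from one raw record.

    Assumes a missing cabin is stored as None or an empty string.
    """
    family_size = row["SibSp"] + row["Parch"] + 1
    return {
        "Title": extract_title(row["Name"]),
        "FamilySize": family_size,
        "IsAlone": int(family_size == 1),
        "HasCabin": int(bool(row.get("Cabin"))),
    }


sample = {"Name": "Braund, Mr. Owen Harris", "SibSp": 1, "Parch": 0, "Cabin": None}
features = engineer(sample)
print(features)
```

The categorical results (such as `Title`) would still need to be mapped to integer codes before scaling; that encoding is not documented here.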
|
|
|
| ## Test set performance
|
|
|
| | Metric | Score |
|
| |--------|-------|
|
| | Accuracy | 0.6111 |
|
| | Precision | 0.0 |
|
| | Recall | 0.0 |
|
| | F1-Score | 0.0 |
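
A recall of 0.0 means that no survivor in the test set was correctly identified, and the accuracy is then what a classifier earns from the negative class alone. The sketch below shows how the four scores follow from confusion-matrix counts. The counts are hypothetical, chosen only because they reproduce the table (90 test rows, all predicted "not survived"); the actual confusion matrix is not given in this card.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1, reporting 0/0 as 0.0."""

    def safe_div(num, den):
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + fp + fn + tn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical counts: 55 non-survivors, 35 survivors, no positive predictions.
acc, prec, rec, f1 = binary_metrics(tp=0, fp=0, fn=35, tn=55)
print(round(acc, 4), prec, rec, f1)  # 0.6111 0.0 0.0 0.0
```

Under these assumed counts the accuracy equals the share of the majority class, which is why accuracy alone says little about a model on an imbalanced split.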
|
|
|
| ## How to use
|
|
|
| ```python
|
| import json
|
| import numpy as np
|
| import torch
|
| from huggingface_hub import hf_hub_download
|
| from transformers import PretrainedConfig, PreTrainedModel
|
|
|
| REPO = "Asimzaman19/Fine_Tuning_Dataset"
|
|
|
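| # Load model. TitanicClassifier is the custom PreTrainedModel subclass for this
|
| # checkpoint; define or import it before this line (it is not defined in this snippet).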
|
| model = TitanicClassifier.from_pretrained(REPO)
|
| model.eval()
|
|
|
| # Load scaler params
|
| params_path = hf_hub_download(REPO, "scaler_params.json")
|
| with open(params_path) as f:
|
|     sp = json.load(f)
|
| mean = np.array(sp["mean"])
|
| scale = np.array(sp["scale"])
|
|
|
| # Prepare a sample (values must follow the order of the feature list above)
|
| raw = np.array([[3, 1, 22, 1, 0, 7.25, 0, 0, 2, 0, 1, 0, 0]], dtype=np.float32)
|
| scaled = ((raw - mean) / scale).astype(np.float32)
|
|
|
| with torch.no_grad():
|
|     logits = model(torch.tensor(scaled)).logits
|
|     pred = logits.argmax(-1).item()
|
|     prob = torch.softmax(logits, dim=-1)[0, 1].item()
|
|
|
| print(f"Survived: {bool(pred)} (prob={prob:.2%})")
|
| ```
|
|
|
| ## Dataset
|
|
|
| The [Titanic dataset](https://www.kaggle.com/competitions/titanic) contains
|
| information about 891 passengers, including demographics, ticket class, and
|
| fare, with the binary survival label as the target.
|
|
|
| ## Limitations
|
|
|
| - Trained on a small historical dataset (891 rows); performance may not
|
| generalise beyond the Titanic domain.
|
| - Features are hand-engineered; a more robust pipeline would use automated
|
| feature selection.
|
|
|
| ## License
|
|
|
| MIT
|
|
|