---

language: en
license: mit
tags:
  - classification
  - tabular
  - titanic
  - survival-prediction
  - pytorch
datasets:
  - titanic
metrics:
  - accuracy
  - f1
  - precision
  - recall
model-index:
  - name: Fine_Tuning_Dataset
    results:
      - task:
          type: tabular-classification
        dataset:
          name: Titanic
          type: titanic
        metrics:
          - type: accuracy
            value: 0.6111
          - type: f1
            value: 0.0
          - type: precision
            value: 0.0
          - type: recall
            value: 0.0
---


# 🚢 Titanic Survival Classifier

A lightweight MLP classifier wrapped in the Hugging Face `PreTrainedModel` interface,
trained to predict passenger survival on the Titanic dataset.

## Model description

| Component | Detail |
|-----------|--------|
| Architecture | 4-layer MLP with BatchNorm, GELU, Dropout |
| Hidden dim | 128 |
| Input features | 13 engineered tabular features |
| Output | Binary (survived / not survived) |
| Parameters | ~12,578 |
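
The parameter count can be sanity-checked with plain arithmetic. The card does not list the per-layer widths, so the sketch below uses a hypothetical layout (13 → 128 → 64 → 32 → 2, with BatchNorm after the first two hidden layers) chosen only because it reproduces the reported total; the real architecture may differ. For comparison, a single 128 × 128 linear layer would already hold 16,512 parameters, so the 128 in the table probably describes the widest hidden layer rather than all of them.

```python
def linear_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def batchnorm_params(n_features):
    """Learnable scale and shift; running statistics are buffers, not parameters."""
    return 2 * n_features

# Hypothetical layout: widths and BatchNorm placement are assumptions,
# not taken from the model card.
widths = [13, 128, 64, 32, 2]
batchnorm_after = [128, 64]

total = sum(linear_params(a, b) for a, b in zip(widths, widths[1:]))
total += sum(batchnorm_params(n) for n in batchnorm_after)
print(total)  # 12578
```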

## Training details

| Setting | Value |
|---------|-------|
| Optimizer | AdamW |
| Learning rate | 0.001 |
| Scheduler | Cosine annealing |
| Epochs | 30 |
| Batch size | 32 |
| Train / Val / Test split | 80 / 10 / 10 % |
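
To illustrate what cosine annealing does to the learning rate over the 30 epochs, here is a minimal dependency-free sketch of the standard closed-form schedule. It assumes the rate is updated once per epoch, the period equals the number of epochs, and the minimum rate is 0; none of these are stated in the card.

```python
import math

def cosine_lr(epoch, base_lr=0.001, total_epochs=30, min_lr=0.0):
    """Cosine-annealed learning rate at a given epoch (assumed per-epoch stepping)."""
    cosine = math.cos(math.pi * epoch / total_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + cosine)

# Learning rate used in each of the 30 epochs under these assumptions.
schedule = [cosine_lr(epoch) for epoch in range(30)]
for epoch in (0, 15, 29):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

The rate starts at 0.001, passes through half that value at the midpoint, and approaches the minimum by the final epoch.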

## Feature engineering

Features used: `Pclass, Sex, Age, SibSp, Parch, Fare, Embarked, HasCabin, FamilySize, IsAlone, AgeBand, FareBand, Title`

Key transformations applied:
- **Title extraction** from passenger names (Mr, Mrs, Miss, Master, Rare)
- **Age imputation** using median per title group
- **FamilySize** = SibSp + Parch + 1; **IsAlone** flag
- **HasCabin** binary flag
- **AgeBand** and **FareBand** discretisation
- StandardScaler normalisation (params saved in `scaler_params.json`)
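
Several of the transformations above can be sketched in plain Python. This is an illustration under stated assumptions, not the pipeline that produced the model: the regular expression, the rule that any title outside Mr/Mrs/Miss/Master becomes `Rare`, and all function names are hypothetical. `AgeBand` and `FareBand` are left out because the card does not give the bin edges.

```python
import re
from statistics import median

COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}

def extract_title(name):
    """Return the honorific between the comma and the first period, or 'Rare'."""
    match = re.search(r",\s*([^.]+)\.", name)
    title = match.group(1).strip() if match else ""
    return title if title in COMMON_TITLES else "Rare"

def family_features(sibsp, parch):
    """FamilySize = SibSp + Parch + 1; IsAlone is 1 when FamilySize is 1."""
    size = sibsp + parch + 1
    return size, int(size == 1)

def has_cabin(cabin):
    """1 if a cabin value is recorded, 0 for None or an empty string."""
    return int(bool(cabin))

def impute_age_by_title(rows):
    """Fill missing ages (None) with the median age of the same title group."""
    known = {}
    for row in rows:
        if row["Age"] is not None:
            known.setdefault(row["Title"], []).append(row["Age"])
    medians = {title: median(ages) for title, ages in known.items()}
    # Assumption: every title with a missing age also has at least one known age.
    return [row["Age"] if row["Age"] is not None else medians[row["Title"]]
            for row in rows]

# Hypothetical passengers, for illustration only (not rows from the dataset).
passengers = [
    {"Name": "Doe, Mr. John", "Age": 20.0},
    {"Name": "Roe, Mr. Richard", "Age": 30.0},
    {"Name": "Poe, Mr. Peter", "Age": None},
]
for p in passengers:
    p["Title"] = extract_title(p["Name"])
ages = impute_age_by_title(passengers)
print(ages)  # [20.0, 30.0, 25.0]
```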

## Test set performance

| Metric | Score |
|--------|-------|
| Accuracy | 0.6111 |
| Precision | 0.0 |
| Recall | 0.0 |
| F1-Score | 0.0 |
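
A recall of 0.0 means that no test passenger was correctly predicted as a survivor. One pattern that would produce exactly this combination of scores is a model that predicts "not survived" for every passenger. The sketch below shows the arithmetic with hypothetical class counts (55 non-survivors, 35 survivors), chosen only so that the accuracy rounds to the reported 0.6111; the card does not publish the confusion matrix, and the `binary_metrics` helper is illustrative.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for labels in {0, 1}; positive class is 1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    # Undefined ratios (zero denominator) are reported as 0.0 by convention.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(pairs)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Hypothetical test split: these counts are assumptions, not data from the card.
y_true = [0] * 55 + [1] * 35
y_pred = [0] * 90  # a model that always predicts "not survived"

metrics = binary_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
# {'accuracy': 0.6111, 'precision': 0.0, 'recall': 0.0, 'f1': 0.0}
```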

## How to use

```python
import json

import numpy as np
import torch
from huggingface_hub import hf_hub_download
from transformers import PretrainedConfig, PreTrainedModel

REPO = "Asimzaman19/Fine_Tuning_Dataset"

# NOTE: TitanicClassifier (a PreTrainedModel subclass) and its config class are
# not defined in this snippet; define or import them before running the lines below.

# Load model
model = TitanicClassifier.from_pretrained(REPO)
model.eval()

# Load scaler params
params_path = hf_hub_download(REPO, "scaler_params.json")
with open(params_path) as f:
    sp = json.load(f)
mean = np.array(sp["mean"])
scale = np.array(sp["scale"])

# Prepare a sample (must match FEATURES order)
raw = np.array([[3, 1, 22, 1, 0, 7.25, 0, 0, 2, 0, 1, 0, 0]], dtype=np.float32)
scaled = ((raw - mean) / scale).astype(np.float32)

# Run inference without tracking gradients
with torch.no_grad():
    logits = model(torch.tensor(scaled)).logits
    pred = logits.argmax(-1).item()
    prob = torch.softmax(logits, dim=-1)[0, 1].item()

print(f"Survived: {bool(pred)} (prob={prob:.2%})")
```
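
Because the input row must follow the feature order, a small helper that builds the row from named values can prevent silent column mix-ups. This is a hedged sketch: it assumes the order in which the features are listed under "Feature engineering" is the order the model expects, and `to_row` is a hypothetical helper, not part of the repository. The sample values are the ones from the snippet above.

```python
FEATURES = [
    "Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked",
    "HasCabin", "FamilySize", "IsAlone", "AgeBand", "FareBand", "Title",
]

def to_row(values):
    """Order a name -> value mapping as FEATURES, rejecting missing or unknown names."""
    missing = [name for name in FEATURES if name not in values]
    unknown = [name for name in values if name not in FEATURES]
    if missing or unknown:
        raise ValueError(f"missing={missing}, unknown={unknown}")
    return [float(values[name]) for name in FEATURES]

# Same numbers as the `raw` array in the usage snippet, now labelled by name.
sample = {
    "Pclass": 3, "Sex": 1, "Age": 22, "SibSp": 1, "Parch": 0, "Fare": 7.25,
    "Embarked": 0, "HasCabin": 0, "FamilySize": 2, "IsAlone": 0,
    "AgeBand": 1, "FareBand": 0, "Title": 0,
}
row = to_row(sample)
print(row)
```

The resulting list can be wrapped as `np.array([row], dtype=np.float32)` and scaled exactly as in the snippet above.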

## Dataset

The [Titanic dataset](https://www.kaggle.com/competitions/titanic) contains
information about 891 passengers, including demographics, ticket class, and
fare, with the binary survival label as the target.

## Limitations

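- Trained on a small historical dataset (891 rows); performance may not
  generalise beyond the Titanic domain.
- On the reported test split, precision, recall, and F1 are all 0.0, which
  means no test passenger was correctly predicted as a survivor; the 0.6111
  accuracy should be read with that in mind.
- Features are hand-engineered; a more robust pipeline would use automated
  feature selection.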

## License

MIT