Commit d60dc2d by Asimzaman19 · verified · 1 Parent(s): 1a4d07c

Add fine-tuned Titanic classifier with model card

Files changed (6)
  1. README.md +130 -0
  2. config.json +11 -0
  3. feature_names.json +1 -0
  4. model.safetensors +3 -0
  5. scaler_params.json +47 -0
  6. test_metrics.json +6 -0
README.md ADDED
@@ -0,0 +1,130 @@
+ ---
+ language: en
+ license: mit
+ tags:
+ - classification
+ - tabular
+ - titanic
+ - survival-prediction
+ - pytorch
+ datasets:
+ - titanic
+ metrics:
+ - accuracy
+ - f1
+ - precision
+ - recall
+ model-index:
+ - name: Fine_Tuning_Dataset
+   results:
+   - task:
+       type: tabular-classification
+     dataset:
+       name: Titanic
+       type: titanic
+     metrics:
+     - type: accuracy
+       value: 0.6111
+     - type: f1
+       value: 0.0
+     - type: precision
+       value: 0.0
+     - type: recall
+       value: 0.0
+ ---
+
+ # 🚢 Titanic Survival Classifier
+
+ A lightweight MLP classifier wrapped in the Hugging Face `PreTrainedModel` interface,
+ trained to predict passenger survival on the Titanic dataset.
+
+ ## Model description
+
+ | Component | Detail |
+ |-----------|--------|
+ | Architecture | 4-layer MLP with BatchNorm, GELU, Dropout |
+ | Hidden dim | 128 |
+ | Input features | 13 engineered tabular features |
+ | Output | Binary (survived / not survived) |
+ | Parameters | ~12,578 |
51
+ ## Training details
52
+
53
+ | Setting | Value |
54
+ |---------|-------|
55
+ | Optimizer | AdamW |
56
+ | Learning rate | 0.001 |
57
+ | Scheduler | Cosine annealing |
58
+ | Epochs | 30 |
59
+ | Batch size | 32 |
60
+ | Train / Val / Test split | 80 / 10 / 10 % |
61
+
62
+ ## Feature engineering
63
+
64
+ Features used: `Pclass, Sex, Age, SibSp, Parch, Fare, Embarked, HasCabin, FamilySize, IsAlone, AgeBand, FareBand, Title`
65
+
66
+ Key transformations applied:
67
+ - **Title extraction** from passenger names (Mr, Mrs, Miss, Master, Rare)
68
+ - **Age imputation** using median per title group
69
+ - **FamilySize** = SibSp + Parch + 1; **IsAlone** flag
70
+ - **HasCabin** binary flag
71
+ - **AgeBand** and **FareBand** discretisation
72
+ - StandardScaler normalisation (params saved in `scaler_params.json`)
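The transformations listed above can be sketched in plain Python. This is an illustration, not the author's pipeline: the regex, the rule that every title outside Mr/Mrs/Miss/Master becomes "Rare", and the helper names `extract_title` and `engineer` are all assumptions. AgeBand, FareBand, and the numeric encoding of Title are left out because their bin edges and codes are not given in the commit.

```python
import re
from statistics import median

COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}


def extract_title(name):
    """Pull the honorific out of 'Surname, Title. Given names'."""
    match = re.search(r",\s*([^.]+)\.", name)
    title = match.group(1).strip() if match else ""
    # Assumption: anything outside the four common titles is grouped as "Rare".
    return title if title in COMMON_TITLES else "Rare"


def engineer(passenger, median_age_by_title):
    """Return a copy of `passenger` with the derived columns added."""
    row = dict(passenger)
    row["Title"] = extract_title(row["Name"])
    if row["Age"] is None:  # impute with the median age of the title group
        row["Age"] = median_age_by_title[row["Title"]]
    row["FamilySize"] = row["SibSp"] + row["Parch"] + 1
    row["IsAlone"] = int(row["FamilySize"] == 1)
    row["HasCabin"] = int(bool(row["Cabin"]))
    return row


# Tiny sample to exercise the functions; the second and third rows are made up.
passengers = [
    {"Name": "Braund, Mr. Owen Harris", "Age": 22.0, "SibSp": 1, "Parch": 0, "Cabin": None},
    {"Name": "Doe, Mr. John", "Age": 40.0, "SibSp": 0, "Parch": 0, "Cabin": "C85"},
    {"Name": "Roe, Mr. Richard", "Age": None, "SibSp": 0, "Parch": 0, "Cabin": None},
]

# Median age per title, computed only from rows where Age is known.
ages = {}
for p in passengers:
    if p["Age"] is not None:
        ages.setdefault(extract_title(p["Name"]), []).append(p["Age"])
median_age_by_title = {title: median(values) for title, values in ages.items()}

rows = [engineer(p, median_age_by_title) for p in passengers]
print(rows[2]["Age"], rows[0]["FamilySize"], rows[0]["IsAlone"], rows[1]["HasCabin"])
```

In a real pipeline the medians would be computed on the training split only and reused for validation and test rows, so that no information leaks across splits.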
+
+ ## Test set performance
+
+ | Metric | Score |
+ |--------|-------|
+ | Accuracy | 0.6111 |
+ | Precision | 0.0 |
+ | Recall | 0.0 |
+ | F1-Score | 0.0 |
83
+ ## How to use
84
+
85
+ ```python
86
+ import json, torch, numpy as np
87
+ from huggingface_hub import hf_hub_download
88
+ from transformers import PretrainedConfig, PreTrainedModel
89
+
90
+ REPO = "Asimzaman19/Fine_Tuning_Dataset"
91
+
92
+ # Load model
93
+ model = TitanicClassifier.from_pretrained(REPO)
94
+ model.eval()
95
+
96
+ # Load scaler params
97
+ params_path = hf_hub_download(REPO, "scaler_params.json")
98
+ with open(params_path) as f:
99
+ sp = json.load(f)
100
+ mean = np.array(sp["mean"])
101
+ scale = np.array(sp["scale"])
102
+
103
+ # Prepare a sample (must match FEATURES order)
104
+ raw = np.array([[3, 1, 22, 1, 0, 7.25, 0, 0, 2, 0, 1, 0, 0]], dtype=np.float32)
105
+ scaled = ((raw - mean) / scale).astype(np.float32)
106
+
107
+ with torch.no_grad():
108
+ logits = model(torch.tensor(scaled)).logits
109
+ pred = logits.argmax(-1).item()
110
+ prob = torch.softmax(logits, dim=-1)[0, 1].item()
111
+
112
+ print(f"Survived: {bool(pred)} (prob={prob:.2%})")
113
+ ```
+
+ ## Dataset
+
+ The [Titanic dataset](https://www.kaggle.com/competitions/titanic) contains
+ information about 891 passengers including demographics, ticket class, and
+ fare, with the binary survival label as target.
+
+ ## Limitations
+
+ - Trained on a small historical dataset (891 rows); performance may not
+   generalise beyond the Titanic domain.
+ - Features are hand-engineered; a more robust pipeline would use automated
+   feature selection.
+
+ ## License
+
+ MIT
config.json ADDED
@@ -0,0 +1,11 @@
+ {
+   "architectures": [
+     "TitanicClassifier"
+   ],
+   "dropout": 0.3,
+   "dtype": "float32",
+   "hidden_dim": 128,
+   "input_dim": 13,
+   "model_type": "titanic_mlp",
+   "transformers_version": "5.5.1"
+ }
feature_names.json ADDED
@@ -0,0 +1 @@
+ ["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked", "HasCabin", "FamilySize", "IsAlone", "AgeBand", "FareBand", "Title"]
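Because the model consumes a bare vector, the order in `feature_names.json` is effectively part of the input contract. The sketch below shows one way to turn a named record into a correctly ordered row and to check it against `input_dim` from `config.json`. The helper `to_row` is hypothetical, and the sample values are the ones used in the README usage snippet.

```python
# Copied from feature_names.json in this commit.
FEATURE_NAMES = [
    "Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked",
    "HasCabin", "FamilySize", "IsAlone", "AgeBand", "FareBand", "Title",
]
INPUT_DIM = 13  # "input_dim" in config.json


def to_row(passenger):
    """Order a {feature: value} mapping the way the model expects it."""
    missing = [name for name in FEATURE_NAMES if name not in passenger]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(passenger[name]) for name in FEATURE_NAMES]


# Keys are deliberately out of order; to_row restores the expected order.
passenger = {
    "Title": 0, "Fare": 7.25, "Age": 22, "Sex": 1, "Pclass": 3,
    "SibSp": 1, "Parch": 0, "Embarked": 0, "HasCabin": 0,
    "FamilySize": 2, "IsAlone": 0, "AgeBand": 1, "FareBand": 0,
}
row = to_row(passenger)
assert len(row) == INPUT_DIM
print(row)
```

Building rows by name rather than by position makes a silent column swap much harder to introduce.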
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c76448052b452a154f85d63b3d33839ba4cd0b4e132956706adde0cc45550447
+ size 53232
scaler_params.json ADDED
@@ -0,0 +1,47 @@
+ {
+   "mean": [
+     2.3047752808988764,
+     0.3553370786516854,
+     29.256558988745628,
+     0.526685393258427,
+     0.40308988764044945,
+     32.95399021968413,
+     0.376056338028169,
+     0.23174157303370788,
+     1.9297752808988764,
+     0.601123595505618,
+     2.0168539325842696,
+     1.5112359550561798,
+     0.75
+   ],
+   "scale": [
+     0.8370491944896091,
+     0.4786153353027576,
+     13.464659259524808,
+     1.1202258108590086,
+     0.8238996767285812,
+     49.28614526814358,
+     0.6419523014908564,
+     0.4219448025056973,
+     1.635494961693865,
+     0.48966725276662787,
+     0.8609814043661498,
+     1.1267371808289228,
+     1.0436404520994895
+   ],
+   "feature_names": [
+     "Pclass",
+     "Sex",
+     "Age",
+     "SibSp",
+     "Parch",
+     "Fare",
+     "Embarked",
+     "HasCabin",
+     "FamilySize",
+     "IsAlone",
+     "AgeBand",
+     "FareBand",
+     "Title"
+   ]
+ }
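The `mean` and `scale` arrays are enough to reproduce the standardisation step without scikit-learn: each feature is mapped to `(x - mean) / scale`. A dependency-free sketch follows; the helper names `standardize` and `unstandardize` are hypothetical, and the sample row is the one from the README usage snippet.

```python
# Values copied from scaler_params.json in this commit.
MEAN = [
    2.3047752808988764, 0.3553370786516854, 29.256558988745628,
    0.526685393258427, 0.40308988764044945, 32.95399021968413,
    0.376056338028169, 0.23174157303370788, 1.9297752808988764,
    0.601123595505618, 2.0168539325842696, 1.5112359550561798, 0.75,
]
SCALE = [
    0.8370491944896091, 0.4786153353027576, 13.464659259524808,
    1.1202258108590086, 0.8238996767285812, 49.28614526814358,
    0.6419523014908564, 0.4219448025056973, 1.635494961693865,
    0.48966725276662787, 0.8609814043661498, 1.1267371808289228,
    1.0436404520994895,
]


def standardize(row):
    """Apply z = (x - mean) / scale feature by feature."""
    if len(row) != len(MEAN):
        raise ValueError(f"expected {len(MEAN)} features, got {len(row)}")
    return [(x - m) / s for x, m, s in zip(row, MEAN, SCALE)]


def unstandardize(scaled_row):
    """Inverse transform: x = z * scale + mean."""
    return [z * s + m for z, m, s in zip(scaled_row, MEAN, SCALE)]


raw = [3, 1, 22, 1, 0, 7.25, 0, 0, 2, 0, 1, 0, 0]  # sample row from the README
scaled = standardize(raw)
restored = unstandardize(scaled)
print([round(value, 3) for value in scaled])
```

The round trip through `unstandardize` is a cheap way to confirm that the two arrays were loaded in matching order.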
test_metrics.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "accuracy": 0.6111,
+   "precision": 0.0,
+   "recall": 0.0,
+   "f1": 0.0
+ }