Add fine-tuned Titanic classifier with model card

Files changed (6)

README.md +130 -0
config.json +11 -0
feature_names.json +1 -0
model.safetensors +3 -0
scaler_params.json +47 -0
test_metrics.json +6 -0

README.md ADDED

	@@ -0,0 +1,130 @@

+---
+language: en
+license: mit
+tags:
+  - classification
+  - tabular
+  - titanic
+  - survival-prediction
+  - pytorch
+datasets:
+  - titanic
+metrics:
+  - accuracy
+  - f1
+  - precision
+  - recall
+model-index:
+  - name: Fine_Tuning_Dataset
+    results:
+      - task:
+          type: tabular-classification
+        dataset:
+          name: Titanic
+          type: titanic
+        metrics:
+          - type: accuracy
+            value: 0.6111
+          - type: f1
+            value: 0.0
+          - type: precision
+            value: 0.0
+          - type: recall
+            value: 0.0
+---
+# 🚢 Titanic Survival Classifier
+A lightweight MLP classifier wrapped in the Hugging Face `PreTrainedModel` interface,
+trained to predict passenger survival on the Titanic dataset.
+## Model description
+| Component | Detail |
+|-----------|--------|
+| Architecture | 4-layer MLP with BatchNorm, GELU, Dropout |
+| Hidden dim | 128 |
+| Input features | 13 engineered tabular features |
+| Output | Binary (survived / not survived) |
+| Parameters | ~12,578 |
+## Training details
+| Setting | Value |
+|---------|-------|
+| Optimizer | AdamW |
+| Learning rate | 0.001 |
+| Scheduler | Cosine annealing |
+| Epochs | 30 |
+| Batch size | 32 |
+| Train / Val / Test split | 80 / 10 / 10 % |
+## Feature engineering
+Features used: `Pclass, Sex, Age, SibSp, Parch, Fare, Embarked, HasCabin, FamilySize, IsAlone, AgeBand, FareBand, Title`
+Key transformations applied:
+- **Title extraction** from passenger names (Mr, Mrs, Miss, Master, Rare)
+- **Age imputation** using median per title group
+- **FamilySize** = SibSp + Parch + 1; **IsAlone** flag
+- **HasCabin** binary flag
+- **AgeBand** and **FareBand** discretisation
+- StandardScaler normalisation (params saved in `scaler_params.json`)
+## Test set performance
+| Metric | Score |
+|--------|-------|
+| Accuracy | 0.6111 |
+| Precision | 0.0 |
+| Recall | 0.0 |
+| F1-Score | 0.0 |
+## How to use
+```python
+import json
+
+import numpy as np
+import torch
+from huggingface_hub import hf_hub_download
+from transformers import PretrainedConfig, PreTrainedModel
+
+REPO = "Asimzaman19/Fine_Tuning_Dataset"
+
+# NOTE: `TitanicClassifier` is the custom PreTrainedModel subclass from the
+# training code. This commit does not include its definition, so define or
+# import it (together with its PretrainedConfig subclass) before this point.
+
+# Load model
+model = TitanicClassifier.from_pretrained(REPO)
+model.eval()
+
+# Load scaler params
+params_path = hf_hub_download(REPO, "scaler_params.json")
+with open(params_path) as f:
+    sp = json.load(f)
+mean  = np.array(sp["mean"])
+scale = np.array(sp["scale"])
+
+# Prepare a sample (values must follow the order in feature_names.json)
+raw = np.array([[3, 1, 22, 1, 0, 7.25, 0, 0, 2, 0, 1, 0, 0]], dtype=np.float32)
+scaled = ((raw - mean) / scale).astype(np.float32)
+
+with torch.no_grad():
+    logits = model(torch.tensor(scaled)).logits
+    pred   = logits.argmax(-1).item()
+    prob   = torch.softmax(logits, dim=-1)[0, 1].item()
+print(f"Survived: {bool(pred)} (prob={prob:.2%})")
+```
+## Dataset
+The [Titanic dataset](https://www.kaggle.com/competitions/titanic) contains
+information about 891 passengers, including demographics, ticket class, and
+fare, with the binary survival label as the target.
+## Limitations
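+- Precision, recall, and F1 on the test set are all 0.0: the model identified
+  none of the surviving passengers, so the 0.6111 accuracy comes entirely from
+  correct "not survived" predictions.
+- Trained on a small historical dataset (891 rows); performance may not
+  generalise beyond the Titanic domain.
+- Features are hand-engineered; a more robust pipeline would use automated
+  feature selection.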
+## License
+MIT
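
Note on the README's "Feature engineering" section: the transformations are listed but the preprocessing code is not part of this commit. The sketch below shows one way the title extraction, per-title median age imputation, `FamilySize`, `IsAlone`, and `HasCabin` steps could look in plain Python. The helper names (`extract_title`, `engineer`), the input column names (Kaggle layout), and the three sample rows are assumptions for illustration. The integer encodings of `Title`, `Sex`, and `Embarked`, and the `AgeBand`/`FareBand` bin edges, are not documented in the commit and are left out.

```python
import re
from statistics import median

COMMON_TITLES = {"Mr", "Mrs", "Miss", "Master"}  # everything else becomes "Rare"


def extract_title(name: str) -> str:
    """Pull the honorific out of a 'Surname, Title. Given names' string."""
    match = re.search(r",\s*([^.,]+)\.", name)
    title = match.group(1).strip() if match else "Rare"
    return title if title in COMMON_TITLES else "Rare"


def engineer(rows):
    """Return new row dicts with Title, imputed Age, FamilySize, IsAlone, HasCabin."""
    titled = [{**r, "Title": extract_title(r["Name"])} for r in rows]

    # Median age per title group, using only rows where Age is known.
    ages_by_title = {}
    for r in titled:
        if r["Age"] is not None:
            ages_by_title.setdefault(r["Title"], []).append(r["Age"])
    medians = {t: median(a) for t, a in ages_by_title.items()}
    known = [r["Age"] for r in titled if r["Age"] is not None]
    overall = median(known) if known else None  # fallback for unseen groups

    out = []
    for r in titled:
        age = r["Age"] if r["Age"] is not None else medians.get(r["Title"], overall)
        family = r["SibSp"] + r["Parch"] + 1
        out.append({
            **r,
            "Age": age,
            "FamilySize": family,
            "IsAlone": int(family == 1),
            "HasCabin": int(bool(r["Cabin"])),
        })
    return out


# Illustrative rows only (hypothetical input, Kaggle-style columns).
rows = [
    {"Name": "Braund, Mr. Owen Harris", "Age": 22.0, "SibSp": 1, "Parch": 0, "Cabin": None},
    {"Name": "Cumings, Mrs. John Bradley", "Age": 38.0, "SibSp": 1, "Parch": 0, "Cabin": "C85"},
    {"Name": "Moran, Mr. James", "Age": None, "SibSp": 0, "Parch": 0, "Cabin": None},
]
features = engineer(rows)
```

In this sketch the third row has no age, so it receives the median of the known "Mr" ages in the sample. In a real pipeline the medians would be computed on the training split only and reused for validation and test rows.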

config.json ADDED

	@@ -0,0 +1,11 @@

+{
+  "architectures": [
+    "TitanicClassifier"
+  ],
+  "dropout": 0.3,
+  "dtype": "float32",
+  "hidden_dim": 128,
+  "input_dim": 13,
+  "model_type": "titanic_mlp",
+  "transformers_version": "5.5.1"
+}

feature_names.json ADDED

	@@ -0,0 +1 @@


+["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked", "HasCabin", "FamilySize", "IsAlone", "AgeBand", "FareBand", "Title"]
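
Note on `config.json` and `feature_names.json`: `input_dim` in the config has to equal the number of feature names, since the model consumes one value per feature in this order. A minimal check, with the relevant values copied from the two files above into string literals (the function name `check_feature_count` is hypothetical), might look like this:

```python
import json

# Values copied from config.json and feature_names.json in this commit.
CONFIG_JSON = '{"dropout": 0.3, "hidden_dim": 128, "input_dim": 13, "model_type": "titanic_mlp"}'
FEATURES_JSON = (
    '["Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked", '
    '"HasCabin", "FamilySize", "IsAlone", "AgeBand", "FareBand", "Title"]'
)


def check_feature_count(config: dict, feature_names: list) -> int:
    """Raise ValueError if the feature list and the model's input width disagree."""
    if len(set(feature_names)) != len(feature_names):
        raise ValueError("feature names must be unique")
    if config["input_dim"] != len(feature_names):
        raise ValueError(
            f"input_dim={config['input_dim']} but {len(feature_names)} features listed"
        )
    return len(feature_names)


n_features = check_feature_count(json.loads(CONFIG_JSON), json.loads(FEATURES_JSON))
```

Running the same check against the `feature_names` array inside `scaler_params.json` would also catch a scaler that was fitted on a different column order.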

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:c76448052b452a154f85d63b3d33839ba4cd0b4e132956706adde0cc45550447
+size 53232
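
Note on the parameter count: the README reports ~12,578 parameters and a hidden dim of 128, but does not give per-layer widths. Four linear layers with a uniform hidden width of 128 would hold far more trainable parameters than that, so the widths presumably shrink after the first layer. The arithmetic below tallies trainable parameters for a hypothetical 13 → 128 → 64 → 32 → 2 layout with BatchNorm after the first two linear layers. This layout is an assumption chosen for illustration and is not confirmed anywhere in the commit.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


def batchnorm_params(n_features: int) -> int:
    """Trainable scale and shift of a 1-D batch norm layer."""
    return 2 * n_features


def count_params(layers) -> int:
    """layers: iterable of (n_in, n_out, has_batchnorm) tuples."""
    return sum(
        linear_params(n_in, n_out) + (batchnorm_params(n_out) if bn else 0)
        for n_in, n_out, bn in layers
    )


# Hypothetical tapered layout (assumption, not taken from the commit).
tapered = [(13, 128, True), (128, 64, True), (64, 32, False), (32, 2, False)]
# Uniform width of 128 in every hidden layer, without BatchNorm, for comparison.
uniform = [(13, 128, False), (128, 128, False), (128, 128, False), (128, 2, False)]

tapered_total = count_params(tapered)
uniform_total = count_params(uniform)
```

A count like this only shows that a layout is compatible with the reported figure. The actual layer shapes can be read from the tensor names and shapes stored in `model.safetensors`.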

scaler_params.json ADDED

	@@ -0,0 +1,47 @@

+{
+  "mean": [
+    2.3047752808988764,
+    0.3553370786516854,
+    29.256558988745628,
+    0.526685393258427,
+    0.40308988764044945,
+    32.95399021968413,
+    0.376056338028169,
+    0.23174157303370788,
+    1.9297752808988764,
+    0.601123595505618,
+    2.0168539325842696,
+    1.5112359550561798,
+    0.75
+  ],
+  "scale": [
+    0.8370491944896091,
+    0.4786153353027576,
+    13.464659259524808,
+    1.1202258108590086,
+    0.8238996767285812,
+    49.28614526814358,
+    0.6419523014908564,
+    0.4219448025056973,
+    1.635494961693865,
+    0.48966725276662787,
+    0.8609814043661498,
+    1.1267371808289228,
+    1.0436404520994895
+  ],
+  "feature_names": [
+    "Pclass",
+    "Sex",
+    "Age",
+    "SibSp",
+    "Parch",
+    "Fare",
+    "Embarked",
+    "HasCabin",
+    "FamilySize",
+    "IsAlone",
+    "AgeBand",
+    "FareBand",
+    "Title"
+  ]
+}
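
Note on `scaler_params.json`: the README applies these values as `(raw - mean) / scale`. The sketch below repeats that arithmetic in plain Python on the README's sample row, using the `mean` and `scale` arrays copied from the file above. The function names `standardise` and `unstandardise` are hypothetical.

```python
# Values copied from scaler_params.json, in feature_names order.
MEAN = [
    2.3047752808988764, 0.3553370786516854, 29.256558988745628,
    0.526685393258427, 0.40308988764044945, 32.95399021968413,
    0.376056338028169, 0.23174157303370788, 1.9297752808988764,
    0.601123595505618, 2.0168539325842696, 1.5112359550561798, 0.75,
]
SCALE = [
    0.8370491944896091, 0.4786153353027576, 13.464659259524808,
    1.1202258108590086, 0.8238996767285812, 49.28614526814358,
    0.6419523014908564, 0.4219448025056973, 1.635494961693865,
    0.48966725276662787, 0.8609814043661498, 1.1267371808289228,
    1.0436404520994895,
]


def standardise(raw):
    """Map raw feature values to z-scores using the saved mean and scale."""
    if len(raw) != len(MEAN):
        raise ValueError(f"expected {len(MEAN)} features, got {len(raw)}")
    return [(x - m) / s for x, m, s in zip(raw, MEAN, SCALE)]


def unstandardise(z):
    """Inverse of standardise: recover raw values from z-scores."""
    return [v * s + m for v, m, s in zip(z, MEAN, SCALE)]


# Sample row from the README's usage example.
sample = [3, 1, 22, 1, 0, 7.25, 0, 0, 2, 0, 1, 0, 0]
z = standardise(sample)
restored = unstandardise(z)
```

For this sample, a `Pclass` of 3 sits above the training mean and yields a positive z-score, while a `Fare` of 7.25 sits well below the mean fare and yields a negative one.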

test_metrics.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "accuracy": 0.6111,
+  "precision": 0.0,
+  "recall": 0.0,
+  "f1": 0.0
+}
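
Note on `test_metrics.json`: precision, recall, and F1 of exactly 0.0 alongside a non-zero accuracy is the pattern produced when a classifier has no true positives. The sketch below computes the four metrics from confusion-matrix counts. The counts used (55 true negatives, 35 false negatives) are hypothetical, picked because they reproduce the reported numbers; the commit does not include the actual confusion matrix. Undefined ratios are reported as 0.0 here, which is a convention chosen for this sketch.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy, precision, recall and F1 for the positive (survived) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Report 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": round(accuracy, 4),
        "precision": round(precision, 4),
        "recall": round(recall, 4),
        "f1": round(f1, 4),
    }


# Hypothetical counts: every passenger predicted as "not survived".
metrics = binary_metrics(tp=0, fp=0, fn=35, tn=55)
```

With these counts the accuracy equals the share of non-survivors in the test split, so it carries no information about whether the model learned anything beyond the majority class.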