	---
	language: en
	license: mit
	tags:
	- classification
	- tabular
	- titanic
	- survival-prediction
	- pytorch
	datasets:
	- titanic
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	model-index:
	- name: Fine_Tuning_Dataset
	  results:
	  - task:
	      type: tabular-classification
	    dataset:
	      name: Titanic
	      type: titanic
	    metrics:
	    - type: accuracy
	      value: 0.6111
	    - type: f1
	      value: 0.0
	    - type: precision
	      value: 0.0
	    - type: recall
	      value: 0.0
	---

	# 🚢 Titanic Survival Classifier

	A lightweight MLP classifier wrapped in the Hugging Face `PreTrainedModel` interface,
	trained to predict passenger survival on the Titanic dataset.

	## Model description

	| Component | Detail |
	|-----------|--------|
	| Architecture | 4-layer MLP with BatchNorm, GELU, Dropout |
	| Hidden dim | 128 |
	| Input features | 13 engineered tabular features |
	| Output | Binary (survived / not survived) |
	| Parameters | ~12,578 |
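
The parameter count in the table can be checked by hand. The sketch below counts trainable parameters for one hypothetical layout: layer widths 13, 128, 64, 32, 2 with BatchNorm after the first two linear layers. This layout reproduces the stated total, but the tapering widths and the BatchNorm placement are assumptions inferred from that total, not taken from the model's code, and `count_mlp_params` is an illustrative helper.

```python
def count_mlp_params(dims, batchnorm_after):
    """Count trainable parameters of an MLP.

    dims: layer widths, e.g. [13, 128, 64, 32, 2] for four linear layers.
    batchnorm_after: one bool per linear layer; True adds a BatchNorm
    (scale and shift, 2 parameters per feature) after that layer.
    GELU and Dropout contribute no parameters.
    """
    total = 0
    for (n_in, n_out), has_bn in zip(zip(dims, dims[1:]), batchnorm_after):
        total += n_in * n_out + n_out  # weights + biases
        if has_bn:
            total += 2 * n_out  # BatchNorm gamma and beta
    return total


# Assumed layout (not confirmed by the model card)
ASSUMED_DIMS = [13, 128, 64, 32, 2]
ASSUMED_BN = [True, True, False, False]

print(count_mlp_params(ASSUMED_DIMS, ASSUMED_BN))  # 12578
```

For comparison, keeping all three hidden layers at width 128 with BatchNorm after each gives 35,842 parameters, so the published total suggests the hidden layers narrow after the first one.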

	## Training details

	| Setting | Value |
	|---------|-------|
	| Optimizer | AdamW |
	| Learning rate | 0.001 |
	| Scheduler | Cosine annealing |
	| Epochs | 30 |
	| Batch size | 32 |
	| Train / Val / Test split | 80 / 10 / 10 % |
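
To show what the scheduler setting implies for the learning rate, the sketch below evaluates the standard cosine annealing formula over the 30 epochs listed above. It assumes a minimum learning rate of 0 and a single cosine cycle spanning the whole run (`ETA_MIN` and `T_MAX`); the card states neither value.

```python
import math

BASE_LR = 1e-3   # from the training table
EPOCHS = 30      # from the training table
ETA_MIN = 0.0    # assumption: the minimum LR is not stated
T_MAX = EPOCHS   # assumption: one cosine cycle over the whole run


def cosine_lr(epoch, base_lr=BASE_LR, eta_min=ETA_MIN, t_max=T_MAX):
    """Learning rate at a given epoch under cosine annealing."""
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)


for epoch in (0, 15, 30):
    print(epoch, f"{cosine_lr(epoch):.6f}")
```

Under these assumptions the rate starts at 0.001, passes 0.0005 at the midpoint, and reaches 0 at the final epoch.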

	## Feature engineering

	Features used: `Pclass, Sex, Age, SibSp, Parch, Fare, Embarked, HasCabin, FamilySize, IsAlone, AgeBand, FareBand, Title`

	Key transformations applied:
	- Title extraction from passenger names (Mr, Mrs, Miss, Master, Rare)
	- Age imputation using median per title group
	- FamilySize = SibSp + Parch + 1; IsAlone flag
	- HasCabin binary flag
	- AgeBand and FareBand discretisation
	- StandardScaler normalisation (params saved in `scaler_params.json`)
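
A few of the transformations above can be sketched in plain Python. This is an illustration under stated assumptions, not the training pipeline itself: the regular expression, the rule that any title outside Mr/Mrs/Miss/Master becomes `Rare`, the definition of `IsAlone` as `FamilySize == 1`, and the helper names are all hypothetical.

```python
import re

KNOWN_TITLES = {"Mr", "Mrs", "Miss", "Master"}


def extract_title(name):
    """Pull the title out of a name such as 'Braund, Mr. Owen Harris'."""
    match = re.search(r",\s*([^.,]+)\.", name)
    title = match.group(1).strip() if match else ""
    # Assumption: every other title is grouped under 'Rare'
    return title if title in KNOWN_TITLES else "Rare"


def family_features(sibsp, parch):
    """Return (FamilySize, IsAlone) for one passenger."""
    family_size = sibsp + parch + 1
    is_alone = int(family_size == 1)  # assumption: alone means no relatives aboard
    return family_size, is_alone


def has_cabin(cabin):
    """1 if a cabin value is recorded, else 0 (assumes missing is None or '')."""
    return int(bool(cabin))


print(extract_title("Braund, Mr. Owen Harris"), family_features(1, 0), has_cabin(None))
```

The remaining steps (age imputation, banding, scaling) depend on statistics computed from the training split, so they are not reproduced here.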

	## Test set performance

	| Metric | Score |
	|--------|-------|
	| Accuracy | 0.6111 |
	| Precision | 0.0 |
	| Recall | 0.0 |
	| F1-Score | 0.0 |

	Precision, recall, and F1 of 0.0 mean that no survivor in the test set was correctly identified.
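
One way to reproduce this pattern of scores is a classifier that predicts "not survived" for every passenger. The sketch below computes the four metrics for that case. The label counts (55 non-survivors and 35 survivors out of 90) are a hypothetical test set chosen because 55/90 rounds to the reported accuracy; the card does not give the actual test set composition, and `binary_metrics` is an illustrative helper.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 for the positive class (label 1)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))

    accuracy = correct / len(y_true)
    # Undefined ratios (no positive predictions or labels) are reported as 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical test set: 55 non-survivors, 35 survivors
y_true = [0] * 55 + [1] * 35
y_pred = [0] * len(y_true)  # always predict "not survived"

metrics = binary_metrics(y_true, y_pred)
print({name: round(value, 4) for name, value in metrics.items()})
```

If the real test set has this class balance, the reported accuracy equals the majority-class baseline, which is worth checking before relying on the model.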

	## How to use

	```python
	import json

	import numpy as np
	import torch
	from huggingface_hub import hf_hub_download
	from transformers import PretrainedConfig, PreTrainedModel  # base classes for TitanicClassifier

	REPO = "Asimzaman19/Fine_Tuning_Dataset"

	# NOTE: TitanicClassifier (a PreTrainedModel subclass) and its config class
	# must be defined or imported before this point; transformers does not provide them.

	# Load model
	model = TitanicClassifier.from_pretrained(REPO)
	model.eval()  # inference mode: Dropout off, BatchNorm uses running statistics

	# Load the StandardScaler parameters saved at training time
	params_path = hf_hub_download(REPO, "scaler_params.json")
	with open(params_path) as f:
	    sp = json.load(f)
	mean = np.array(sp["mean"])
	scale = np.array(sp["scale"])

	# Prepare a sample (must match the FEATURES order listed above)
	raw = np.array([[3, 1, 22, 1, 0, 7.25, 0, 0, 2, 0, 1, 0, 0]], dtype=np.float32)
	scaled = ((raw - mean) / scale).astype(np.float32)

	with torch.no_grad():
	    logits = model(torch.tensor(scaled)).logits
	    pred = logits.argmax(-1).item()
	    prob = torch.softmax(logits, dim=-1)[0, 1].item()

	print(f"Survived: {bool(pred)} (prob={prob:.2%})")
	```
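
Because the input row is positional, a small helper can guard against passing values in the wrong order. The sketch below assumes the model's feature order matches the list under "Feature engineering"; `FEATURES` and `to_row` are illustrative names. The sample values are copied from the row used above, and the card does not say what each integer code means (for example, which value of `Sex` denotes which sex).

```python
# Assumed order: the "Features used" list from the Feature engineering section
FEATURES = [
    "Pclass", "Sex", "Age", "SibSp", "Parch", "Fare", "Embarked",
    "HasCabin", "FamilySize", "IsAlone", "AgeBand", "FareBand", "Title",
]


def to_row(passenger):
    """Order a dict of already-encoded feature values into a model input row."""
    missing = [name for name in FEATURES if name not in passenger]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(passenger[name]) for name in FEATURES]


sample = {
    "Pclass": 3, "Sex": 1, "Age": 22, "SibSp": 1, "Parch": 0, "Fare": 7.25,
    "Embarked": 0, "HasCabin": 0, "FamilySize": 2, "IsAlone": 0,
    "AgeBand": 1, "FareBand": 0, "Title": 0,
}
row = to_row(sample)
print(row)
```

The resulting list can be wrapped as `np.array([row], dtype=np.float32)` and scaled exactly as in the snippet above.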

	## Dataset

	The [Titanic dataset](https://www.kaggle.com/competitions/titanic) contains
	information about 891 passengers, including demographics, ticket class, and
	fare, with the binary survival label as the target.

	## Limitations

	- Trained on a small historical dataset (891 rows); performance may not
	generalise beyond the Titanic domain.
	- Features are hand-engineered; a more robust pipeline would use automated
	feature selection.

	## License

	MIT