Credit Default Predictor
A tabular classification model that predicts the probability of a credit card holder defaulting on their next month's payment.
Built as part of an end-to-end ML deployment portfolio project.
Model Details
| Property | Value |
|---|---|
| Model type | XGBoost (XGBClassifier) |
| Task | Binary classification: default prediction |
| Dataset | UCI Credit Card Dataset via imodels/credit-card |
| Training samples | 24,000 |
| Test samples | 6,000 |
| ROC-AUC | 0.783 |
| Accuracy | 82% |
Performance
| Metric | Score |
|---|---|
| ROC-AUC | 0.783 |
| Accuracy | 0.82 |
| Precision (non-default) | 0.84 |
| Recall (non-default) | 0.95 |
| Precision (default) | 0.66 |
| Recall (default) | 0.36 |
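
The per-class scores above can be combined into a couple of derived figures. The sketch below computes the F1 score for the default class from the reported precision and recall, and checks whether the reported accuracy is consistent with the per-class recalls. It assumes the test split has roughly the ~22% default rate mentioned under Limitations, and it uses the rounded values from the table, so the results are approximate.

```python
# Reported (rounded) test-set scores from the table above.
precision_default = 0.66
recall_default = 0.36
recall_non_default = 0.95

# Assumption: the test split has about the same ~22% default rate as the dataset.
default_rate = 0.22

# F1 is the harmonic mean of precision and recall.
f1_default = (
    2 * precision_default * recall_default / (precision_default + recall_default)
)

# Accuracy is the prevalence-weighted average of the per-class recalls.
implied_accuracy = (
    recall_non_default * (1 - default_rate) + recall_default * default_rate
)

print(f"F1 (default): {f1_default:.2f}")
print(f"Implied accuracy: {implied_accuracy:.2f}")
```

Under that assumption the implied accuracy comes out at about 0.82, in line with the reported figure, while the F1 for the default class is about 0.47.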
Features Used
| Feature | Description |
|---|---|
| limit_bal | Credit limit amount |
| age | Age of the applicant |
| pay_0 to pay_6 | Repayment status for past 6 months |
| bill_amt1 to bill_amt6 | Bill statement amounts |
| pay_amt1 to pay_amt6 | Payment amounts |
| sex | Gender (one-hot encoded) |
| education | Education level (one-hot encoded) |
| marriage | Marital status (one-hot encoded) |
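
The usage example on this card sets columns named `sex:1`, `education:1`, and `marriage:1`, which suggests that each categorical value gets its own `name:value` indicator column. A minimal sketch of that encoding follows. The helper `one_hot` is hypothetical, and the category codes shown are an assumption; the authoritative column list is whatever `feature_names.json` contains.

```python
def one_hot(name, value, categories):
    """Return {"name:c": 0 or 1} for every category code c (hypothetical helper)."""
    if value not in categories:
        raise ValueError(f"{name}={value!r} is not one of {categories}")
    return {f"{name}:{c}": int(c == value) for c in categories}


# Assumed category codes, for illustration only.
encoded = one_hot("sex", 1, categories=[1, 2])
print(encoded)  # {'sex:1': 1, 'sex:2': 0}
```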
How to Use
Load and run inference
import joblib
import json
import pandas as pd
from huggingface_hub import hf_hub_download
# Load model and features
model_path = hf_hub_download(repo_id="shrey1905/credit-default-model", filename="model.joblib")
features_path = hf_hub_download(repo_id="shrey1905/credit-default-model", filename="feature_names.json")
model = joblib.load(model_path)
with open(features_path) as f:
    feature_names = json.load(f)
# Build input (all unused features set to 0)
input_dict = {f: 0 for f in feature_names}
input_dict["limit_bal"] = 50000
input_dict["age"] = 35
input_dict["pay_0"] = 0
input_dict["bill_amt1"] = 5000
input_dict["pay_amt1"] = 1000
input_dict["sex:1"] = 1
input_dict["education:1"] = 1
input_dict["marriage:1"] = 1
df = pd.DataFrame([input_dict])[feature_names]
prob = model.predict_proba(df)[0][1]
print(f"Default probability: {prob:.1%}")
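
One pitfall of the pattern above is that a misspelled key is silently dropped when the frame is reindexed with `[feature_names]`, so the model would see 0 for the intended feature. A small guard, sketched below with a hypothetical `build_input` helper and a shortened stand-in for the real feature list, rejects unknown names before inference.

```python
def build_input(feature_names, overrides):
    """Return a feature dict with every unspecified feature set to 0."""
    unknown = sorted(set(overrides) - set(feature_names))
    if unknown:
        raise KeyError(f"Unknown feature name(s): {unknown}")
    row = {name: 0 for name in feature_names}
    row.update(overrides)
    return row


# Stand-in list for illustration; the real one comes from feature_names.json.
demo_features = ["limit_bal", "age", "pay_0", "sex:1", "sex:2"]
row = build_input(demo_features, {"limit_bal": 50000, "age": 35, "sex:1": 1})
print(row)
```

The resulting dict can be passed to `pd.DataFrame([row])[feature_names]` in the same way as `input_dict` in the example above.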
Live Demo
Try the model interactively here: https://huggingface.co/spaces/shrey1905/credit-default-app
Limitations
- Trained on Taiwanese credit card data from 2005, so it may not generalise to other geographies or time periods
- Class imbalance (~22% defaulters) contributes to low recall on defaulters (36%): the model misses roughly two in three actual defaults
- Not intended for real-world credit decisions without further validation
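
Because recall on defaulters is low, a caller who cares more about catching defaults than about false alarms might apply a decision threshold below the conventional 0.5 to the output of `predict_proba`. The sketch below shows the mechanics on made-up probabilities and labels. The numbers are purely illustrative and are not outputs of this model, and `precision_recall_at` is a hypothetical helper.

```python
def precision_recall_at(threshold, probs, labels):
    """Precision and recall on the positive (default) class at a threshold."""
    predicted = [p >= threshold for p in probs]
    true_pos = sum(1 for pred, y in zip(predicted, labels) if pred and y == 1)
    predicted_pos = sum(predicted)
    actual_pos = sum(labels)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Toy data for illustration only; NOT outputs of this model.
probs = [0.10, 0.35, 0.62, 0.32, 0.45, 0.80, 0.40, 0.15]
labels = [0, 1, 1, 0, 1, 1, 0, 0]

for threshold in (0.5, 0.3):
    precision, recall = precision_recall_at(threshold, probs, labels)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, lowering the threshold from 0.5 to 0.3 raises recall from 0.50 to 1.00 while precision falls from 1.00 to 0.67. Any threshold for the real model would need to be chosen on a held-out validation set.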
Training
from xgboost import XGBClassifier
model = XGBClassifier(
    n_estimators=200,
    max_depth=4,
    learning_rate=0.05,
    eval_metric="logloss",
    random_state=42
)
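
The configuration above does not appear to reweight the classes. One common option for imbalanced data is XGBoost's `scale_pos_weight` parameter, typically set to the ratio of negative to positive training examples. The sketch below derives that value from the ~22% default rate quoted under Limitations. It is a suggestion to experiment with, not part of the author's training run, and the class counts in the training split are assumed.

```python
# Assumption: the 24,000 training rows share the ~22% default rate.
n_train = 24_000
default_rate = 0.22

n_pos = round(n_train * default_rate)  # estimated defaulters
n_neg = n_train - n_pos                # estimated non-defaulters
scale_pos_weight = n_neg / n_pos       # negatives per positive

# Same settings as the card, plus the hypothetical reweighting.
params = {
    "n_estimators": 200,
    "max_depth": 4,
    "learning_rate": 0.05,
    "eval_metric": "logloss",
    "random_state": 42,
    "scale_pos_weight": scale_pos_weight,
}
print(f"scale_pos_weight = {scale_pos_weight:.2f}")
```

These settings could be passed as `XGBClassifier(**params)`. Whether reweighting actually improves recall on defaulters without an unacceptable loss of precision would need to be checked on a validation split.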
Author
Built by @shrey1905