---
title: Credit Card Fraud Detection App
emoji: π
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: Streamlit template space
---
Credit Card Fraud Detection
Real-time fraud detection using Machine Learning and an interactive Streamlit dashboard.
Live App
[HuggingFace Space link]
Problem
Credit card fraud detection is a highly imbalanced classification problem: fraudulent transactions make up a very small fraction of the data (about 0.17% of the transactions in the Kaggle dataset used here).
The goal is to:
- Detect fraudulent transactions
- Minimize false negatives
- Provide real-time predictions
Dataset
Source: Kaggle (Credit Card Fraud Detection)
Features
The dataset contains:
- Time: seconds elapsed since the first transaction in the dataset
- Amount: transaction value
- V1–V28: PCA-transformed, anonymized features
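The feature list above can be turned into a small parsing helper. This is only a sketch: the column order and the `Class` label column (1 = fraud, 0 = legitimate) are assumed from the public Kaggle CSV, and `parse_row` plus the in-memory sample row are hypothetical names and made-up values used for illustration.

```python
import csv
import io

# Column layout assumed from the public Kaggle CSV: Time, V1..V28, Amount, Class.
FEATURE_COLUMNS = ["Time"] + [f"V{i}" for i in range(1, 29)] + ["Amount"]
LABEL_COLUMN = "Class"  # assumed label column: 1 = fraud, 0 = legitimate


def parse_row(row):
    """Convert one CSV row (a dict of strings) into (features, label)."""
    features = [float(row[name]) for name in FEATURE_COLUMNS]
    label = int(float(row[LABEL_COLUMN]))
    return features, label


# Tiny in-memory CSV with made-up values, standing in for the real file.
header = FEATURE_COLUMNS + [LABEL_COLUMN]
fake_values = ["0"] + ["0.1"] * 28 + ["149.62", "0"]
sample_csv = io.StringIO(",".join(header) + "\n" + ",".join(fake_values) + "\n")

for row in csv.DictReader(sample_csv):
    features, label = parse_row(row)
    print(len(features), label)
```

Keeping the column list in one place makes it easier to feed the model the same 30 features in the same order at training time and in the app.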
Why PCA?
The original transaction data contains sensitive financial information.
To preserve privacy:
- All original features except Time and Amount were transformed using Principal Component Analysis (PCA)
- The resulting components are labeled V1–V28
These components:
- Are not directly interpretable
- Capture the underlying transaction patterns
- Retain the information needed for fraud detection
In other words:
V1–V28 are mutually uncorrelated principal components that capture the variance of the original features while keeping those features anonymous.
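To make the idea concrete, here is a minimal NumPy sketch of PCA on made-up data. The three raw columns are hypothetical stand-ins for the undisclosed original features, and this is not the procedure the dataset authors actually ran, which is not published in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw data: 500 rows, 3 columns, two of them strongly correlated.
base = rng.normal(size=(500, 1))
raw = np.hstack([
    base + 0.1 * rng.normal(size=(500, 1)),
    2.0 * base + 0.1 * rng.normal(size=(500, 1)),
    rng.normal(size=(500, 1)),
])

# PCA via SVD of the mean-centered data.
centered = raw - raw.mean(axis=0)
_, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
components = centered @ vt.T  # columns play the role of V1, V2, V3

# Share of total variance captured by each component, largest first.
explained = singular_values**2 / np.sum(singular_values**2)

# Correlations between different components are numerically zero.
corr = np.corrcoef(components, rowvar=False)
off_diag = corr[~np.eye(3, dtype=bool)]
print(np.round(explained, 3), np.max(np.abs(off_diag)))
```

The projected columns no longer correspond to any single raw column, which is why they cannot be read as named features, yet together they carry the same variance.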
Model
Baseline model trained using:
- Scaled features
- Train/test split
- ROC-AUC evaluation
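The three steps above could look roughly like the following scikit-learn sketch. The source does not say which estimator the baseline uses, so `LogisticRegression` is an assumption, and the data is a synthetic stand-in generated with `make_classification` rather than the Kaggle file.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real data: 30 features, roughly 2% positives.
X, y = make_classification(
    n_samples=5000, n_features=30, n_informative=10,
    weights=[0.98, 0.02], random_state=42,
)

# Stratify so the rare class appears in both splits at the same rate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling inside the pipeline means the scaler is fitted on training data only.
# LogisticRegression is an assumed baseline, not confirmed by the project.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# ROC-AUC needs scores, not hard labels: use the positive-class probability.
fraud_proba = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, fraud_proba)
print(f"ROC-AUC: {auc:.3f}")
```

Putting the scaler in the pipeline is a design choice that avoids leaking test-set statistics into training, which would otherwise inflate the reported score.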
Evaluation Metric
ROC-AUC was used because:
- The dataset is highly imbalanced
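- Accuracy is misleading: a model that labels every transaction as legitimate is still more than 99% accurate
- AUC measures class separability, i.e. how reliably fraudulent transactions receive higher scores than legitimate ones, independent of any decision threshold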
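A small pure-Python illustration of the points above, using toy labels rather than the real dataset. The helper names `roc_auc` and `accuracy` are hypothetical; the AUC is computed directly from its pairwise definition, which is fine for a toy example but too slow for large data.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def accuracy(labels, preds):
    """Fraction of predictions that match the true labels."""
    return sum(y == p for y, p in zip(labels, preds)) / len(labels)


# Toy data: 990 legitimate (0) and 10 fraudulent (1) transactions.
labels = [0] * 990 + [1] * 10

# A "model" that gives every transaction the same score and never flags fraud.
constant_scores = [0.0] * 1000
always_legit = [0] * 1000

print(accuracy(labels, always_legit))     # 0.99
print(roc_auc(labels, constant_scores))   # 0.5
```

The useless model scores 99% accuracy but only 0.5 AUC, the same as random guessing.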
Streamlit App Features
Prediction
- Manual transaction input
- Random transaction generator
- Fraud probability score
- Adjustable decision threshold
- Downloadable prediction report
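The adjustable decision threshold can be illustrated without any UI code. In this sketch, `confusion_at_threshold` is a hypothetical helper and the probabilities are made up; it only shows how moving the threshold trades false negatives for false positives.

```python
def confusion_at_threshold(labels, fraud_probs, threshold):
    """Count TP, FP, FN, TN when flagging fraud at probability >= threshold."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for y, p in zip(labels, fraud_probs):
        flagged = p >= threshold
        if flagged and y == 1:
            counts["tp"] += 1
        elif flagged and y == 0:
            counts["fp"] += 1
        elif not flagged and y == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Made-up model outputs for ten transactions (1 = fraud).
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
probs = [0.05, 0.10, 0.20, 0.35, 0.45, 0.15, 0.30, 0.60, 0.90, 0.55]

for threshold in (0.5, 0.25):
    print(threshold, confusion_at_threshold(labels, probs, threshold))
```

Lowering the threshold from 0.5 to 0.25 catches the fraud case scored 0.30, so false negatives drop from 1 to 0, while false positives rise from 1 to 3.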
Model Insights
- ROC Curve
- Confusion Matrix
- AUC score
- Feature importance (tree-based models)
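The quantities behind these views can be computed with scikit-learn as in the sketch below. It assumes a `RandomForestClassifier` as the tree-based model and uses synthetic data. Neither is confirmed by the project, and the plotting itself (for example with Matplotlib) is left out.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 30 features, roughly 5% positives.
X, y = make_classification(
    n_samples=2000, n_features=30, n_informative=8,
    weights=[0.95, 0.05], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Assumed tree-based model; any estimator with feature_importances_ would do.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
proba = forest.predict_proba(X_test)[:, 1]

# ROC curve points and the area under them.
fpr, tpr, thresholds = roc_curve(y_test, proba)
auc = roc_auc_score(y_test, proba)

# Confusion matrix at a 0.5 threshold, laid out as [[TN, FP], [FN, TP]].
cm = confusion_matrix(y_test, (proba >= 0.5).astype(int))

# Impurity-based importances sum to 1; list the five largest.
importances = forest.feature_importances_
top5 = importances.argsort()[::-1][:5]
print(f"AUC={auc:.3f}", cm.tolist(), top5.tolist())
```

On the real dataset the importances would refer to Time, Amount and V1–V28, so they rank anonymized components rather than interpretable business features.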
Tech Stack
- Python
- Scikit-learn
- Streamlit
- NumPy
- Matplotlib
What I Learned
- Handling imbalanced datasets
- Why ROC-AUC is better than accuracy for fraud detection
- Feature scaling impact
- Threshold tuning for business use-cases
- Building ML dashboards for real-time inference
Future Improvements
- SMOTE / class weighting
- XGBoost / LightGBM
- SHAP explainability
- Real-time API deployment
Author
Beyza Topbas
Machine Learning Portfolio Project