---
title: Credit Card Fraud Detection App
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
- streamlit
pinned: false
short_description: Streamlit template space
---
# 💳 Credit Card Fraud Detection
Real-time fraud detection using Machine Learning and an interactive Streamlit dashboard.
## 🚀 Live App
👉 [HuggingFace Space link]
---
## 📌 Problem
Credit card fraud detection is a highly imbalanced classification problem where fraudulent transactions represent a very small fraction of the data.
The goal is to:
- Detect fraudulent transactions
- Minimize false negatives
- Provide real-time predictions
---
## 📊 Dataset
Source: Kaggle – Credit Card Fraud Detection
### Features
The dataset contains:
- **Time** → seconds elapsed since the first transaction in the dataset
- **Amount** → transaction amount
- **V1 – V28** → PCA-transformed anonymized features
### 🔐 Why PCA?
The original transaction data contains sensitive financial information.
To preserve privacy:
- The original features, except **Time** and **Amount**, were transformed using **Principal Component Analysis (PCA)**
- The resulting components are labeled **V1–V28**
These components:
- Are **not directly interpretable**
- Capture the **underlying transaction patterns**
- Retain the information needed for fraud detection
In other words:
> V1–V28 are orthogonal principal components that capture the variance of the original feature space without exposing the original feature names or raw values.
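
To make the "orthogonal components" idea concrete, here is a minimal NumPy sketch of PCA via SVD. The input matrix is a hypothetical stand-in for raw transaction features (the dataset's real pre-PCA columns are not published), so this illustrates the technique rather than the exact transformation behind V1–V28.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for raw features: 500 rows, 3 columns,
# where the first two columns are strongly correlated.
base = rng.normal(size=(500, 1))
X = np.hstack([
    base + 0.1 * rng.normal(size=(500, 1)),
    2.0 * base + 0.1 * rng.normal(size=(500, 1)),
    rng.normal(size=(500, 1)),
])

# PCA via SVD: center the data, then project onto the right singular vectors.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
components = X_centered @ Vt.T  # plays the role of the V1, V2, ... columns

# The projected columns are uncorrelated: their covariance matrix is diagonal.
cov = np.cov(components, rowvar=False)
off_diagonal = cov - np.diag(np.diag(cov))
print("largest off-diagonal covariance:", np.abs(off_diagonal).max())
print("variance per component:", np.diag(cov))
```

The component columns are linear mixes of every input column, which is why an individual `V` feature cannot be read as a single business attribute.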
---
## 🧠 Model
Baseline model trained using:
- Scaled features
- Train/test split
- ROC-AUC evaluation
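
The three steps above could look roughly like the sketch below. It makes several assumptions: the data is synthetic (generated with `make_classification` instead of loading the Kaggle CSV), and logistic regression is used only as a placeholder, since this README does not name the baseline estimator.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic imbalanced data standing in for the real dataset (about 2% positives).
X, y = make_classification(
    n_samples=5000, n_features=10, weights=[0.98, 0.02], random_state=42
)

# Stratify so the train and test splits keep the same fraud ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling sits inside the pipeline, so it is fit on the training split only.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# ROC-AUC is computed from predicted probabilities, not from hard labels.
fraud_probability = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, fraud_probability)
print(f"ROC-AUC: {auc:.3f}")
```

Keeping the scaler inside the pipeline avoids leaking test-set statistics into training.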
### Evaluation Metric
ROC-AUC was used because:
- The dataset is highly imbalanced
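- Accuracy is misleading: predicting "not fraud" for every transaction already scores very high accuracy
- ROC-AUC measures how well the model ranks fraudulent transactions above legitimate ones across all decision thresholds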
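
A toy illustration of this point, using made-up labels (1,000 transactions, 2 of them fraud) and a hypothetical "model" that never flags anything:

```python
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score

# Toy labels: 1,000 transactions, 2 fraudulent (illustrative numbers only).
y_true = np.zeros(1000, dtype=int)
y_true[:2] = 1

# A "model" that never flags fraud and gives every transaction the same score.
y_pred = np.zeros(1000, dtype=int)
y_score = np.zeros(1000)

accuracy = accuracy_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_score)
print(f"accuracy = {accuracy:.3f}, ROC-AUC = {auc:.3f}")
```

Accuracy comes out at 0.998 even though both frauds are missed, while ROC-AUC is 0.5, the value expected from a model with no ranking ability.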
---
## 🎯 Streamlit App Features
### 🔍 Prediction
- Manual transaction input
- Random transaction generator
- Fraud probability score
- Adjustable decision threshold
- Downloadable prediction report
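
The adjustable decision threshold can be pictured with a small helper like the one below. The function name `classify` and the labels are hypothetical and are not taken from the app's source.

```python
def classify(fraud_probability: float, threshold: float = 0.5) -> str:
    """Label a transaction by comparing its fraud probability to the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return "Fraud" if fraud_probability >= threshold else "Legitimate"


score = 0.30  # example probability returned by a model
print(classify(score, threshold=0.5))  # Legitimate
print(classify(score, threshold=0.2))  # Fraud
```

Lowering the threshold flags more transactions, which reduces false negatives at the cost of more false alarms.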
### 📊 Model Insights
- ROC Curve
- Confusion Matrix
- AUC score
- Feature importance (tree-based models)
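
As a rough sketch of how the confusion matrix relates to the decision threshold, the example below uses ten hand-made labels and scores (illustrative values, not model output) and scikit-learn's `confusion_matrix`. The helper `confusion_counts` is a hypothetical name.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Ten toy transactions: true labels and made-up fraud probabilities.
y_true = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 0])
y_score = np.array([0.05, 0.10, 0.20, 0.35, 0.60, 0.15, 0.90, 0.45, 0.30, 0.02])


def confusion_counts(threshold: float) -> tuple:
    """Return (tn, fp, fn, tp) after thresholding the scores."""
    y_pred = (y_score >= threshold).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return int(tn), int(fp), int(fn), int(tp)


print(confusion_counts(0.50))  # (6, 1, 2, 1): two frauds missed
print(confusion_counts(0.25))  # (5, 2, 0, 3): no frauds missed, one extra false alarm
```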
---
## ⚙️ Tech Stack
- Python
- Scikit-learn
- Streamlit
- NumPy
- Matplotlib
---
## 🧠 What I Learned
- Handling imbalanced datasets
- Why ROC-AUC is better than accuracy for fraud detection
- Feature scaling impact
- Threshold tuning for business use-cases
- Building ML dashboards for real-time inference
---
## 🚀 Future Improvements
- SMOTE / class weighting
- XGBoost / LightGBM
- SHAP explainability
- Real-time API deployment
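
For the class-weighting idea, a small sketch of what scikit-learn's `"balanced"` heuristic computes, using made-up class counts (990 legitimate, 10 fraud). This is not part of the current app; it only shows how the weights would be derived.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Illustrative labels: 990 legitimate transactions and 10 frauds.
y = np.array([0] * 990 + [1] * 10)

# "balanced" weights are n_samples / (n_classes * class_count).
weights = compute_class_weight(
    class_weight="balanced", classes=np.array([0, 1]), y=y
)
print(dict(zip([0, 1], weights)))
```

Here the fraud class gets a weight of 50 and the legitimate class about 0.505, so each missed fraud costs far more during training. Estimators such as `LogisticRegression` apply the same heuristic when given `class_weight="balanced"`.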
---
## 👀 Author
Beyza Topbas
Machine Learning Portfolio Project