| --- |
| title: Credit Card Fraud Detection App |
| emoji: π |
| colorFrom: red |
| colorTo: red |
| sdk: docker |
| app_port: 8501 |
| tags: |
| - streamlit |
| pinned: false |
| short_description: Streamlit template space |
| --- |
| # Credit Card Fraud Detection |
|
|
| Real-time fraud detection using Machine Learning and an interactive Streamlit dashboard. |
|
|
| ## Live App |
| [HuggingFace Space link] |
|
|
| --- |
|
|
| ## Problem |
|
|
| Credit card fraud detection is a highly imbalanced classification problem: in the Kaggle dataset used here, only 492 of 284,807 transactions (about 0.17%) are fraudulent. |
|
|
| The goal is to: |
|
|
| - Detect fraudulent transactions |
| - Minimize false negatives |
| - Provide real-time predictions |
|
|
| --- |
|
|
| ## Dataset |
|
|
| Source: Kaggle, Credit Card Fraud Detection dataset |
|
|
| ### Features |
|
|
| The dataset contains: |
|
|
| - **Time**: seconds elapsed since the first transaction in the dataset |
| - **Amount**: transaction value |
| - **V1–V28**: PCA-transformed, anonymized features |
|
|
| ### Why PCA? |
|
|
| The original transaction data contains sensitive financial information. |
|
|
| To preserve privacy: |
|
|
| - Most original features were transformed using **Principal Component Analysis (PCA)**; only **Time** and **Amount** were left untransformed |
| - The resulting components are labeled **V1–V28** |
|
|
| These components: |
|
|
| - Are **not directly interpretable** |
| - Capture the **underlying transaction patterns** |
| - Retain the information needed for fraud detection |
|
|
| In other words: |
|
|
| > V1–V28 are uncorrelated principal components that capture the variance of the original features without exposing what those features were. |
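To make the idea concrete, here is a minimal NumPy sketch of PCA via SVD. The three "raw" columns are made-up stand-ins for the undisclosed original features, so this only illustrates the property described above (uncorrelated components ordered by variance), not how the dataset's publishers actually produced V1–V28.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for raw transaction features: three columns,
# two of which are strongly correlated through a shared latent factor.
latent = rng.normal(size=(500, 1))
raw = np.hstack([
    latent + 0.1 * rng.normal(size=(500, 1)),
    2.0 * latent + 0.1 * rng.normal(size=(500, 1)),
    rng.normal(size=(500, 1)),
])

# PCA via SVD of the centered data matrix.
centered = raw - raw.mean(axis=0)
_, singular_values, components = np.linalg.svd(centered, full_matrices=False)
scores = centered @ components.T  # columns play the role of V1, V2, V3

# The covariance of the projected data is diagonal: components are uncorrelated.
cov = np.cov(scores, rowvar=False)
off_diagonal = cov - np.diag(np.diag(cov))
explained_ratio = singular_values**2 / np.sum(singular_values**2)

print(np.round(explained_ratio, 3))  # share of variance per component
print(np.abs(off_diagonal).max())    # numerically zero
```

The projected columns no longer correspond to any single raw column, which is why V1–V28 cannot be read as named transaction attributes.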
|
|
| --- |
|
|
| ## Model |
|
|
| Baseline model trained using: |
|
|
| - Scaled features |
| - Train/test split |
| - ROC-AUC evaluation |
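The three steps above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the project's actual training script: the README does not name the baseline estimator, so `LogisticRegression` is a placeholder, and the data is synthetic (`make_classification`) rather than the Kaggle CSV.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Kaggle data: 30 features, roughly 1% positives.
X, y = make_classification(
    n_samples=5000, n_features=30, n_informative=10,
    weights=[0.99, 0.01], flip_y=0.0, random_state=42,
)

# Stratify so the rare class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fitting the scaler inside the pipeline keeps test-set statistics
# out of training. LogisticRegression is an assumed baseline.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Score with the predicted fraud probability, not the hard 0/1 label.
fraud_probability = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, fraud_probability)
print(f"ROC-AUC: {auc:.3f}")
```

Passing probabilities rather than predicted labels to `roc_auc_score` matters: the metric evaluates the ranking of scores, which hard labels would collapse to two values.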
|
|
| ### Evaluation Metric |
|
|
| ROC-AUC was used because: |
|
|
| - The dataset is highly imbalanced |
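| - Accuracy is misleading: a model that labels every transaction as legitimate still scores above 99% |
| - ROC-AUC measures how well the model ranks fraudulent transactions above legitimate ones, independent of any single threshold |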
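A small pure-Python example shows the gap between the two metrics. The labels are illustrative (2 frauds in 1,000 transactions), and `roc_auc` here is a hand-written pairwise version of the metric for clarity, not the implementation the app uses.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """Probability that a random fraud case outscores a random legitimate one.

    Ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Illustrative labels: 2 frauds among 1000 transactions (0.2%).
y_true = [1, 1] + [0] * 998

# A "model" that gives every transaction the same score and flags nothing.
constant_scores = [0.0] * 1000
always_legit = [0] * 1000

print(accuracy(y_true, always_legit))     # 0.998
print(roc_auc(y_true, constant_scores))   # 0.5
```

The useless model looks excellent by accuracy and no better than chance by ROC-AUC, which is the behavior wanted from an evaluation metric here.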
|
|
| --- |
|
|
| ## Streamlit App Features |
|
|
| ### Prediction |
|
|
| - Manual transaction input |
| - Random transaction generator |
| - Fraud probability score |
| - Adjustable decision threshold |
| - Downloadable prediction report |
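The adjustable threshold is what turns a fraud probability into a decision. The sketch below uses made-up scores and a hypothetical helper, `confusion_counts`; whether the app flags at `>=` or `>` the threshold is an assumption.

```python
def confusion_counts(y_true, fraud_probs, threshold):
    """Count outcomes when probabilities >= threshold are flagged as fraud."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for label, prob in zip(y_true, fraud_probs):
        flagged = prob >= threshold
        if label == 1:
            counts["tp" if flagged else "fn"] += 1
        else:
            counts["fp" if flagged else "tn"] += 1
    return counts


# Made-up scores for eight transactions; the first three are fraud.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
fraud_probs = [0.92, 0.55, 0.30, 0.40, 0.20, 0.10, 0.05, 0.02]

for threshold in (0.5, 0.25):
    print(threshold, confusion_counts(y_true, fraud_probs, threshold))
```

At 0.5 one fraud is missed with no false alarms; at 0.25 all three frauds are caught at the cost of one false alarm. Lowering the threshold is the lever for the "minimize false negatives" goal.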
|
|
| ### Model Insights |
|
|
| - ROC Curve |
| - Confusion Matrix |
| - AUC score |
| - Feature importance (tree-based models) |
|
|
| --- |
|
|
| ## Tech Stack |
|
|
| - Python |
| - Scikit-learn |
| - Streamlit |
| - NumPy |
| - Matplotlib |
|
|
| --- |
|
|
| ## What I Learned |
|
|
| - Handling imbalanced datasets |
| - Why ROC-AUC is more informative than accuracy for fraud detection |
| - How feature scaling affects model performance |
| - Threshold tuning for business use cases |
| - Building ML dashboards for real-time inference |
|
|
| --- |
|
|
| ## Future Improvements |
|
|
| - SMOTE / class weighting |
| - XGBoost / LightGBM |
| - SHAP explainability |
| - Real-time API deployment |
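As a sketch of the class-weighting item, the "balanced" heuristic weights each class by `n_samples / (n_classes * class_count)`, the same formula scikit-learn applies for `class_weight="balanced"`. The helper name and the label counts below are illustrative only.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    return {cls: n_samples / (len(counts) * n) for cls, n in counts.items()}


# Illustrative imbalance: 998 legitimate (0) and 2 fraudulent (1) labels.
labels = [0] * 998 + [1] * 2
weights = balanced_class_weights(labels)
print(weights)
```

With these weights each class contributes the same total weight to the loss, so errors on the rare fraud class are no longer drowned out by the majority class.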
|
|
| --- |
|
|
| ## Author |
|
|
| Beyza Topbas |
|
|
| Machine Learning Portfolio Project |