metadata

title: Credit Card Fraud Detection App
emoji: 🚀
colorFrom: red
colorTo: red
sdk: docker
app_port: 8501
tags:
  - streamlit
pinned: false
short_description: Streamlit template space

💳 Credit Card Fraud Detection

Real-time fraud detection using Machine Learning and an interactive Streamlit dashboard.

🚀 Live App

👉 [HuggingFace Space link]

📌 Problem

Credit card fraud detection is a highly imbalanced classification problem where fraudulent transactions represent a very small fraction of the data.

The goal is to:

Detect fraudulent transactions
Minimize false negatives
Provide real-time predictions

📊 Dataset

Source: Kaggle – Credit Card Fraud Detection

Features

The dataset contains:

Time → seconds elapsed since the first transaction in the dataset
Amount → transaction amount
V1–V28 → PCA-transformed anonymized features

🔐 Why PCA?

The original transaction data contains sensitive financial information.

To preserve privacy:

All original features except Time and Amount were transformed using Principal Component Analysis (PCA)
The resulting components are labeled V1–V28

These components:

Are not directly interpretable
Capture the underlying transaction patterns
Retain the information needed for fraud detection

In other words:

V1–V28 are orthogonal principal components that capture the variance of the original feature space without exposing the original features.
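To make the idea concrete, here is a minimal NumPy sketch of how PCA yields orthogonal, uncorrelated components. The `raw` matrix is a hypothetical stand-in, since the real pre-PCA transaction features are not public, and the dataset's actual preprocessing may differ in detail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the raw (non-public) transaction features:
# 500 rows, 4 correlated columns.
raw = rng.normal(size=(500, 4)) @ rng.normal(size=(4, 4))

# PCA via SVD of the centered data. Rows of `directions` are the principal
# directions, ordered from largest to smallest explained variance.
centered = raw - raw.mean(axis=0)
_, _, directions = np.linalg.svd(centered, full_matrices=False)

# Project onto the principal directions; the columns play the role of V1, V2, ...
components = centered @ directions.T

# The directions are orthonormal, and the components are mutually uncorrelated.
gram = directions @ directions.T
cov = np.cov(components, rowvar=False)
off_diagonal = cov - np.diag(np.diag(cov))

print(np.allclose(gram, np.eye(4)))
print(np.abs(off_diagonal).max() < 1e-8)
```

Because each component mixes all original columns, an individual value of V1 or V2 cannot be read back as a single original feature, which is why the components are not directly interpretable.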

🧠 Model

Baseline model trained using:

Scaled features
Train/test split
ROC-AUC evaluation
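The three steps above could be wired together roughly as follows. This is a sketch under assumptions, not the project's actual training code: the README does not name the baseline estimator, so `LogisticRegression` is a stand-in, and the data is synthetic (generated with `make_classification`) rather than the Kaggle CSV.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the Kaggle data: about 2% of rows are "fraud".
X, y = make_classification(
    n_samples=5000,
    n_features=10,
    n_informative=5,
    n_clusters_per_class=1,
    weights=[0.98, 0.02],
    class_sep=2.0,
    flip_y=0.0,
    random_state=0,
)

# Stratify so the train and test splits keep the same fraud ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scaling lives inside the pipeline so it is fit on the training split only.
# LogisticRegression is an assumed baseline, not confirmed by the README.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

# Score with the predicted fraud probability, not the hard 0/1 label.
fraud_probability = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, fraud_probability)
print(f"ROC-AUC: {auc:.3f}")
```

Keeping the scaler inside the pipeline avoids leaking test-set statistics into training, which would otherwise inflate the reported score.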

Evaluation Metric

ROC-AUC was used because:

The dataset is highly imbalanced
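Accuracy is misleading: a model that labels every transaction as legitimate still scores close to 100%
AUC measures class separability: the probability that a randomly chosen fraud receives a higher score than a randomly chosen legitimate transaction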
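A small pure-Python illustration of these points, using made-up numbers (1,000 transactions, 2 of them fraudulent) and a hypothetical "model" that assigns every transaction the same score. The `roc_auc` helper computes AUC from its pairwise-ranking definition; it is written for clarity, not speed.

```python
def accuracy(labels, predictions):
    """Fraction of predictions that match the true labels."""
    correct = sum(y == p for y, p in zip(labels, predictions))
    return correct / len(labels)


def roc_auc(labels, scores):
    """ROC-AUC as the share of (fraud, legitimate) pairs ranked correctly.

    Ties count as half a win, matching the usual rank-based definition.
    """
    fraud = [s for y, s in zip(labels, scores) if y == 1]
    legit = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((f > l) + 0.5 * (f == l) for f in fraud for l in legit)
    return wins / (len(fraud) * len(legit))


# Made-up data: 998 legitimate transactions and 2 frauds.
labels = [0] * 998 + [1] * 2

# A useless model: the same score for everything, so it never flags fraud.
scores = [0.0] * len(labels)
predictions = [0] * len(labels)

acc = accuracy(labels, predictions)
auc = roc_auc(labels, scores)
print(acc)  # 0.998
print(auc)  # 0.5
```

The accuracy looks excellent even though every fraud is missed, while the AUC of 0.5 correctly reports that the scores carry no ranking information.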

🎯 Streamlit App Features

🔍 Prediction

Manual transaction input
Random transaction generator
Fraud probability score
Adjustable decision threshold
Downloadable prediction report
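The adjustable threshold is what turns a fraud probability into a decision. The sketch below uses ten made-up probabilities and labels (not output from the app) to show how lowering the threshold trades missed frauds for extra false alarms; `confusion_counts` is a hypothetical helper name.

```python
def confusion_counts(labels, probabilities, threshold):
    """Count outcomes when flagging every probability >= threshold as fraud."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for y, p in zip(labels, probabilities):
        flagged = p >= threshold
        if flagged and y == 1:
            counts["tp"] += 1
        elif flagged and y == 0:
            counts["fp"] += 1
        elif not flagged and y == 1:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


# Made-up example: three frauds (label 1) among ten transactions.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
probabilities = [0.02, 0.10, 0.05, 0.30, 0.01, 0.15, 0.90, 0.35, 0.25, 0.60]

default_counts = confusion_counts(labels, probabilities, threshold=0.5)
lowered_counts = confusion_counts(labels, probabilities, threshold=0.2)

print(default_counts)  # {'tp': 1, 'fp': 1, 'fn': 2, 'tn': 6}
print(lowered_counts)  # {'tp': 3, 'fp': 2, 'fn': 0, 'tn': 5}
```

In this toy case, moving the threshold from 0.5 to 0.2 removes both false negatives at the cost of one additional false positive, which is the trade-off the goal of minimizing false negatives implies.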

📊 Model Insights

ROC Curve
Confusion Matrix
AUC score
Feature importance (tree-based models)

⚙️ Tech Stack

Python
Scikit-learn
Streamlit
NumPy
Matplotlib

🧠 What I Learned

Handling imbalanced datasets
Why ROC-AUC is better than accuracy for fraud detection
Feature scaling impact
Threshold tuning for business use-cases
Building ML dashboards for real-time inference

🚀 Future Improvements

SMOTE / class weighting
XGBoost / LightGBM
SHAP explainability
Real-time API deployment
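As a rough idea of what class weighting would involve, the sketch below computes "balanced" class weights with NumPy using the formula `n_samples / (n_classes * count_per_class)`, which is the heuristic scikit-learn documents for `class_weight="balanced"`. The label counts are made up, and nothing here is part of the current app.

```python
import numpy as np

# Made-up labels: 990 legitimate transactions (0) and 10 frauds (1).
y = np.array([0] * 990 + [1] * 10)

# "Balanced" weights: each class contributes equally to the loss overall,
# so the rare class gets a proportionally larger per-sample weight.
counts = np.bincount(y)
weights = len(y) / (len(counts) * counts)

print(weights[1] / weights[0])  # each fraud counts roughly 99x as much
```

With scikit-learn estimators that accept it, passing `class_weight="balanced"` applies the same weighting during training without resampling the data.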

👤 Author

Beyza Topbas

Machine Learning Portfolio Project