---
tags:
- fraud-detection
- credit-card
- lightgbm
- binary-classification
library_name: sklearn
---

# Credit Card Fraud Classifier (LightGBM)

## Model Description

This is a LightGBM-based binary classifier trained to detect fraudulent credit card transactions.

## Dataset

- **Source**: ULB/Kaggle Credit Card Fraud Detection dataset
- **Timeframe**: 2 days of transactions
- **Positive Rate**: 0.172% (492 of 284,807 transactions; highly imbalanced)
- **Features**: Amount plus V1-V28 (V1-V28 are PCA-transformed; Amount is the raw transaction amount)

## Model Details

- **Algorithm**: LightGBM Classifier
- **Task**: Binary classification (Fraud vs Non-fraud)
- **Threshold**: Calibrated so that the false positive rate (FPR) stays at or below 0.1%
- **Input Features**: 29 features (Amount + V1 through V28)
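
The card does not state how the 0.1% FPR cap was turned into a threshold. One plausible approach, sketched here under that assumption, is to score a held-out set of legitimate transactions and pick the lowest threshold that flags at most 0.1% of them. The function name `threshold_at_fpr_cap` and its arguments are illustrative and not part of this repository.

```python
import math


def threshold_at_fpr_cap(negative_scores, max_fpr=0.001):
    """Lowest threshold t such that at most max_fpr of negatives score >= t.

    negative_scores: predicted fraud probabilities for known non-fraud rows
    (hypothetical validation data, not shipped with this model).
    """
    scores = sorted(negative_scores, reverse=True)
    if not scores:
        raise ValueError("need at least one negative score")
    # Number of false positives the cap permits (rounded down to stay under it).
    allowed = math.floor(max_fpr * len(scores))
    if allowed >= len(scores):
        return scores[-1]  # cap is loose enough to flag everything
    # scores[allowed] is the highest score that must NOT be flagged; step just
    # above it so that ties at that value are excluded as well.
    return math.nextafter(scores[allowed], math.inf)


# Toy example: 1,000 evenly spaced scores, so a 0.1% cap permits one false positive.
toy_scores = [i / 1000 for i in range(1000)]
t = threshold_at_fpr_cap(toy_scores, max_fpr=0.001)
false_positives = sum(score >= t for score in toy_scores)
```

Rounding the allowed count down keeps the FPR measured on the validation set at or below the cap. The FPR on new data can still differ, which is why the threshold should be re-checked when the data changes.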

## Usage

```python
import json

import joblib
import pandas as pd
from huggingface_hub import hf_hub_download

REPO_ID = "yahiaehab10/fraud-ccf-lightgbm"

# Download and load the fitted pipeline
model_path = hf_hub_download(repo_id=REPO_ID, filename="pipeline.pkl")
pipeline = joblib.load(model_path)

# Download the calibrated decision threshold
threshold_path = hf_hub_download(repo_id=REPO_ID, filename="threshold.json")
with open(threshold_path) as f:
    threshold = json.load(f)["threshold"]

# Make predictions
# X is a pandas DataFrame with columns: Amount, V1, V2, ..., V28
probabilities = pipeline.predict_proba(X)[:, 1]  # probability of the fraud class
predictions = (probabilities >= threshold).astype(int)  # 1 = fraud, 0 = non-fraud
```
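
The snippet above expects `X` to carry the columns Amount and V1 through V28. A small helper like the following can catch missing columns before calling `predict_proba`. It is only a sketch: `FEATURE_COLUMNS` and `missing_features` are hypothetical names, and the column order is assumed to follow the comment in the usage code.

```python
# Expected input columns, in the order listed in the usage comment (assumed order).
FEATURE_COLUMNS = ["Amount"] + [f"V{i}" for i in range(1, 29)]


def missing_features(columns):
    """Return the expected feature names that are absent from `columns`."""
    present = set(columns)
    return [name for name in FEATURE_COLUMNS if name not in present]


# With a pandas DataFrame `df` (for example, loaded from the dataset CSV),
# the model input could then be built as:
#   assert not missing_features(df.columns)
#   X = df[FEATURE_COLUMNS]
example_columns = ["Time", "Amount"] + [f"V{i}" for i in range(1, 28)]  # V28 left out
print(missing_features(example_columns))  # ['V28']
```

Selecting `df[FEATURE_COLUMNS]` also drops extra columns such as the dataset's `Time` and `Class`, which are not among the 29 input features.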

## Performance

The model is tuned to keep false positives low (the decision threshold caps FPR at 0.1%) while aiming for high recall on fraud cases.
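
To check this trade-off on labeled hold-out data, recall and FPR at a given threshold can be computed directly from the predicted probabilities. The following is a minimal sketch: the function `recall_and_fpr` is illustrative, and it assumes label 1 means fraud and that a transaction is flagged when its probability is at or above the threshold.

```python
def recall_and_fpr(y_true, probabilities, threshold):
    """Return (recall, false positive rate) for the rule `p >= threshold`."""
    tp = fp = fn = tn = 0
    for label, p in zip(y_true, probabilities):
        flagged = p >= threshold
        if label == 1:
            if flagged:
                tp += 1  # fraud caught
            else:
                fn += 1  # fraud missed
        else:
            if flagged:
                fp += 1  # legitimate transaction flagged
            else:
                tn += 1  # legitimate transaction passed
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    return recall, fpr


# Toy data: 2 fraud rows and 4 legitimate rows.
labels = [1, 1, 0, 0, 0, 0]
scores = [0.9, 0.2, 0.8, 0.1, 0.1, 0.1]
recall, fpr = recall_and_fpr(labels, scores, threshold=0.5)
print(recall, fpr)  # 0.5 0.25
```

With a positive rate this low, accuracy alone says little, so reporting recall together with FPR at the chosen threshold gives a clearer picture.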

## Limitations

- **Educational purposes only**: not intended for production use
- Trained on historical data, so it may not generalize to future fraud patterns
- Highly imbalanced dataset, so the decision threshold requires careful calibration

## License

Educational use only. Please refer to the original dataset license.