---
tags:
- fraud-detection
- credit-card
- lightgbm
- binary-classification
library_name: sklearn
---

# Credit Card Fraud Classifier (LightGBM)

## Model Description

This is a LightGBM-based binary classifier trained to detect fraudulent credit card transactions.

## Dataset

- **Source**: ULB/Kaggle Credit Card Fraud Dataset
- **Timeframe**: 2 days of transactions
- **Positive Rate**: 0.172% (highly imbalanced)
- **Features**: Amount + V1-V28 (PCA-transformed features)

## Model Details

- **Algorithm**: LightGBM Classifier
- **Task**: Binary classification (Fraud vs Non-fraud)
- **Threshold**: Calibrated to a 0.1% FPR (False Positive Rate) cap
- **Input Features**: 29 features (Amount + V1 through V28)

## Usage

```python
import json

import joblib
import pandas as pd
from huggingface_hub import hf_hub_download

# Download the fitted pipeline
model_path = hf_hub_download(repo_id="yahiaehab10/fraud-ccf-lightgbm", filename="pipeline.pkl")
pipeline = joblib.load(model_path)

# Download the calibrated decision threshold
threshold_path = hf_hub_download(repo_id="yahiaehab10/fraud-ccf-lightgbm", filename="threshold.json")
with open(threshold_path) as f:
    threshold = json.load(f)["threshold"]

# Make predictions
# X is a pandas DataFrame you supply, with columns: Amount, V1, V2, ..., V28
probabilities = pipeline.predict_proba(X)[:, 1]
predictions = (probabilities >= threshold).astype(int)
```

## Performance

The model is tuned to keep false positives low (the decision threshold caps the false positive rate at 0.1%) while retaining as much recall on fraud cases as possible.

## Limitations

- **Educational purposes only** - Not intended for production use
- Trained on historical data - may not generalize to future fraud patterns
- Highly imbalanced dataset - requires careful threshold calibration

## License

Educational use only. Please refer to the original dataset license.
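
## Threshold Calibration Sketch

The model details mention a threshold calibrated to a 0.1% FPR cap, but this card does not describe how that calibration was done. The sketch below shows one common way to pick such a threshold from held-out validation scores. It is an illustration under stated assumptions, not the procedure used to produce `threshold.json`. The function name `threshold_at_fpr_cap` and the synthetic validation data are hypothetical.

```python
import numpy as np


def threshold_at_fpr_cap(y_true, scores, max_fpr=0.001):
    """Smallest threshold t such that flagging `scores >= t` keeps FPR <= max_fpr."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)

    # Scores of legitimate (label 0) transactions, highest first.
    neg_scores = np.sort(scores[y_true == 0])[::-1]
    if neg_scores.size == 0:
        raise ValueError("need at least one negative example to estimate FPR")

    # Largest number of negatives that may be flagged without exceeding the cap.
    max_false_positives = int(np.floor(max_fpr * neg_scores.size))
    if max_false_positives >= neg_scores.size:
        return float("-inf")  # cap is loose enough to flag everything

    # Step just above the (max_false_positives + 1)-th highest negative score,
    # so that score and any ties with it fall below the threshold.
    return float(np.nextafter(neg_scores[max_false_positives], np.inf))


# Synthetic validation data, for illustration only (not the real dataset).
rng = np.random.default_rng(0)
y_val = (rng.random(20_000) < 0.002).astype(int)
val_scores = np.clip(rng.normal(0.1 + 0.6 * y_val, 0.1), 0.0, 1.0)

calibrated_threshold = threshold_at_fpr_cap(y_val, val_scores, max_fpr=0.001)
flagged = val_scores >= calibrated_threshold
fpr = flagged[y_val == 0].mean()
recall = flagged[y_val == 1].mean()
print(f"threshold={calibrated_threshold:.4f}  FPR={fpr:.5f}  recall={recall:.3f}")
```

With real data, `y_val` and `val_scores` would come from a validation split that the model was not trained on, for example `pipeline.predict_proba(X_val)[:, 1]`. With a 0.172% positive rate, the FPR estimate rests on a small number of flagged negatives, so the cap is only as reliable as the validation set is large.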