title: SME Credit Risk AI
emoji: 🏦
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.55.0
app_file: app.py
pinned: true
license: mit

🏦 AI-Powered SME Credit Risk Assessment

Final Project: AI Engineering Bootcamp Batch 10 | Author: 1na37

Explainable · Prescriptive · Multilingual Credit Intelligence


✨ Features

| Feature | Description |
|---|---|
| 🤖 Ensemble ML | XGBoost + LightGBM + Random Forest (soft voting, 2:2:1 weights) |
| 🔍 XAI Engine | SHAP Waterfall Plot per applicant: why each decision |
| 💰 Risk Metrics | PD Score + Expected Loss (PD × LGD × EAD, Basel II) |
| 🎮 What-If Sim | Interactive slider simulation, real-time score update |
| 💬 LLM Narrative | Gemini → Groq → Static fallback (auto chain) |
| 🌐 Multilingual | Bahasa Indonesia / English / Hindi |
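The Expected Loss formula in the Risk Metrics row can be illustrated with a few lines of Python. This is a minimal sketch, not the app's actual code: the function name `expected_loss`, its parameter names, and the applicant numbers are assumptions made for illustration. The 40% LGD default is taken from the architecture diagram further down.

```python
def expected_loss(pd_score: float, ead: float, lgd: float = 0.40) -> float:
    """Expected Loss = PD x LGD x EAD.

    pd_score: probability of default, in [0, 1]
    ead:      exposure at default (loan amount still at risk)
    lgd:      loss given default as a fraction; 0.40 matches the 40% in this README
    """
    if not 0.0 <= pd_score <= 1.0:
        raise ValueError("pd_score must be a probability in [0, 1]")
    if not 0.0 <= lgd <= 1.0:
        raise ValueError("lgd must be a fraction in [0, 1]")
    if ead < 0:
        raise ValueError("ead must be non-negative")
    return pd_score * lgd * ead


# Hypothetical applicant: 12% default probability on an exposure of 50,000,000.
el = expected_loss(pd_score=0.12, ead=50_000_000)
print(f"Expected loss: {el:,.0f}")
```

Keeping LGD as a parameter with a default makes it easy to rerun the calculation under a different recovery assumption.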

📊 Dataset

  • Source: German Credit Dataset (UCI ML Repository, id=144)
  • Rows: 1,000 | Original features: 20 | After augmentation: 28
  • Augmentation: Synthetic SME-relevant features added via Python:
    • digital_presence_score, has_social_media, ecommerce_monthly_volume
    • has_npwp, has_siup, business_age_years, monthly_cash_flow, num_employees
  • Class balance: SMOTE applied on training set only (no data leakage)
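To make the "training set only" point concrete, here is a small pure-Python sketch. It is an assumption-laden illustration, not the project's pipeline: `smote_like` is a simplified stand-in for SMOTE written for this example (the project itself presumably uses the imbalanced-learn implementation), and the toy data is invented. The key step is that the split happens before any synthetic rows are created, so the test rows never influence them.

```python
import random
from typing import List, Sequence

Row = List[float]


def smote_like(minority: Sequence[Row], n_new: int, k: int = 5, seed: int = 0) -> List[Row]:
    """Create n_new synthetic rows by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    synthetic: List[Row] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to the base row.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: sum((a - b) ** 2 for a, b in zip(base, row)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: (features, label), label 1 = bad credit (minority class).
data = [([float(i), float(i % 3)], 1 if i % 4 == 0 else 0) for i in range(20)]

# Split BEFORE oversampling so no synthetic row is derived from test data.
train, test = data[:16], data[16:]

train_bad = [x for x, y in train if y == 1]
train_good = [x for x, y in train if y == 0]
new_rows = smote_like(train_bad, n_new=len(train_good) - len(train_bad))
train_balanced = train + [(x, 1) for x in new_rows]

print(len(train_balanced), len(test))
```

The test split keeps its original class ratio, which is what an honest evaluation needs.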

🏗️ Architecture

Input (28 features)
    ↓
One-Hot Encoding + StandardScaler + SMOTE (train only)
    ↓
┌─────────────────────────────────────┐
│  Soft Voting Ensemble (2:2:1)       │
│  ├── XGBoost  (n=300, lr=0.05)      │
│  ├── LightGBM (n=300, lr=0.05)      │
│  └── RandomForest (n=200)           │
└─────────────────────────────────────┘
    ↓
PD Score → EL = PD × LGD(40%) × EAD
    ↓
SHAP TreeExplainer → Waterfall Plot
    ↓
LLM Narrative (Gemini → Groq → Fallback)
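The soft-voting step in the diagram reduces to a weighted average of the three models' predicted default probabilities. The sketch below shows that arithmetic in plain Python. The function `soft_vote`, the dictionary keys, and the per-model probabilities are hypothetical, and the README does not say how the ensemble is implemented. scikit-learn's `VotingClassifier` with `voting="soft"` and `weights=[2, 2, 1]` computes the same kind of average.

```python
from typing import Dict

# Weights from the diagram: XGBoost 2, LightGBM 2, Random Forest 1.
WEIGHTS: Dict[str, float] = {"xgboost": 2.0, "lightgbm": 2.0, "random_forest": 1.0}


def soft_vote(probs: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted average of each model's predicted probability of default."""
    missing = set(weights) - set(probs)
    if missing:
        raise KeyError(f"missing model outputs: {sorted(missing)}")
    total = sum(weights.values())
    return sum(weights[name] * probs[name] for name in weights) / total


# Hypothetical per-model default probabilities for one applicant.
model_probs = {"xgboost": 0.30, "lightgbm": 0.20, "random_forest": 0.50}
pd_score = soft_vote(model_probs)
print(round(pd_score, 4))
```

With 2:2:1 weights the two boosted models contribute 80% of the final PD score, so the Random Forest acts mainly as a tie-breaker.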

📈 Evaluation

  • ROC-AUC: primary metric (industry target: >0.75)
  • KS Statistic: separation between good/bad credit distributions
  • F1-Score: balanced precision/recall for imbalanced data
  • 5-Fold Cross-Validation: SMOTE inside ImbPipeline (no leakage)

"The model shows perfect performance because the synthetic features were designed to correlate strongly with the target, as a simulation of ideal conditions. In a real deployment, these features would be replaced with actual data from external sources such as e-commerce APIs, tax data, and bank statements."

Evaluation Report
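The KS statistic listed above is the largest gap between the cumulative score distributions of bad and good applicants. The following is a minimal pure-Python sketch of that definition, with hypothetical scores and labels. It is not the project's evaluation code. `scipy.stats.ks_2samp` is a library alternative that takes the two score samples directly.

```python
from typing import Sequence


def ks_statistic(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Max gap between the cumulative score distributions of bad (1) and good (0) applicants."""
    n_bad = sum(1 for y in labels if y == 1)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("need at least one good and one bad applicant")
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        # Consume every applicant tied at this score before comparing the two CDFs.
        j = i
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        best = max(best, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return best


# Hypothetical PD scores; label 1 = bad credit (default), 0 = good credit.
scores = [0.10, 0.20, 0.35, 0.40, 0.65, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1]
ks = ks_statistic(scores, labels)
print(round(ks, 2))
```

A value near 1 means the scores separate the two groups almost completely; a value near 0 means the distributions overlap.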

🚀 Run Locally

git clone https://huggingface.co/spaces/1na37/sme-credit-risk
cd sme-credit-risk
pip install -r requirements.txt

# Train model first (run train_colab.py in Google Colab)
# Download model_export.zip, extract to model/ folder

streamlit run app.py

🔑 API Keys (Optional)

In HF Spaces → Settings → Repository Secrets:

The app works without API keys, using the static fallback template.
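The Gemini → Groq → static fallback behaviour can be sketched as an ordered list of providers tried in turn. Everything named here is hypothetical: `generate_narrative`, `static_narrative`, the stub provider functions, and the 0.5 risk threshold are illustrative assumptions. The real app's provider calls and template text are not shown in this README.

```python
from typing import Callable, Sequence, Tuple

Provider = Tuple[str, Callable[[str], str]]


def static_narrative(pd_score: float) -> str:
    """Template used when no LLM provider is available (threshold 0.5 is an assumption)."""
    level = "high" if pd_score >= 0.5 else "low"
    return f"Estimated default probability is {pd_score:.0%} ({level} risk)."


def generate_narrative(prompt: str, pd_score: float,
                       providers: Sequence[Provider]) -> Tuple[str, str]:
    """Try each provider in order; fall back to the static template if all fail."""
    for name, call in providers:
        try:
            text = call(prompt)
            if text and text.strip():
                return name, text
        except Exception:
            # Missing key, network error, rate limit: move on to the next provider.
            continue
    return "static", static_narrative(pd_score)


# Stub providers standing in for real API clients when no keys are configured.
def call_gemini(prompt: str) -> str:
    raise RuntimeError("no API key configured")


def call_groq(prompt: str) -> str:
    raise RuntimeError("no API key configured")


source, text = generate_narrative(
    "Explain this credit score.", 0.12,
    [("gemini", call_gemini), ("groq", call_groq)],
)
print(source, "|", text)
```

Returning the provider name alongside the text lets the UI show which source produced the narrative.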



Built with ❤️ for AI Engineering Bootcamp Batch 10