metadata
title: SME Credit Risk AI
emoji: 🏦
colorFrom: indigo
colorTo: blue
sdk: streamlit
sdk_version: 1.55.0
app_file: app.py
pinned: true
license: mit
🏦 AI-Powered SME Credit Risk Assessment
Final Project - AI Engineering Bootcamp Batch 10 | Author: 1na37
Explainable · Prescriptive · Multilingual Credit Intelligence
✨ Features
| Feature | Description |
|---|---|
| 🤖 Ensemble ML | XGBoost + LightGBM + Random Forest (soft voting, 2:2:1 weights) |
| XAI Engine | SHAP waterfall plot per applicant, explaining why each decision was made |
| 💰 Risk Metrics | PD Score + Expected Loss (PD × LGD × EAD, Basel II) |
| 🔮 What-If Sim | Interactive slider simulation, real-time score update |
| 💬 LLM Narrative | Gemini → Groq → Static fallback (auto chain) |
| Multilingual | Bahasa Indonesia / English / Hindi |
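Soft voting with 2:2:1 weights can be read as a weighted average of each model's predicted default probability. The sketch below is a minimal pure-Python illustration, assuming the weights apply to XGBoost, LightGBM, and Random Forest in the order listed; the function name `soft_vote` and the sample probabilities are hypothetical and not taken from the project code.

```python
def soft_vote(probabilities, weights=(2, 2, 1)):
    """Weighted average of per-model default probabilities."""
    if len(probabilities) != len(weights):
        raise ValueError("need exactly one probability per weight")
    total_weight = sum(weights)
    return sum(p * w for p, w in zip(probabilities, weights)) / total_weight


# Hypothetical outputs from XGBoost, LightGBM, and Random Forest for one applicant.
pd_score = soft_vote([0.30, 0.20, 0.50])
print(round(pd_score, 2))
```

With these weights the two boosted models each count twice as much as the Random Forest, so a single outlier prediction from the forest moves the final score less.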
Dataset
- Source: German Credit Dataset (UCI ML Repository, id=144)
- Rows: 1,000 | Original features: 20 | After augmentation: 28
- Augmentation: Synthetic SME-relevant features added via Python:
  digital_presence_score, has_social_media, ecommerce_monthly_volume, has_npwp, has_siup, business_age_years, monthly_cash_flow, num_employees
- Class balance: SMOTE applied on training set only (no data leakage)
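The "training set only" rule is about ordering: split first, then oversample the training partition, so no synthetic row can leak into evaluation. The sketch below illustrates that ordering with a simplified, dependency-free stand-in for SMOTE (it interpolates between random pairs of minority rows instead of nearest neighbours). It is not the project's pipeline; the helper names and the toy data are assumptions made for illustration.

```python
import random


def split_train_test(rows, labels, test_fraction=0.2, seed=42):
    """Shuffle row indices and split them into train and test partitions."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = int(len(indices) * (1 - test_fraction))

    def pick(idx):
        return [rows[i] for i in idx], [labels[i] for i in idx]

    return pick(indices[:cut]), pick(indices[cut:])


def oversample_minority(rows, labels, minority=1, seed=42):
    """Simplified SMOTE-style oversampling (assumption: random partner, not k-NN)."""
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    majority_count = len(rows) - len(minority_rows)
    new_rows, new_labels = list(rows), list(labels)
    if len(minority_rows) < 2:
        return new_rows, new_labels  # nothing to interpolate between
    while new_labels.count(minority) < majority_count:
        a, b = rng.sample(minority_rows, 2)
        t = rng.random()
        # New synthetic row lies on the segment between two real minority rows.
        new_rows.append([x + t * (y - x) for x, y in zip(a, b)])
        new_labels.append(minority)
    return new_rows, new_labels


# Toy data: 80 good (0) and 20 bad (1) applicants with two numeric features.
rows = [[float(i), float(i % 7)] for i in range(100)]
labels = [1 if i % 5 == 0 else 0 for i in range(100)]

(train_x, train_y), (test_x, test_y) = split_train_test(rows, labels)
train_x, train_y = oversample_minority(train_x, train_y)  # training set only
print(train_y.count(0), train_y.count(1), len(test_x))
```

The test partition keeps its original class ratio and contains only real rows, which is what keeps the reported metrics honest.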
Architecture
Input (28 features)
    ↓
One-Hot Encoding + StandardScaler + SMOTE (train only)
    ↓
┌─────────────────────────────────────┐
│  Soft Voting Ensemble (2:2:1)       │
│  ├── XGBoost      (n=300, lr=0.05)  │
│  ├── LightGBM     (n=300, lr=0.05)  │
│  └── RandomForest (n=200)           │
└─────────────────────────────────────┘
    ↓
PD Score → EL = PD × LGD(40%) × EAD
    ↓
SHAP TreeExplainer → Waterfall Plot
    ↓
LLM Narrative (Gemini → Groq → Fallback)
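The Expected Loss step in the diagram is a single multiplication. The sketch below works one example, assuming the 40% LGD from the diagram; the function name `expected_loss` and the applicant figures are hypothetical.

```python
LGD = 0.40  # loss given default, the 40% figure used in the diagram


def expected_loss(pd_score, ead, lgd=LGD):
    """Expected Loss = PD x LGD x EAD."""
    if not 0.0 <= pd_score <= 1.0:
        raise ValueError("pd_score must be a probability in [0, 1]")
    return pd_score * lgd * ead


# Hypothetical applicant: 25% probability of default on an exposure of 100,000,000.
print(expected_loss(0.25, 100_000_000))
```

Because LGD is held constant, Expected Loss scales linearly with both the PD score and the exposure at default.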
Evaluation
- ROC-AUC: primary metric (industry target: >0.75)
- KS Statistic: separation between good/bad credit distributions
- F1-Score: balanced precision/recall for imbalanced data
- 5-Fold Cross-Validation: SMOTE inside ImbPipeline (no leakage)
"The model shows perfect performance because the synthetic features were designed to correlate strongly with the target,
as a simulation of ideal conditions. In a real deployment,
these features would be replaced with actual data from external sources
such as e-commerce APIs, tax data, and bank statements."
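Of the metrics listed above, the KS statistic is the least self-explanatory: it is the largest gap between the cumulative score distributions of bad and good applicants. The sketch below is a minimal pure-Python version under that standard definition, assuming label 1 means bad credit and 0 means good; it is not the project's evaluation code, and the scores are made up.

```python
def ks_statistic(scores, labels):
    """Largest gap between the cumulative score distributions of bad (1) and good (0)."""
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    if n_bad == 0 or n_good == 0:
        raise ValueError("both classes are required")
    pairs = sorted(zip(scores, labels))
    cum_bad = cum_good = 0
    best = 0.0
    i = 0
    while i < len(pairs):
        # Consume every applicant tied at this score before comparing the two CDFs.
        score = pairs[i][0]
        while i < len(pairs) and pairs[i][0] == score:
            if pairs[i][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            i += 1
        best = max(best, abs(cum_bad / n_bad - cum_good / n_good))
    return best


# Hypothetical PD scores; one good applicant (0.6) scores above one bad applicant (0.4).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]
print(ks_statistic(scores, labels))
```

A value of 1.0 means some score threshold separates the two groups completely, which is the kind of result the note above attributes to the synthetic features.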

Run Locally
git clone https://huggingface.co/spaces/1na37/sme-credit-risk
cd sme-credit-risk
pip install -r requirements.txt
# Train model first (run train_colab.py in Google Colab)
# Download model_export.zip, extract to model/ folder
streamlit run app.py
API Keys (Optional)
In HF Spaces → Settings → Repository Secrets:
- GEMINI_API_KEY → Google AI Studio (free tier)
- GROQ_API_KEY → console.groq.com (free tier)
App works without API keys using static fallback template.
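The fallback behaviour can be pictured as a loop over providers that ends in a template. The sketch below is one way to structure such a chain, assuming each provider is skipped when its key is missing or its call fails; `call_gemini`, `call_groq`, and the template wording are hypothetical placeholders, not the app's actual client code.

```python
import os


def static_narrative(pd_score):
    """Template used when no LLM provider is available (wording is an assumption)."""
    return f"Estimated probability of default: {pd_score:.0%}."


def generate_narrative(pd_score, providers, env=None):
    """Try each (env_var, call) provider in order, then fall back to the template."""
    env = os.environ if env is None else env
    for env_var, call in providers:
        api_key = env.get(env_var)
        if not api_key:
            continue  # key not configured, skip this provider
        try:
            return call(pd_score, api_key)
        except Exception:
            continue  # provider failed, try the next one
    return static_narrative(pd_score)


def call_gemini(pd_score, api_key):
    # Hypothetical placeholder: a real Gemini client call would go here.
    raise RuntimeError("Gemini client not wired up in this sketch")


def call_groq(pd_score, api_key):
    # Hypothetical placeholder: a real Groq client call would go here.
    raise RuntimeError("Groq client not wired up in this sketch")


PROVIDERS = [("GEMINI_API_KEY", call_gemini), ("GROQ_API_KEY", call_groq)]

# With no keys configured, the chain falls through to the static template.
print(generate_narrative(0.32, PROVIDERS, env={}))
```

Passing `env` explicitly keeps the function testable without touching real secrets; in the Space itself the keys would come from the repository secrets exposed as environment variables.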
References
- German Credit Dataset, UCI ML Repository
- SHAP Library
- Basel II Credit Risk Framework (LGD = 40% standard)
Built with ❤️ for AI Engineering Bootcamp Batch 10