MSME Payment Outcome Predictor (LightGBM)
Overview
This model estimates the probability of each of three outcomes of an MSME payment dispute:
- Win
- Settlement
- Escalation to MSEFC (Micro and Small Enterprises Facilitation Council)
The model outputs calibrated probabilities for each outcome.
Model Architecture
- Algorithm: LightGBM (Gradient Boosted Decision Trees)
- Calibration: Isotonic Regression (CalibratedClassifierCV)
- Preprocessing:
  - OneHotEncoding (categorical features)
  - Numeric features passthrough
- Class balancing enabled
Input Features
| Feature | Type |
|---|---|
| claim_amount | float |
| delay_days | float |
| buyer_type | categorical (govt/private) |
| contract_present | binary |
| industry_sector | categorical |
| claim_imputed | binary |
| delay_imputed | binary |
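The `claim_imputed` and `delay_imputed` flags suggest that missing claim amounts and delays are filled in before scoring. A minimal sketch of how one input row might be assembled is shown below; the helper `build_features`, the raw record layout, and the fill values are hypothetical, and the actual imputation strategy is not stated in this card.

```python
def build_features(record, claim_fill, delay_fill):
    """Turn a raw dispute record into one model input row.

    `claim_fill` and `delay_fill` are caller-supplied replacement values
    (for example training-set medians); this is an assumed strategy.
    """
    claim = record.get("claim_amount")
    delay = record.get("delay_days")
    return {
        "claim_amount": float(claim_fill if claim is None else claim),
        "delay_days": float(delay_fill if delay is None else delay),
        "buyer_type": record["buyer_type"],  # "govt" or "private"
        "contract_present": int(bool(record["contract_present"])),
        "industry_sector": record["industry_sector"],
        # Flags record whether the value above was filled in.
        "claim_imputed": int(claim is None),
        "delay_imputed": int(delay is None),
    }


# Example with a missing claim amount (all values are illustrative).
row = build_features(
    {
        "claim_amount": None,
        "delay_days": 120,
        "buyer_type": "govt",
        "contract_present": True,
        "industry_sector": "textiles",
    },
    claim_fill=250000.0,
    delay_fill=90.0,
)
```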
Output Format
{
"predicted_label": "win",
"probabilities": {
"win": 0.59,
"settlement": 0.05,
"escalation": 0.35
}
}
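A payload of this shape can be produced from a vector of class probabilities along the following lines. This is a sketch under assumptions: the helper `format_prediction` is hypothetical, and the label order in `LABELS` is assumed and must match the class order of whatever model produced the probabilities.

```python
import json

# Assumed label order; it must match the order of the probability vector.
LABELS = ("win", "settlement", "escalation")


def format_prediction(probs):
    """Build the output dictionary from one row of class probabilities."""
    if len(probs) != len(LABELS):
        raise ValueError("expected one probability per label")
    # Pick the label with the highest unrounded probability.
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return {
        "predicted_label": LABELS[best],
        "probabilities": {
            label: round(float(p), 2) for label, p in zip(LABELS, probs)
        },
    }


payload = format_prediction([0.59, 0.05, 0.35])
print(json.dumps(payload, indent=2))
```

Because each probability is rounded to two decimals independently, the reported values may not sum to exactly 1.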
Performance Metrics
- Primary metric: AUC-ROC (macro) ≈ 0.72
- Balanced Accuracy ≈ 0.63
- F1 Macro ≈ 0.61
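For readers unfamiliar with the two label-based metrics, the sketch below computes them from scratch using their standard definitions: balanced accuracy is the mean per-class recall, and macro F1 is the unweighted mean of per-class F1 scores. The function names and the toy labels are illustrative only and are not the evaluation code or data behind the figures above.

```python
def per_class_counts(y_true, y_pred, labels):
    """Count true positives, false positives and false negatives per class."""
    counts = {c: {"tp": 0, "fp": 0, "fn": 0} for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            counts[t]["tp"] += 1
        else:
            counts[t]["fn"] += 1
            counts[p]["fp"] += 1
    return counts


def balanced_accuracy(y_true, y_pred, labels):
    """Mean recall over the classes that occur in y_true."""
    counts = per_class_counts(y_true, y_pred, labels)
    recalls = [
        c["tp"] / (c["tp"] + c["fn"])
        for c in counts.values()
        if c["tp"] + c["fn"] > 0
    ]
    return sum(recalls) / len(recalls)


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 = 2*TP / (2*TP + FP + FN)."""
    counts = per_class_counts(y_true, y_pred, labels)
    scores = []
    for c in counts.values():
        denom = 2 * c["tp"] + c["fp"] + c["fn"]
        scores.append(2 * c["tp"] / denom if denom else 0.0)
    return sum(scores) / len(scores)


labels = ["win", "settlement", "escalation"]
y_true = ["win", "win", "settlement", "escalation"]  # toy data
y_pred = ["win", "settlement", "settlement", "escalation"]
bal_acc = balanced_accuracy(y_true, y_pred, labels)
f1 = macro_f1(y_true, y_pred, labels)
```

Both metrics weight every class equally, which is why they are commonly reported alongside class balancing.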
Intended Use
- Legal risk scoring
- MSME advisory tools
- Research prototype
- Decision support systems
Limitations
- Based on structured extracted data only
- Does not include full legal document text
- Not intended for judicial automation
Evaluation results
- AUC on MSME Payment Dispute Dataset (self-reported): 0.720
- F1 Score on MSME Payment Dispute Dataset (self-reported): 0.610
- Balanced Accuracy on MSME Payment Dispute Dataset (self-reported): 0.630