Ikimina Digital Trust & Reliability Index

A lightweight scikit-learn + LightGBM pipeline that turns an Ikimina member's 12-month contribution history into a Reliability Index in [0, 100], designed for deployment over USSD on a feature phone in rural Rwanda.

Built for the AIMS KTT Fellowship Hackathon 2026, Challenge T1.1.

  • Repo: [LINK](https://github.com/Ahmed-5/AIMS_KTT)
  • 4-minute video: <YOUTUBE-UNLISTED-URL-HERE>

Intended use

  • Primary use case: surface a coarse reliability tier (low_risk / watch / high_risk) that an MFI loan officer can use as a conversation starter with an Ikimina member.
  • Secondary use case: trigger a Kinyarwanda / French SMS back to the Ikimina secretary after a short USSD session (*654*MEMBER_ID#, ~20 RWF per query).
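As a rough illustration of the USSD entry point, the sketch below classifies a raw short-code string. It assumes member_id is digits-only (suggested by the `--member 412` example but not stated) and treats `*654*0#` as the revoke / query-log code described under Ethics & consent. The function name `parse_ussd` and the action labels are hypothetical, not taken from the repo.

```python
import re

# Assumed shape of the short code from the text: *654*<digits>#.
# Digits-only member ids are an assumption made for this sketch.
USSD_PATTERN = re.compile(r"^\*654\*(\d+)#$")


def parse_ussd(raw: str) -> dict:
    """Classify a raw USSD string as a score query, a revoke/log request, or invalid."""
    match = USSD_PATTERN.match(raw.strip())
    if match is None:
        return {"action": "invalid", "member_id": None}
    member_id = match.group(1)
    if member_id == "0":
        # *654*0# is reserved for the member's own query-log / revoke path.
        return {"action": "revoke_or_log", "member_id": None}
    return {"action": "score_query", "member_id": member_id}


result = parse_ussd("*654*412#")
```

Only the pseudonymous member_id is extracted here, which matches the card's statement that no PII crosses the USSD path.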

Out of scope

  • ❌ Not a credit bureau replacement. The index must not be the only signal in a loan decision.
  • ❌ Not calibrated for non-Ikimina data. The feature math assumes a 12-month weekly-contribution context; behaviour on any other substrate is undefined.
  • ❌ Not a deep-identity system. The USSD path transmits only a member_id pseudonym over SS7; no PII is embedded in the model input.

How the score is produced

  1. Read the member's monthly CSV record and their group's CSV record.
  2. Compute 12 engineered features (see src/features.py::FEATURE_COLS): mean_on_time, on_time_volatility, recency_weighted_miss, max_on_time_streak, total_missed, penalty_paid_per_miss, borrow_to_repay, loan_burden, role_seniority, tenure_months, contrib_group_zscore, urban_flag.
  3. Standard-scale → calibrated Logistic Regression → P(default_within_6m).
  4. score = round(100 · (1 − P(default))) clipped to [0, 100].
  5. Tiers (from the challenge brief): 0–40 high_risk · 41–70 watch · 71–100 low_risk.
  6. Optional: blend with a group reliability index (stretch goal) using alpha=0.2.
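Steps 4 to 6 can be sketched in a few lines of plain Python. The score and tier functions follow the formulas stated above. The blend is an assumption: the card only gives alpha = 0.2, so a convex combination of the member score and a group index on the same 0–100 scale is used here for illustration. The function names are hypothetical.

```python
def reliability_score(p_default: float) -> int:
    """Map P(default_within_6m) to an integer score clipped to [0, 100]."""
    raw = round(100 * (1 - p_default))
    return max(0, min(100, raw))


def tier(score: int) -> str:
    """Tier cut-offs from the challenge brief: 0-40, 41-70, 71-100."""
    if score <= 40:
        return "high_risk"
    if score <= 70:
        return "watch"
    return "low_risk"


def blend(member_score: int, group_index: float, alpha: float = 0.2) -> int:
    """Assumed blend: (1 - alpha) * member + alpha * group, clipped to [0, 100]."""
    blended = round((1 - alpha) * member_score + alpha * group_index)
    return max(0, min(100, blended))


member_score = reliability_score(0.12)      # P(default) = 0.12
member_tier = tier(member_score)
blended_score = blend(member_score, 50.0)   # hypothetical group index of 50
```

With alpha = 0.2 the member's own history still dominates, so under this assumed formula a weak group can pull a score down by at most 20 points.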

Training data

Fully synthetic: generated by src/generate_data.py with seed 42, following the brief's recipe line by line:

  • 500 members × 12 months, spread across 40 groups.
  • Monthly missed contribution: Bernoulli(p = base_miss), with base_miss ~ Beta(2, 20) per member and AR(1) correlation ρ = 0.4 across months.
  • Penalties: 50 % of miss-months incur a penalty; 70 % of those penalties are paid the following month.
  • Borrowing: ~30 % of members; LogNormal(mean = 3 · weekly · 8).
  • Default label: logistic on (missed_total, unpaid_penalties, borrow/repay, tenure) with intercept bisected to ~14 % positive rate.
  • Train / test split: last 100 member_ids are the holdout.
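Two parts of this recipe are easier to follow in code: the correlated monthly misses and the intercept bisection. The sketch below is not src/generate_data.py. It assumes the AR(1) correlation is applied to a latent Gaussian that is thresholded at each member's base_miss quantile (so the month-to-month correlation of the 0/1 misses will differ somewhat from ρ = 0.4), and it bisects the intercept so that the mean predicted probability, rather than the realised label rate, hits the target. The linear score used in the demo is a placeholder weight, not the brief's formula.

```python
import math
import random
from statistics import NormalDist


def simulate_misses(n_members=500, n_months=12, rho=0.4, seed=42):
    """Return one 0/1 miss history per member (1 = missed contribution)."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    innovation_sd = math.sqrt(1 - rho ** 2)  # keeps the latent variance at 1
    histories = []
    for _ in range(n_members):
        # Per-member miss propensity, clamped so inv_cdf stays defined.
        base_miss = min(max(rng.betavariate(2, 20), 1e-9), 1 - 1e-9)
        threshold = std_normal.inv_cdf(base_miss)
        z = rng.gauss(0.0, 1.0)  # stationary start for the latent AR(1)
        months = []
        for _ in range(n_months):
            months.append(1 if z < threshold else 0)  # marginal P(miss) = base_miss
            z = rho * z + innovation_sd * rng.gauss(0.0, 1.0)
        histories.append(months)
    return histories


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def bisect_intercept(linear_scores, target_rate=0.14, lo=-20.0, hi=20.0, tol=1e-6):
    """Find intercept b such that mean(sigmoid(b + s)) is close to target_rate."""
    def mean_rate(b):
        return sum(sigmoid(b + s) for s in linear_scores) / len(linear_scores)

    while hi - lo > tol:
        mid = (lo + hi) / 2
        if mean_rate(mid) < target_rate:
            lo = mid  # rate too low: raise the intercept
        else:
            hi = mid
    return (lo + hi) / 2


histories = simulate_misses()
miss_rate = sum(map(sum, histories)) / (500 * 12)
# Placeholder linear score: 0.8 per missed month (illustrative weight only).
linear_scores = [0.8 * sum(h) for h in histories]
intercept = bisect_intercept(linear_scores)
expected_rate = sum(sigmoid(intercept + s) for s in linear_scores) / len(linear_scores)
```

Bisection works here because the mean predicted probability increases monotonically with the intercept, so a single root exists between the two bounds.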

No real member data was used.


Evaluation (deterministic, seed 42)

| Metric | Value |
| --- | --- |
| Holdout ROC-AUC | 0.944 |
| Holdout Brier score | 0.056 |
| CV AUC, Logistic Regression | 0.854 |
| CV AUC, LightGBM | 0.828 |
| Chosen model | Logistic Regression + calibration |
| Holdout positive rate | 0.10 |
| Tier mix on holdout | high_risk: 2 · watch: 9 · low_risk: 89 |

Charts (in reports/): roc_curve.png, calibration_curve.png, feature_importance.png, score_distribution.png, district_heatmap.png.
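For readers who want to check the two headline metrics by hand, here is a dependency-free sketch of ROC-AUC and the Brier score. The toy labels and probabilities are invented for illustration and are not the holdout data; the repo presumably uses a library implementation.

```python
def brier_score(y_true, p_pred):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def roc_auc(y_true, p_pred):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((pp > pn) + 0.5 * (pp == pn) for pp in pos for pn in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = defaulted within 6 months, p = predicted P(default).
y = [0, 0, 1, 1, 0]
p = [0.10, 0.40, 0.35, 0.80, 0.20]
auc = roc_auc(y, p)        # 5 of 6 positive/negative pairs are ranked correctly
brier = brier_score(y, p)
```

AUC only measures ranking, while the Brier score also penalises miscalibrated probabilities, which is why the card reports both for a calibrated model.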

Why Logistic Regression over LightGBM?

Logistic Regression had the higher CV AUC (0.854 vs 0.828 for LightGBM, a gap of about 0.03), so there was no accuracy trade-off to make. With only N = 400 training rows and a regulator audience that needs to read (and argue with) the coefficients, a calibrated linear model is also the more interpretable choice.


Limitations & known failure modes

  • Skewed tier mix. Because the default rate is low (0.10 on the holdout, ~14 % targeted by the generator), most members land in low_risk. The brief fixes the tier cut-offs; we do not re-quantile them. Downstream product design (see ussd_flow.md) fails toward watch, never low_risk, whenever anything is uncertain.
  • Thin histories. Members with < 6 observed months are capped at the watch tier via scorer.py::score_shadow(), and the system returns a widened 80 % band.
  • Synthetic drift. The model is trained only on synthetic data generated from the brief's recipe. Deploying against real Ikimina records will require retraining and revalidation. The generator is shipped in-repo precisely so this retraining is one command away.
  • SS7 exposure. The USSD carrier layer is not encrypted. We minimise blast radius by sending only the member_id pseudonym over the wire; see ussd_flow.md for the full privacy trade-off.
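The thin-history rule above amounts to a small guard applied after tiering. The sketch below is not the code in scorer.py::score_shadow(); the function name and signature are hypothetical, and the widened 80 % band is left out because the card does not say how it is computed.

```python
def cap_for_thin_history(tier_name: str, observed_months: int, min_months: int = 6) -> str:
    """Members with fewer than min_months observed months are never reported above watch."""
    if observed_months < min_months and tier_name == "low_risk":
        return "watch"
    return tier_name


capped = cap_for_thin_history("low_risk", observed_months=4)
```

The cap only ever moves a member from low_risk to watch, which is consistent with the fail-toward-watch principle in the first bullet.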

Files in this model release

| File | Purpose |
| --- | --- |
| model.pkl | joblib-pickled dict: {model, scaler, uses_scaler, feature_cols, chosen_model, cv_auc, holdout}. ≈ 5 KB. |
| group_reliability_index.csv | Per-group aggregate reliability index used by the blending stretch goal. |
| metrics.json | Holdout AUC, Brier, CV AUC for both candidate models, tier mix. |
| Chart PNGs | ROC, calibration, feature importance, score distribution, district heatmap. |
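Before scoring with a downloaded model.pkl, it may be worth checking that the loaded dict has the documented keys. The helper below is a hypothetical sketch that works on any dict. In practice the bundle would presumably come from `joblib.load("model.pkl")`, which should only be run on files you trust, since unpickling can execute arbitrary code.

```python
# Keys listed for model.pkl in the file table above.
EXPECTED_KEYS = {
    "model", "scaler", "uses_scaler", "feature_cols",
    "chosen_model", "cv_auc", "holdout",
}


def missing_bundle_keys(bundle: dict) -> set:
    """Return the documented keys that are absent from a loaded model bundle."""
    return EXPECTED_KEYS - set(bundle)


# Stand-in dict for illustration; a real bundle would be loaded with joblib.
demo_bundle = {"model": None, "scaler": None, "feature_cols": []}
missing = missing_bundle_keys(demo_bundle)
```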

Ethics & consent

  • The USSD flow has a hard consent gate on Screen 1. No score is computed until the secretary confirms.
  • Consent log is retained 18 months, then purged.
  • Members can query their own query-log and revoke future queries at no cost via *654*0#.
  • No PII (name, phone, district) ever enters the model's input or crosses SS7.

How to reproduce

pip install -r requirements.txt
python src/generate_data.py && python src/train_model.py && python scorer.py --member 412 --group 07

Total wall-clock on a free Colab CPU: < 2 min.


Citation

If you use this model or the synthetic generator, please cite:

Ikimina Digital Trust & Reliability Index. AIMS KTT Fellowship Hackathon 2026, Challenge T1.1. MIT licence.
