Synav committed on
Commit c4bef3b · verified · 1 Parent(s): f66ed97

Update README.md

  1. README.md +68 -6
README.md CHANGED
@@ -1,9 +1,71 @@
  ---
  license: apache-2.0
  ---
- LogiSHAP-Studio-LogReg/
- │
- ├── model.joblib # sklearn pipeline (preprocess + logistic regression)
- ├── meta.json # metrics, feature types, label mapping
- ├── requirements.txt # optional, for reproducibility
- └── README.md # model card
+ # ExplainML Studio – Logistic Regression Models
+
+ This repository hosts **versioned, trained machine learning models** produced using the **ExplainML Studio** framework.
+ The current releases implement **logistic regression pipelines with full explainability and clinical evaluation artifacts**.
+
+ These models are designed for **transparent, auditable, and clinically interpretable binary classification tasks**.
+
+ ---
+
+ ## Model Overview
+
+ - **Framework:** ExplainML Studio
+ - **Algorithm:** Logistic Regression (scikit-learn)
+ - **Pipeline:**
+   - Numeric features → median imputation + standard scaling
+   - Categorical features → most-frequent imputation + one-hot encoding
+ - **Explainability:** SHAP (LinearExplainer)
+ - **Output:** Predicted probability (0–1)
+
+ Each model is packaged as a single `model.joblib` file containing the full preprocessing + classifier pipeline.
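
The pipeline described above could be assembled roughly as follows. This is a minimal sketch, not the ExplainML Studio source: the helper `build_pipeline`, the step names, the toy columns `age` and `sex`, and settings such as `handle_unknown="ignore"` and `max_iter=1000` are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def build_pipeline(numeric_cols, categorical_cols):
    """Hypothetical helper: preprocessing + logistic regression in one object."""
    # Numeric features: median imputation, then standard scaling.
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical features: most-frequent imputation, then one-hot encoding.
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),  # assumed setting
    ])
    preprocess = ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("clf", LogisticRegression(max_iter=1000)),  # assumed setting
    ])


# Toy data with missing values in both column types (illustrative only).
X = pd.DataFrame({
    "age": [34.0, 51.0, np.nan, 47.0, 62.0, 29.0],
    "sex": ["F", "M", "F", np.nan, "M", "F"],
})
y = np.array([0, 1, 0, 1, 1, 0])

pipe = build_pipeline(["age"], ["sex"]).fit(X, y)
proba = pipe.predict_proba(X)[:, 1]  # predicted probability of the positive class
```

Because imputation, scaling, and encoding live inside the fitted pipeline, a single `joblib.dump(pipe, "model.joblib")` would capture everything needed to score raw rows.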
+
+ ---
+
+ ## Evaluation Metrics (stored in `meta.json`)
+
+ Every model is evaluated on a **held-out test split**, and the following metrics are recorded:
+
+ ### Discrimination
+ - ROC AUC
+ - ROC curve (FPR, TPR, thresholds)
+ - Precision–Recall curve
+ - Average Precision (AP)
+
+ ### Classification (default threshold = 0.5)
+ - Sensitivity (Recall)
+ - Specificity
+ - Precision
+ - F1 score
+ - Accuracy
+ - Balanced accuracy
+ - Confusion matrix (TP, FP, TN, FN)
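
The threshold-based metrics listed above all follow from the four confusion-matrix counts. The sketch below shows one way to derive them; the function name `threshold_metrics`, the dictionary keys, and the choice to treat `probability >= threshold` as a positive prediction are assumptions, not the documented `meta.json` schema.

```python
import numpy as np


def threshold_metrics(y_true, y_prob, threshold=0.5):
    """Hypothetical helper: classification metrics at a fixed probability threshold."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = (np.asarray(y_prob, dtype=float) >= threshold).astype(int)

    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))

    def ratio(num, den):
        # Return NaN instead of raising when a denominator is empty.
        return num / den if den else float("nan")

    sensitivity = ratio(tp, tp + fn)
    specificity = ratio(tn, tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": ratio(tp, tp + fp),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "confusion_matrix": {"TP": tp, "FP": fp, "TN": tn, "FN": fn},
    }


# Toy example: six cases scored at the default threshold of 0.5.
metrics = threshold_metrics(
    y_true=[0, 0, 1, 1, 1, 0],
    y_prob=[0.2, 0.6, 0.7, 0.4, 0.9, 0.1],
)
```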
+
+ ### Calibration
+ - Calibration (reliability) curve
+ - Brier score
+ - Configurable binning strategy (uniform / quantile)
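
The calibration items above map onto two scikit-learn functions, `calibration_curve` and `brier_score_loss`. The following sketch assumes that mapping; the wrapper `calibration_summary` and its output keys are hypothetical and may not match what the framework writes to `meta.json`.

```python
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss


def calibration_summary(y_true, y_prob, n_bins=10, strategy="uniform"):
    """Hypothetical helper: reliability curve plus Brier score as plain lists."""
    if strategy not in ("uniform", "quantile"):
        raise ValueError("strategy must be 'uniform' or 'quantile'")
    # prob_true: observed positive fraction per bin
    # prob_pred: mean predicted probability per bin (empty bins are dropped)
    prob_true, prob_pred = calibration_curve(
        y_true, y_prob, n_bins=n_bins, strategy=strategy
    )
    return {
        "brier_score": float(brier_score_loss(y_true, y_prob)),
        "strategy": strategy,
        "n_bins": n_bins,
        "mean_predicted": prob_pred.tolist(),
        "observed_fraction": prob_true.tolist(),
    }


# Toy example with two uniform bins: [0, 0.5) and [0.5, 1].
summary = calibration_summary([0, 1, 1, 0], [0.1, 0.9, 0.8, 0.3], n_bins=2)
```

With `strategy="uniform"` the bins have equal width, while `strategy="quantile"` gives each bin roughly the same number of samples, which tends to be steadier when predictions cluster near 0 or 1.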
+
+ ### Clinical Utility
+ - Decision Curve Analysis (DCA)
+ - Net benefit curves:
+   - Model
+   - Treat-all
+   - Treat-none
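
For reference, the three net benefit curves can be computed with the standard decision-curve formula, net benefit = TP/n − (FP/n) · pt/(1 − pt), where pt is the threshold probability. The sketch below assumes that formula and a `>=` decision rule; `net_benefit_curves` and its output keys are hypothetical names, not the framework's own.

```python
import numpy as np


def net_benefit_curves(y_true, y_prob, thresholds):
    """Hypothetical helper: net benefit of model, treat-all, and treat-none."""
    y_true = np.asarray(y_true).astype(int)
    y_prob = np.asarray(y_prob, dtype=float)
    n = y_true.size
    prevalence = y_true.mean()

    model, treat_all = [], []
    for pt in thresholds:  # each pt must lie strictly between 0 and 1
        odds = pt / (1.0 - pt)  # weight given to a false positive
        treated = y_prob >= pt
        tp = np.sum(treated & (y_true == 1))
        fp = np.sum(treated & (y_true == 0))
        model.append(float(tp / n - fp / n * odds))
        # Treat-all: everyone is treated, so TP/n = prevalence, FP/n = 1 - prevalence.
        treat_all.append(float(prevalence - (1.0 - prevalence) * odds))

    return {
        "thresholds": [float(pt) for pt in thresholds],
        "model": model,
        "treat_all": treat_all,
        "treat_none": [0.0] * len(model),  # treating nobody yields zero net benefit
    }


# Toy example evaluated at a single threshold probability of 0.5.
curves = net_benefit_curves([0, 0, 1, 1], [0.1, 0.4, 0.6, 0.9], thresholds=[0.5])
```

A model is typically considered useful over the range of thresholds where its curve lies above both the treat-all and treat-none curves.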
+
+ All metrics and curve data are stored explicitly in `meta.json` for reproducibility and downstream analysis.
+
+ ---
+
+ ## Repository Structure
+
+ releases/
+ └── <version>/
+     ├── model.joblib # trained sklearn pipeline
+     └── meta.json # schema + metrics + curves
+ latest/
+ ├── model.joblib
+ └── meta.json
+ README.md
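
Given the layout above, a local copy of the repository could be read as sketched below. The helper `load_release` and its arguments are hypothetical, and nothing is assumed about the keys inside `meta.json`. Note that `joblib.load` unpickles arbitrary Python objects, so it should only be used on files from a trusted source.

```python
import json
from pathlib import Path

import joblib


def load_release(repo_root, version=None):
    """Hypothetical helper: load the pipeline and metadata for one release.

    version=None reads from latest/; otherwise from releases/<version>/.
    """
    root = Path(repo_root)
    folder = root / "latest" if version is None else root / "releases" / version
    model = joblib.load(folder / "model.joblib")  # trusted files only
    with open(folder / "meta.json", encoding="utf-8") as fh:
        meta = json.load(fh)
    return model, meta
```

With a scikit-learn pipeline loaded this way, `model.predict_proba(df)[:, 1]` would return the predicted probability of the positive class for a DataFrame `df` of raw feature columns.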