---
license: apache-2.0
---

# Explainable Acute Leukemia Mortality Predictor – Model Repository

This repository contains the **trained machine learning model artifacts** generated by the
**Explainable Acute Leukemia Mortality Predictor** Hugging Face Space.

It serves exclusively as a **persistent storage and versioning registry** for models developed for:

**Mortality risk prediction in patients with acute leukemia using structured clinical data.**

This repository does **not** contain training code or an interactive interface; both live in the companion Space.

---

## Relationship to the Application

Model development, validation, and prediction occur in the companion Space:

**Synav/Explainable-Acute-Leukemia-Mortality-Predictor**

Because Hugging Face Spaces use temporary storage, trained models are automatically:

1. Saved
2. Versioned
3. Uploaded here
4. Preserved as permanent releases

This ensures:

* reproducibility
* auditability
* long-term persistence
* external validation capability

---

## Model Description

Each stored model is:

* **Task:** Binary mortality prediction (Yes/No)
* **Algorithm:** Logistic Regression (scikit-learn)
* **Output:** Probability of mortality (0–1)
* **Explainability:** SHAP feature attribution
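
For a linear model such as logistic regression, SHAP attributions have a closed form in log-odds space when features are treated as independent: each feature contributes its coefficient times its deviation from the background mean. The sketch below illustrates that identity on synthetic data. It is not the Space's actual explanation code, and the data and variable names are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))  # synthetic, already-numeric features
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(size=200) > 0).astype(int)

clf = LogisticRegression().fit(X, y)

background_mean = X.mean(axis=0)  # reference point for the attributions
x = X[0]                          # the row to explain

# Linear SHAP (independent features): phi_j = w_j * (x_j - E[x_j]), in log-odds units.
phi = clf.coef_[0] * (x - background_mean)

# The attributions sum to the gap between this row's log-odds and the average log-odds.
log_odds_x = clf.decision_function(x.reshape(1, -1))[0]
log_odds_mean = clf.decision_function(X).mean()
print(phi, phi.sum(), log_odds_x - log_odds_mean)
```

In a stored pipeline, the same reasoning would presumably apply to the transformed features (after scaling and one-hot encoding) rather than to the raw columns.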

### Embedded preprocessing

Numeric variables

* median imputation
* standard scaling

Categorical variables

* most-frequent imputation
* one-hot encoding

All preprocessing steps are embedded within the pipeline to guarantee:

* identical inference behavior
* schema consistency
* zero manual preprocessing
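
The preprocessing described above maps naturally onto a scikit-learn `ColumnTransformer` nested inside a `Pipeline`. The following is a minimal sketch of such a pipeline, not the exact code used by the Space: the column names, the `handle_unknown="ignore"` setting, `max_iter`, and the toy data are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names; the real schema is recorded in meta.json.
numeric_cols = ["age", "wbc"]
categorical_cols = ["subtype"]

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # Ignoring unseen categories at inference time is an assumption.
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

pipeline = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Tiny synthetic frame with missing values, only to show the pipeline runs end to end.
df = pd.DataFrame({
    "age": [34, 61, np.nan, 47, 72, 55],
    "wbc": [12.0, np.nan, 88.5, 5.1, 140.2, 30.0],
    "subtype": ["AML", "ALL", "AML", np.nan, "AML", "ALL"],
})
y = np.array([0, 1, 0, 0, 1, 1])

pipeline.fit(df, y)
proba = pipeline.predict_proba(df)[:, 1]
print(proba.round(3))
```

Because the imputers, scaler, and encoder are fitted together with the classifier, serializing the single `pipeline` object is enough to reproduce the same transformations at inference time.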

---

## Files Included per Release

Each version folder contains:

### model.joblib

Complete scikit-learn pipeline including preprocessing, feature encoding, and the trained classifier.
Ready for immediate inference.

### meta.json

Structured metadata including:

* feature schema
* variable types
* evaluation metrics
* ROC/PR curve data
* calibration statistics
* confusion matrix
* decision curve analysis
* validation configuration

These artifacts enable full reproducibility and downstream analysis.
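
One way the feature schema might be used downstream is to check an input table against it before calling the model. The sketch below assumes a hypothetical layout in which `meta["features"]` is a list of column names. The actual key names in `meta.json` are not documented here, so inspect the file before relying on this.

```python
import json


def missing_features(meta: dict, columns) -> list:
    """Return schema features absent from `columns`.

    Assumes a hypothetical layout: meta["features"] is a list of column names.
    """
    present = set(columns)
    return [name for name in meta["features"] if name not in present]


# In practice the dict would come from a release folder:
#     with open("meta.json") as fh:
#         meta = json.load(fh)
# Here an invented stand-in is parsed from a string so the example runs on its own.
meta = json.loads('{"features": ["age", "wbc", "subtype"]}')

print(missing_features(meta, ["age", "subtype"]))  # ['wbc']
```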

---

## Evaluation Metrics Captured

Models are evaluated on held-out test data, and the metrics listed below are recorded in `meta.json`.

### Discrimination

* ROC AUC
* ROC curve
* Precision–Recall curve
* Average Precision

### Classification

* Sensitivity (Recall)
* Specificity
* Precision
* F1 score
* Accuracy
* Balanced accuracy
* Confusion matrix

### Calibration

* Calibration (reliability) curve
* Brier score

### Clinical Utility

* Decision Curve Analysis (net benefit)
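
Several of the metrics above can be computed directly with `sklearn.metrics`. The sketch below uses invented labels and probabilities, and an assumed 0.5 decision threshold; it illustrates the definitions and is not the Space's evaluation code. Net benefit follows the standard decision-curve formula, TP/n − FP/n × p_t / (1 − p_t), at threshold probability p_t.

```python
import numpy as np
from sklearn.metrics import (
    average_precision_score,
    brier_score_loss,
    confusion_matrix,
    roc_auc_score,
)

# Invented labels and predicted probabilities, for illustration only.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.10, 0.40, 0.35, 0.80, 0.20, 0.90, 0.60, 0.70])
threshold = 0.5  # assumed cut-off; the stored models may use a different one

y_pred = (y_prob >= threshold).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
n = len(y_true)

metrics = {
    "roc_auc": roc_auc_score(y_true, y_prob),
    "average_precision": average_precision_score(y_true, y_prob),
    "sensitivity": tp / (tp + fn),
    "specificity": tn / (tn + fp),
    "brier": brier_score_loss(y_true, y_prob),
    # Net benefit at the chosen threshold probability (decision curve analysis).
    "net_benefit": tp / n - fp / n * threshold / (1 - threshold),
}
print(metrics)
```

A full decision curve repeats the net-benefit calculation over a range of threshold probabilities and compares it with the treat-all and treat-none strategies.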

---

## Repository Structure

```
releases/
  └── <version>/
      ├── model.joblib
      └── meta.json

latest/
  ├── model.joblib
  └── meta.json

README.md
```

* **releases/<version>/** → immutable historical snapshots
* **latest/** → most recent validated model
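
Given this layout, artifacts can be fetched with `hf_hub_download` from the `huggingface_hub` library. The sketch below is one possible approach: `repo_id` is a placeholder because this README does not state the repository id, and `"v1"` is an invented version string.

```python
def artifact_paths(version=None):
    """Return repo-relative paths for one release, or for latest/ when version is None."""
    folder = f"releases/{version}" if version else "latest"
    return {name: f"{folder}/{name}" for name in ("model.joblib", "meta.json")}


def download_release(repo_id, version=None):
    """Download both artifacts and return their local cache paths.

    `repo_id` is a placeholder for this repository's id, which is not given here.
    """
    # Imported lazily so the path helper works without the library or network access.
    from huggingface_hub import hf_hub_download

    return {
        name: hf_hub_download(repo_id=repo_id, filename=path)
        for name, path in artifact_paths(version).items()
    }


print(artifact_paths("v1"))
```

Pinning a specific `releases/<version>/` folder rather than `latest/` keeps an analysis reproducible, since `latest/` changes whenever a newer model is validated.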

---

## Intended Use

These artifacts are intended for:

* Clinical research
* Risk stratification studies
* Independent external validation
* Multi-center reproducibility testing
* Educational and exploratory analysis

---

## Not Intended For

These models:

* are not regulatory-approved medical devices
* do not replace clinician judgment
* should not be used for autonomous decision-making
* require local validation prior to clinical deployment

Clinical oversight is mandatory.

---

## Loading a Model

```python
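import joblib

# Load the full pipeline (preprocessing + classifier) from a release folder.
model = joblib.load("model.joblib")

# X: input rows with the columns described in meta.json (assumed to be a pandas DataFrame).
proba = model.predict_proba(X)[:, 1]  # predicted probability of mortality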
```

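No additional preprocessing is required, because imputation, scaling, and encoding run inside the pipeline. The input only needs to provide the columns listed in the feature schema in `meta.json`.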

---

## Author

Dr. Syed Naveed
Hematology & Oncology
Sheikh Shakhbout Medical City
Abu Dhabi, UAE

---

## License

Apache 2.0

---