---
license: apache-2.0
---
# Explainable Acute Leukemia Mortality Predictor – Model Repository
This repository contains the **trained machine learning model artifacts** generated by the
**Explainable Acute Leukemia Mortality Predictor** Hugging Face Space.
It serves exclusively as a **persistent storage and versioning registry** for models developed for:
**Mortality risk prediction in patients with acute leukemia using structured clinical data.**
This repository does **not** contain training code or an interactive interface; both live in the companion Space.
---
## Relationship to the Application
Model development, validation, and prediction occur in the companion Space:
**Synav/Explainable-Acute-Leukemia-Mortality-Predictor**
Because Hugging Face Spaces use ephemeral storage by default (files written at runtime are lost when the Space restarts), trained models are automatically:
1. Saved
2. Versioned
3. Uploaded here
4. Preserved as permanent releases
This ensures:
* reproducibility
* auditability
* long-term persistence
* external validation capability
---
## Model Description
Each stored model has the following characteristics:
* **Task:** Binary mortality prediction (Yes/No)
* **Algorithm:** Logistic Regression (scikit-learn)
* **Output:** Probability of mortality (0–1)
* **Explainability:** SHAP feature attribution
### Embedded preprocessing
Numeric variables:
* median imputation
* standard scaling
Categorical variables:
* most-frequent imputation
* one-hot encoding
All preprocessing steps are embedded within the pipeline to guarantee:
* identical inference behavior
* schema consistency
* zero manual preprocessing
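
The preprocessing listed above maps onto standard scikit-learn components. The following is a minimal sketch of such a pipeline, not the Space's actual training code: the column names, toy data, and `max_iter` setting are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical feature names, for illustration only.
numeric_cols = ["age", "wbc"]
categorical_cols = ["sex"]

# Numeric variables: median imputation, then standard scaling.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical variables: most-frequent imputation, then one-hot encoding.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("encode", OneHotEncoder(handle_unknown="ignore")),
])

# Preprocessing and classifier live in one object, so inference
# applies exactly the transformations learned during fitting.
pipeline = Pipeline([
    ("preprocess", ColumnTransformer([
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Toy data with missing values to exercise both imputers.
X = pd.DataFrame({
    "age": [34, 61, np.nan, 47, 72, 55],
    "wbc": [5.1, 48.0, 12.3, np.nan, 90.5, 7.8],
    "sex": ["F", "M", "M", np.nan, "F", "M"],
})
y = [0, 1, 0, 0, 1, 1]

pipeline.fit(X, y)
proba = pipeline.predict_proba(X)[:, 1]  # one probability per row
```

Because the imputers, scaler, and encoder are fitted inside the pipeline, raw rows with missing values can be passed directly to `predict_proba`.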
---
## Files Included per Release
Each version folder contains:
### model.joblib
Complete scikit-learn pipeline including preprocessing, feature encoding, and the trained classifier.
Ready for immediate inference.
### meta.json
Structured metadata including:
* feature schema
* variable types
* evaluation metrics
* ROC/PR curve data
* calibration statistics
* confusion matrix
* decision curve analysis
* validation configuration
Together with `model.joblib`, this metadata supports reproducibility and downstream analysis without rerunning the Space.
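
The key names inside `meta.json` are not documented in this README, so the sketch below makes no assumption about them. It shows one way to read the file and to check a list of expected feature names against the columns of a new dataset; `load_meta` and `missing_features` are hypothetical helpers, not part of the repository.

```python
import json
from pathlib import Path


def load_meta(path):
    """Read a release's meta.json into a dictionary."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def missing_features(expected, available):
    """Return the expected feature names absent from `available`, in order."""
    available = set(available)
    return [name for name in expected if name not in available]


# Example usage (inspect the real top-level keys before relying on any):
#   meta = load_meta("latest/meta.json")
#   print(sorted(meta))
```

Once the key holding the feature schema has been identified by inspection, its contents can be passed to `missing_features` together with the columns of the dataset to be scored.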
---
## Evaluation Metrics Captured
Models are evaluated on held-out test data across four groups of metrics: discrimination, classification, calibration, and clinical utility.
### Discrimination
* ROC AUC
* ROC curve
* Precision–Recall curve
* Average Precision
### Classification
* Sensitivity (Recall)
* Specificity
* Precision
* F1 score
* Accuracy
* Balanced accuracy
* Confusion matrix
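
All of the threshold-dependent metrics above follow from the four cells of the confusion matrix. As a sketch using the standard definitions (the function name is hypothetical, and the Space may compute these with scikit-learn instead):

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive threshold-dependent metrics from binary confusion-matrix counts.

    Assumes at least one actual positive, one actual negative, and one
    predicted positive, so that no denominator is zero.
    """
    sensitivity = tp / (tp + fn)  # recall: share of deaths correctly flagged
    specificity = tn / (tn + fp)  # share of survivors correctly cleared
    precision = tp / (tp + fp)    # share of flagged patients who died
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }
```

For example, with `tp=30, fp=10, tn=50, fn=10` the sensitivity and precision are both 0.75 and the accuracy is 0.80.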
### Calibration
* Calibration (reliability) curve
* Brier score
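
The Brier score is the mean squared difference between the predicted probability and the observed 0/1 outcome, so lower is better. A minimal sketch of the standard definition (the function name is hypothetical):

```python
def brier_score(y_true, y_prob):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("y_true and y_prob must be non-empty and equally long")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)
```

A perfect model scores 0, and a model that always predicts 0.5 scores 0.25 regardless of the outcomes.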
### Clinical Utility
* Decision Curve Analysis (net benefit)
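
Decision curve analysis plots net benefit across a range of risk thresholds. The sketch below uses the standard net benefit definition, TP/n minus FP/n weighted by the odds of the threshold; whether the Space computes it in exactly this way is an assumption, and the function names are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if p >= threshold and y == 1)
    fp = sum(1 for y, p in pairs if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)
```

A model is useful at a given threshold when its net benefit exceeds both the treat-all value and zero, which is the net benefit of treating no one.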
---
## Repository Structure
```
releases/
└── <version>/
    ├── model.joblib
    └── meta.json
latest/
├── model.joblib
└── meta.json
README.md
```
* **releases/<version>/** → immutable historical snapshots
* **latest/** → most recent validated model
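
Given this layout, the in-repository path of any artifact can be built from the file name and an optional version. The helper below is a hypothetical convenience, and the version string `"v1"` in the usage comment is a placeholder, since the naming scheme for versions is not specified here.

```python
from pathlib import PurePosixPath

ARTIFACT_NAMES = ("model.joblib", "meta.json")


def artifact_path(filename, version=None):
    """Return the repo-relative path of an artifact.

    With no version, the path points at latest/; otherwise it points at
    the immutable snapshot under releases/<version>/.
    """
    if filename not in ARTIFACT_NAMES:
        raise ValueError(f"unknown artifact: {filename!r}")
    if version is None:
        folder = PurePosixPath("latest")
    else:
        folder = PurePosixPath("releases") / version
    return str(folder / filename)


# Example usage:
#   artifact_path("model.joblib")     # path under latest/
#   artifact_path("meta.json", "v1")  # path under releases/v1/
```

The resulting string can be passed as the `filename` argument of `huggingface_hub.hf_hub_download`, together with this repository's id, to fetch a single file. Pinning a `releases/<version>/` path rather than `latest/` keeps an analysis tied to one fixed model.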
---
## Intended Use
These artifacts are intended for:
* Clinical research
* Risk stratification studies
* Independent external validation
* Multi-center reproducibility testing
* Educational and exploratory analysis
---
## Not Intended For
These models:
* are not regulatory-approved medical devices
* do not replace clinician judgment
* should not be used for autonomous decision-making
* require local validation prior to clinical deployment
Clinical oversight is mandatory.
---
## Loading a Model
```python
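import joblib
import pandas as pd
model = joblib.load("model.joblib")  # full pipeline: preprocessing + classifier
X = pd.read_csv("patients.csv")  # hypothetical input file; columns must match the feature schema in meta.json
proba = model.predict_proba(X)[:, 1]  # probability of the second class in model.classes_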
```
No additional preprocessing is required.
---
## Author
Dr. Syed Naveed
Hematology & Oncology
Sheikh Shakhbout Medical City
Abu Dhabi, UAE
---
## License
Apache 2.0
---