---
license: apache-2.0
---

# Explainable Acute Leukemia Mortality Predictor – Model Repository

This repository contains the **trained machine learning model artifacts** generated by the
**Explainable Acute Leukemia Mortality Predictor** Hugging Face Space.

It serves exclusively as a **persistent storage and versioning registry** for models developed for:

**Mortality risk prediction in patients with acute leukemia using structured clinical data.**

This repository does **not** provide training code or an interactive interface; both live in the companion Space.

---

## Relationship to the Application

Model development, validation, and prediction occur in the companion Space:

**Synav/Explainable-Acute-Leukemia-Mortality-Predictor**

Because storage in Hugging Face Spaces is ephemeral by default, each trained model is automatically:

1. Saved
2. Versioned
3. Uploaded here
4. Preserved as permanent releases

This ensures:

* reproducibility
* auditability
* long-term persistence
* external validation capability

---

## Model Description

Each stored model has the following characteristics:

* **Task:** Binary mortality prediction (Yes/No)
* **Algorithm:** Logistic Regression (scikit-learn)
* **Output:** Probability of mortality (0–1)
* **Explainability:** SHAP feature attribution
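
For a linear model such as logistic regression, SHAP attributions have a simple closed form when features are treated as independent: each feature contributes its coefficient times its deviation from the background mean, in log-odds units. The sketch below illustrates that idea with NumPy on synthetic data. It is not the Space's actual SHAP code, and the data and variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic, already-numeric data standing in for preprocessed features (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

clf = LogisticRegression().fit(X, y)

# Linear SHAP under feature independence, in log-odds units:
# phi_j = coef_j * (x_j - mean_j)
background_mean = X.mean(axis=0)
x = X[0]
phi = clf.coef_[0] * (x - background_mean)

# Base value: the model's log-odds evaluated at the background mean.
base_value = clf.intercept_[0] + clf.coef_[0] @ background_mean

# The attributions plus the base value reconstruct the log-odds for this row.
log_odds = clf.decision_function(x.reshape(1, -1))[0]
```

Positive entries in `phi` push the prediction toward mortality and negative entries push it away, relative to an average patient in the background data.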

### Embedded preprocessing

Numeric variables

* median imputation
* standard scaling

Categorical variables

* most-frequent imputation
* one-hot encoding

All preprocessing steps are embedded within the pipeline to guarantee:

* identical inference behavior
* schema consistency
* zero manual preprocessing
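
As a rough illustration, a pipeline with the preprocessing steps listed above could be assembled in scikit-learn as follows. The column names, the `handle_unknown="ignore"` setting, `max_iter`, and the toy data are assumptions; the real feature schema is recorded in `meta.json`, and the Space's actual configuration may differ.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names, used only for this sketch.
numeric_cols = ["age", "wbc"]
categorical_cols = ["subtype"]

numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])
categorical = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

pipeline = Pipeline([
    ("preprocess", ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Tiny synthetic frame with missing values: raw data goes straight into the pipeline.
df = pd.DataFrame({
    "age": [34, 61, np.nan, 47, 72, 55],
    "wbc": [12.0, np.nan, 88.5, 5.3, 140.2, 30.1],
    "subtype": ["AML", "ALL", "AML", np.nan, "AML", "ALL"],
})
y = [0, 1, 0, 0, 1, 1]

pipeline.fit(df, y)
proba = pipeline.predict_proba(df)[:, 1]  # probability of class 1
```

Because imputation, scaling, and encoding are fitted inside the pipeline, the same transformations are replayed at inference time without any separate preprocessing code.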

---

## Files Included per Release

Each version folder contains:

### model.joblib

Complete scikit-learn pipeline including preprocessing, feature encoding, and the trained classifier.
Ready for immediate inference.

### meta.json

Structured metadata including:

* feature schema
* variable types
* evaluation metrics
* ROC/PR curve data
* calibration statistics
* confusion matrix
* decision curve analysis
* validation configuration

These artifacts enable full reproducibility and downstream analysis.
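
Since the exact key names inside `meta.json` are not documented here, a cautious first step is to inspect the file rather than assume its layout. The helper below is a minimal sketch that assumes only that the top level of the file is a JSON object; `describe_meta` is a hypothetical name.

```python
import json
from pathlib import Path


def describe_meta(path):
    """Return each top-level key of a meta.json file with the type of its value."""
    meta = json.loads(Path(path).read_text(encoding="utf-8"))
    return {key: type(value).__name__ for key, value in meta.items()}
```

For example, `describe_meta("latest/meta.json")` would return a dictionary mapping each top-level section to `"dict"`, `"list"`, `"float"`, and so on, which shows where the schema and metrics live before any downstream code relies on them.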

---

## Evaluation Metrics Captured

Models are evaluated on held-out test data using the discrimination, classification, calibration, and clinical-utility metrics listed below.

### Discrimination

* ROC AUC
* ROC curve
* Precision–Recall curve
* Average Precision
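
The discrimination metrics above can be recomputed from labels and predicted probabilities with scikit-learn. This is a sketch on toy values, not the Space's evaluation code, and the arrays are illustrative rather than real model output.

```python
import numpy as np
from sklearn.metrics import (
    average_precision_score,
    precision_recall_curve,
    roc_auc_score,
    roc_curve,
)

# Toy labels and predicted probabilities (illustrative only).
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])

roc_auc = roc_auc_score(y_true, y_prob)                # area under the ROC curve
fpr, tpr, roc_thresholds = roc_curve(y_true, y_prob)   # points on the ROC curve
precision, recall, pr_thresholds = precision_recall_curve(y_true, y_prob)
avg_precision = average_precision_score(y_true, y_prob)
```

On these four toy samples, `roc_auc` is 0.75 and `avg_precision` is about 0.83.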

### Classification

* Sensitivity (Recall)
* Specificity
* Precision
* F1 score
* Accuracy
* Balanced accuracy
* Confusion matrix
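
All of the classification metrics above follow from the confusion matrix at a chosen probability threshold. The sketch below uses toy values and an assumed cut-off of 0.5; the threshold used by the Space is not stated here.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Toy labels and predicted probabilities (illustrative only).
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.6, 0.3, 0.7, 0.1, 0.8, 0.4])

threshold = 0.5  # assumed cut-off
y_pred = (y_prob >= threshold).astype(int)

# With labels=[0, 1], ravel() yields the counts in the order tn, fp, fn, tp.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

sensitivity = tp / (tp + fn)   # recall for the positive class
specificity = tn / (tn + fp)
precision = tp / (tp + fp)
f1 = 2 * precision * sensitivity / (precision + sensitivity)
accuracy = (tp + tn) / (tp + tn + fp + fn)
balanced_accuracy = (sensitivity + specificity) / 2
```

Moving `threshold` trades sensitivity against specificity, which is why the threshold-independent curves are reported alongside these point metrics.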

### Calibration

* Calibration (reliability) curve
* Brier score
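
Both calibration measures are available in scikit-learn. The sketch below uses toy values and an assumed bin count of 2 to keep the output small; the binning used by the Space is not stated here.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import brier_score_loss

# Toy labels and predicted probabilities (illustrative only).
y_true = np.array([0, 0, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8])

# Brier score: mean squared difference between predicted probability and outcome.
brier = brier_score_loss(y_true, y_prob)

# Reliability curve: observed event rate vs. mean predicted probability per bin.
frac_pos, mean_pred = calibration_curve(y_true, y_prob, n_bins=2)
```

A well-calibrated model has `frac_pos` close to `mean_pred` in every bin, and a lower Brier score indicates better probabilistic predictions.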

### Clinical Utility

* Decision Curve Analysis (net benefit)
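
Decision curve analysis plots net benefit against the threshold probability, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold. The function below is a minimal sketch of that formula on toy data; `net_benefit` is a hypothetical helper, not the Space's implementation.

```python
import numpy as np


def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    y_true = np.asarray(y_true)
    treat = np.asarray(y_prob) >= threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))
    fp = np.sum(treat & (y_true == 0))
    return tp / n - fp / n * threshold / (1 - threshold)


# Toy labels and predicted probabilities (illustrative only).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.3, 0.7, 0.1, 0.8, 0.4]
thresholds = [0.1, 0.3, 0.5]

# Net benefit of the model, and of the "treat everyone" reference strategy.
model_curve = [net_benefit(y_true, y_prob, t) for t in thresholds]
treat_all = [net_benefit(y_true, np.ones(len(y_true)), t) for t in thresholds]
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all curve and zero, the net benefit of treating no one.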

---

## Repository Structure

```
releases/
  └── <version>/
      ├── model.joblib
      └── meta.json

latest/
  ├── model.joblib
  └── meta.json

README.md
```

* **`releases/<version>/`** → immutable historical snapshots
* **`latest/`** → most recent validated model
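
Given this layout, an artifact can be fetched programmatically with `hf_hub_download` from the `huggingface_hub` package. The sketch below is one possible approach: `artifact_path` and `download_artifact` are hypothetical helpers, and the repository id and version string must be supplied by the caller because they are not specified in this README.

```python
def artifact_path(filename, version=None):
    """Path of an artifact inside the repository.

    Uses latest/ when no version is given, otherwise releases/<version>/.
    """
    folder = "latest" if version is None else f"releases/{version}"
    return f"{folder}/{filename}"


def download_artifact(repo_id, filename, version=None):
    """Download one artifact from the Hub and return its local cache path."""
    # Imported here so that artifact_path works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=artifact_path(filename, version))
```

Calling `download_artifact(repo_id, "model.joblib")` with this repository's id returns a local path that can be passed to `joblib.load`. Pinning a `version` instead of relying on `latest/` keeps an analysis tied to one immutable snapshot.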

---

## Intended Use

These artifacts are intended for:

* Clinical research
* Risk stratification studies
* Independent external validation
* Multi-center reproducibility testing
* Educational and exploratory analysis

---

## Not Intended For

These models:

* are not regulatory-approved medical devices
* do not replace clinician judgment
* should not be used for autonomous decision-making
* require local validation prior to clinical deployment

Clinical oversight is mandatory.

---

## Loading a Model

```python
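import joblib
import pandas as pd

model = joblib.load("model.joblib")  # full pipeline: preprocessing + classifier
X = pd.read_csv("patients.csv")  # hypothetical file; columns must match the schema in meta.json
proba = model.predict_proba(X)[:, 1]  # probability of the positive (mortality) class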
```

No additional preprocessing is required: pass the raw feature columns described in `meta.json`.

---

## Author

Dr. Syed Naveed
Hematology & Oncology
Sheikh Shakhbout Medical City
Abu Dhabi, UAE

---

## License

Apache 2.0

---