---
title: Explainable-Acute-Leukemia-Mortality-Predictor
emoji: 🧬
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: apache-2.0
---

# Explainable Acute Leukemia Mortality Predictor

**Explainable Acute Leukemia Mortality Predictor** is an interactive, end-to-end clinical machine-learning platform for building, validating, and deploying **transparent, interpretable mortality prediction models** for patients with **acute leukemia**.

The system combines the following in a single workflow designed for **clinicians and clinical researchers**:

- Statistical modeling
- Explainable AI (SHAP)
- Bootstrap internal validation
- External clinical validation

This tool enables rapid development of **clinically trustworthy, publication-grade models** without requiring programming expertise.

---

## ⭐ Quick Start – Single Patient Mortality Prediction

To **predict mortality probability for an individual patient**:

1. Open the **Predict + SHAP (2️⃣ Predict)** tab
2. Enter patient details across:
   - **Core**
   - **Clinical (Yes/No)**
   - **NGS**
   - **FISH**
3. Click **Predict single patient**

The system will automatically generate:
- Predicted mortality probability (0–1)
- Risk band (Low / Intermediate / High)
- SHAP explanation showing which variables contributed most to the prediction
- Downloadable results and plots

This enables **transparent, patient-level, clinically interpretable risk estimation** in seconds.

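The risk band is a thresholded view of the predicted probability. The sketch below shows one way such a mapping could work. The function name `risk_band` and both cutoffs (0.33 and 0.67) are illustrative assumptions; the cutoffs the app actually uses are not stated in this README.

```python
def risk_band(probability, low_cutoff=0.33, high_cutoff=0.67):
    """Map a predicted mortality probability to a risk band.

    Both cutoffs are placeholder assumptions, not the app's real thresholds.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < low_cutoff:
        return "Low"
    if probability < high_cutoff:
        return "Intermediate"
    return "High"


# Example: band three hypothetical patients.
bands = [risk_band(p) for p in (0.10, 0.50, 0.90)]
```

In practice the cutoffs should come from the clinical question being asked, not from defaults like these.
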
## Core Capabilities

### Model Development
- Logistic regression–based pipelines (scikit-learn)
- Automatic preprocessing:
  - Numeric → median imputation + scaling
  - Categorical → most-frequent imputation + one-hot encoding
- Schema-aware training directly from Excel
- Optional L1 feature selection
- Optional dimensionality reduction (SVD)

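The preprocessing and model steps listed above map naturally onto a scikit-learn `ColumnTransformer` inside a `Pipeline`. This is a minimal sketch under assumptions, not the app's actual code: the toy data and every column name except `Outcome Event` are invented, and the optional L1 selection and SVD steps are left out.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy cohort with missing values (hypothetical columns and values).
df = pd.DataFrame({
    "Age": [34, 61, np.nan, 47, 72, 55, 29, 66],
    "WBC": [12.0, 85.5, 40.2, np.nan, 150.3, 9.8, 22.1, 98.7],
    "Cytogenetic risk": pd.Series(
        ["Favorable", "Adverse", "Intermediate", np.nan,
         "Adverse", "Favorable", "Intermediate", "Adverse"],
        dtype=object,
    ),
    "Outcome Event": [0, 1, 0, 0, 1, 0, 0, 1],
})

numeric_cols = ["Age", "WBC"]
categorical_cols = ["Cytogenetic risk"]

# Numeric: median imputation, then scaling.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical: most-frequent imputation, then one-hot encoding.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

preprocess = ColumnTransformer([
    ("num", numeric_steps, numeric_cols),
    ("cat", categorical_steps, categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

X = df.drop(columns=["Outcome Event"])
y = df["Outcome Event"]
model.fit(X, y)

# Predicted mortality probability for each patient (positive class).
proba = model.predict_proba(X)[:, 1]
```

Keeping imputation and scaling inside the pipeline means they are refitted on every training resample, which avoids leaking information from evaluation patients.
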
---

### Explainability (Transparent AI)
- SHAP-based local explanations for each patient
- Global feature importance (bar + beeswarm)
- Waterfall plots for single predictions
- Variable-level contribution tracking
- Fully auditable predictions from raw inputs → probability

Designed for **clinical interpretability**, not black-box modeling.

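For a linear model such as logistic regression, per-feature contributions have a simple closed form in log-odds space when features are treated as independent: coefficient × (patient value − background mean). The sketch below computes this with NumPy alone to show why such predictions are auditable. It does not call the `shap` package, and the synthetic data and variable names are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic training data: three features, outcome driven by the first two.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
true_logits = 1.5 * X[:, 0] - 1.0 * X[:, 1]
y = (rng.random(200) < 1 / (1 + np.exp(-true_logits))).astype(int)

clf = LogisticRegression().fit(X, y)
coef = clf.coef_[0]

# Base value: the model's log-odds at the background (training) mean.
background_mean = X.mean(axis=0)
base_value = clf.intercept_[0] + coef @ background_mean

# Per-feature contributions for one patient, in log-odds units.
patient = X[0]
contributions = coef * (patient - background_mean)

# Base value plus contributions reproduces the model's log-odds exactly.
patient_logit = base_value + contributions.sum()
patient_prob = 1 / (1 + np.exp(-patient_logit))
```

Because the contributions sum back to the model output, each prediction can be traced from raw inputs to the final probability, which is what a waterfall plot visualizes.
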
---

## Validation Framework (Clinical-Grade)

Unlike typical ML demos, this framework implements **rigorous statistical validation appropriate for clinical research**.

### Discrimination
- ROC AUC
- ROC curves
- Precision–Recall curves
- Average Precision (PR-AUC)

### Calibration
- Reliability (calibration) curves
- Brier score

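The discrimination and calibration metrics above are all available in scikit-learn. The following sketch computes them on a small hand-made set of labels and probabilities; the data is invented for illustration and the variable names are assumptions.

```python
import numpy as np
from sklearn.calibration import calibration_curve
from sklearn.metrics import (
    average_precision_score,
    brier_score_loss,
    roc_auc_score,
)

# Hypothetical outcomes (1 = died) and predicted mortality probabilities.
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.10, 0.35, 0.80, 0.20, 0.65, 0.90, 0.55, 0.40, 0.15, 0.75])

# Discrimination: how well the scores rank deaths above survivors.
auc = roc_auc_score(y_true, y_prob)
avg_precision = average_precision_score(y_true, y_prob)

# Calibration: how closely predicted probabilities match observed rates.
brier = brier_score_loss(y_true, y_prob)
observed_rate, mean_predicted = calibration_curve(y_true, y_prob, n_bins=5)
```

Plotting `mean_predicted` against `observed_rate` gives the reliability curve; points near the diagonal indicate good calibration.
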
### Clinical Utility
- Decision Curve Analysis (net benefit)

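Decision curve analysis plots net benefit against the threshold probability, where net benefit = TP/n − (FP/n) × pt/(1 − pt). The sketch below implements that standard formula in plain Python. The function names and sample data are illustrative assumptions, not the app's code.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: act on every patient regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.10, 0.35, 0.80, 0.20, 0.65, 0.90, 0.55, 0.40, 0.15, 0.75]

model_nb = net_benefit(y_true, y_prob, threshold=0.5)
treat_all_nb = net_benefit_treat_all(y_true, threshold=0.5)
```

A model is clinically useful at a given threshold when its net benefit exceeds both the treat-all value and zero (the treat-none strategy).
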
### Threshold Metrics
- Sensitivity / specificity
- F1 score
- Balanced accuracy
- Confusion matrix
- Optimal threshold selection

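All of the threshold metrics derive from the four confusion-matrix counts at a chosen cutoff. The sketch below computes them in plain Python and picks a cutoff by maximizing Youden's J (sensitivity + specificity − 1). Youden's J is an assumption made for illustration, since this README does not say which criterion the app uses; the function names are also hypothetical.

```python
def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, tn, fn) when predicting positive at p >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def threshold_metrics(y_true, y_prob, threshold):
    """Sensitivity, specificity, F1, and balanced accuracy at one cutoff."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


def youden_threshold(y_true, y_prob):
    """Cutoff maximizing Youden's J (assumed criterion, see lead-in)."""
    def youden_j(t):
        m = threshold_metrics(y_true, y_prob, t)
        return m["sensitivity"] + m["specificity"] - 1
    return max(sorted(set(y_prob)), key=youden_j)


# Hypothetical outcomes and predicted probabilities.
y_true = [0, 0, 1, 0, 1, 1, 0, 1, 0, 1]
y_prob = [0.10, 0.35, 0.80, 0.20, 0.65, 0.90, 0.55, 0.40, 0.15, 0.75]

metrics_at_half = threshold_metrics(y_true, y_prob, 0.5)
best_cutoff = youden_threshold(y_true, y_prob)
```

Youden's J weights sensitivity and specificity equally; a clinical application may prefer a criterion that reflects the relative cost of missed deaths and false alarms.
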
---

## Internal Validation (Bootstrapping)

The platform supports **bootstrap out-of-bag (OOB) internal validation**, which is preferred over simple train/test splits for small clinical datasets because a single split leaves too few patients for either training or testing.

For each bootstrap iteration:
1. Resample patients with replacement
2. Train on the bootstrap sample
3. Evaluate on out-of-bag patients
4. Record the metrics for aggregation across all iterations

Outputs include:
- Mean metrics
- Median metrics
- 95% confidence intervals
- Per-iteration results (downloadable CSV)

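The bootstrap loop above can be sketched in a few lines of NumPy and scikit-learn. This is an illustration under assumptions, not the app's implementation: it tracks ROC AUC only, uses percentile intervals for the 95% bounds (the README does not state the interval method), and the function name `bootstrap_oob_auc` and the synthetic data are invented.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score


def bootstrap_oob_auc(X, y, n_iterations=200, seed=0):
    """Out-of-bag bootstrap estimate of ROC AUC with a percentile 95% CI."""
    rng = np.random.default_rng(seed)
    n = len(y)
    aucs = []
    for _ in range(n_iterations):
        # 1. Resample patient indices with replacement.
        boot_idx = rng.integers(0, n, size=n)
        # Patients never drawn form the out-of-bag evaluation set.
        oob_mask = np.ones(n, dtype=bool)
        oob_mask[boot_idx] = False
        # Skip resamples where either set lacks one of the two classes.
        if len(np.unique(y[boot_idx])) < 2 or len(np.unique(y[oob_mask])) < 2:
            continue
        # 2. Train on the bootstrap sample.
        model = LogisticRegression(max_iter=1000).fit(X[boot_idx], y[boot_idx])
        # 3. Evaluate on out-of-bag patients.
        oob_prob = model.predict_proba(X[oob_mask])[:, 1]
        aucs.append(roc_auc_score(y[oob_mask], oob_prob))
    # 4. Aggregate across iterations.
    aucs = np.asarray(aucs)
    lower, upper = np.percentile(aucs, [2.5, 97.5])
    return {
        "mean": float(aucs.mean()),
        "median": float(np.median(aucs)),
        "ci95": (float(lower), float(upper)),
        "per_iteration": aucs,
    }


# Synthetic cohort: 120 patients, two features, outcome tied to the first.
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(120, 2))
y_demo = (X_demo[:, 0] + 0.5 * rng.normal(size=120) > 0).astype(int)

summary = bootstrap_oob_auc(X_demo, y_demo, n_iterations=100)
```

Each model is scored only on patients it never saw during fitting, which is what keeps the estimate from being optimistic.
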
This provides:
- Performance estimates that do not hinge on a single train/test split
- Reduced optimism bias, since each model is scored on patients it did not see
- Uncertainty bounds taken from the bootstrap distribution of each metric

Suitable for **peer-reviewed publication** and **clinical methodology studies**.

---

## External Validation

Independent cohorts can be evaluated directly:

- Automatic probability generation
- Full metrics computation
- ROC / PR / calibration / decision curves
- Patient-level prediction export

Prediction CSVs can be used to generate **publication-quality NEJM-style figure panels**.

---

## Deployment & Versioning
- One-click publishing to Hugging Face Model Hub
- Timestamped immutable releases
- Automatic `latest/` tracking
- Portable artifacts:
  - `model.joblib`
  - `meta.json` (schema + metrics + bootstrap results)

Models can be reused on any Excel file whose column names match those used in training.

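Reusing a published model comes down to loading the two artifacts and checking that the new data has the expected columns. The sketch below shows that round trip with `joblib` and `json`. The `feature_columns` key in `meta.json`, the helper `missing_columns`, and the toy columns are all hypothetical, since the actual layout of `meta.json` is not documented here.

```python
import json
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression


def missing_columns(expected, df):
    """Return the expected column names that are absent from df."""
    return [col for col in expected if col not in df.columns]


# Train a toy model (hypothetical columns other than "Outcome Event").
train = pd.DataFrame({
    "Age": [34, 61, 52, 47, 72, 55, 29, 66],
    "WBC": [12.0, 85.5, 40.2, 18.4, 150.3, 9.8, 22.1, 98.7],
    "Outcome Event": [0, 1, 0, 0, 1, 0, 0, 1],
})
features = ["Age", "WBC"]
model = LogisticRegression(max_iter=1000).fit(train[features], train["Outcome Event"])

with tempfile.TemporaryDirectory() as tmp:
    release_dir = Path(tmp)
    # Save the artifacts; "feature_columns" is an assumed meta.json key.
    joblib.dump(model, release_dir / "model.joblib")
    (release_dir / "meta.json").write_text(json.dumps({"feature_columns": features}))

    # Later, possibly on another machine: load both artifacts back.
    loaded_model = joblib.load(release_dir / "model.joblib")
    meta = json.loads((release_dir / "meta.json").read_text())

# A new cohort must provide every column the model was trained on.
new_cohort = pd.DataFrame({"Age": [50, 63], "WBC": [30.0, 110.0]})
missing = missing_columns(meta["feature_columns"], new_cohort)
if missing:
    raise ValueError(f"New file is missing columns: {missing}")

probabilities = loaded_model.predict_proba(new_cohort[meta["feature_columns"]])[:, 1]
```

Selecting columns by the stored names also fixes their order, so a file with the right columns in a different order still scores correctly.
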
---

## Workflow

### Training
1. Upload a labeled Excel file (outcome column: `Outcome Event`)
2. Select variable types
3. Train the model
4. Review discrimination + calibration metrics
5. Run bootstrap internal validation (recommended)
6. Publish the versioned model

### Prediction / Validation
1. Load a trained model
2. Upload a new Excel file
3. Generate probabilities + risk bands
4. Run external validation (if labels are present)
5. Export results and figures

---

## Intended Users
- Hematology–Oncology clinicians
- Clinical researchers
- Epidemiologists
- Outcomes researchers
- Medical AI investigators

No coding required.

---

## Intended Use

This platform supports:
- Clinical research
- Prognostic modeling
- Explainable AI development
- Educational and methodological purposes

**Not a medical device. Not for autonomous clinical decision-making.**

Clinical judgment must always prevail.

---

## Design Philosophy

This project prioritizes:

- Interpretability over black-box performance
- Statistical rigor over optimistic metrics
- Reproducibility over ad-hoc experimentation
- Clinical relevance over purely technical novelty
- Transparency over opacity

Every prediction must be explainable and defensible.

---

## Technical Stack
- Python
- Streamlit
- scikit-learn
- SHAP
- Matplotlib
- Hugging Face Spaces + Model Hub

---

## Author

Developed and maintained by **Dr. Syed Naveed**, Hematology–Oncology Clinician & Researcher.

Focus areas:
- Explainable AI in hematology
- Clinical machine learning validation
- Translational AI for real-world patient care

---

## License
Apache 2.0

---

For configuration details, see https://huggingface.co/docs/hub/spaces-config-reference