---
title: Explainable-Acute-Leukemia-Mortality-Predictor
emoji: 🧬
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: apache-2.0
---
# Explainable Acute Leukemia Mortality Predictor
Explainable Acute Leukemia Mortality Predictor is an interactive, end-to-end clinical machine-learning platform for building, validating, and deploying transparent, interpretable mortality prediction models for patients with acute leukemia.
The system integrates the following into a single workflow specifically designed for clinicians and clinical researchers:

- Statistical modeling
- Explainable AI (SHAP)
- Bootstrap internal validation
- External clinical validation
This tool enables rapid development of clinically trustworthy, publication-grade models without requiring programming expertise.
## ⭐ Quick Start – Single Patient Mortality Prediction
To predict mortality probability for an individual patient:
- Open the Predict + SHAP (2️⃣ Predict) tab
- Enter patient details across:
  - Core
  - Clinical (Yes/No)
  - NGS
  - FISH
- Click Predict single patient
The system will automatically generate:
- Predicted mortality probability (0–1)
- Risk band (Low / Intermediate / High)
- SHAP explanation showing which variables contributed most to the prediction
- Downloadable results and plots
This enables transparent, patient-level, clinically interpretable risk estimation in seconds.
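As a sketch of the risk-band step, the predicted probability can be mapped to a band with a simple threshold function. The 0.33 / 0.66 cut-points below are illustrative assumptions only, not the app's actual boundaries:

```python
def risk_band(probability: float, low: float = 0.33, high: float = 0.66) -> str:
    """Map a predicted mortality probability (0-1) to a risk band.

    The 0.33 / 0.66 cut-points are illustrative placeholders.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability < low:
        return "Low"
    if probability < high:
        return "Intermediate"
    return "High"
```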
## Core Capabilities

### Model Development
- Logistic regression–based pipelines (scikit-learn)
- Automatic preprocessing:
  - Numeric → median imputation + scaling
  - Categorical → most-frequent imputation + one-hot encoding
- Schema-aware training directly from Excel
- Optional L1 feature selection
- Optional dimensionality reduction (SVD)
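The preprocessing and modeling steps above can be sketched with scikit-learn. `build_pipeline` and its column arguments are illustrative names, and the solver/penalty settings are assumptions about how the optional L1 selection is wired in:

```python
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

def build_pipeline(numeric_cols, categorical_cols):
    # Numeric branch: median imputation followed by scaling.
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    # Categorical branch: most-frequent imputation followed by one-hot encoding.
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ])
    pre = ColumnTransformer([
        ("num", numeric, numeric_cols),
        ("cat", categorical, categorical_cols),
    ])
    # penalty="l1" with a liblinear solver gives embedded L1 feature selection.
    clf = LogisticRegression(penalty="l1", solver="liblinear", max_iter=1000)
    return Pipeline([("pre", pre), ("clf", clf)])
```

Because the preprocessing lives inside the pipeline, the same object can be fit on a training DataFrame and then reused unchanged on new cohorts.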
### Explainability (Transparent AI)
- SHAP-based local explanations for each patient
- Global feature importance (bar + beeswarm)
- Waterfall plots for single predictions
- Variable-level contribution tracking
- Fully auditable predictions from raw inputs → probability
Designed for clinical interpretability, not black-box modeling.
## Validation Framework (Clinical-Grade)
Unlike typical ML demos, this framework implements rigorous statistical validation appropriate for clinical research.
### Discrimination
- ROC AUC
- ROC curves
- Precision–Recall curves
- Average Precision (PR-AUC)
### Calibration
- Reliability (calibration) curves
- Brier score
### Clinical Utility
- Decision Curve Analysis (net benefit)
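Decision Curve Analysis reduces to computing net benefit at each threshold: NB = TP/N - (FP/N) * pt / (1 - pt), following Vickers & Elkin. A minimal sketch, with `net_benefit` as a hypothetical helper:

```python
import numpy as np

def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients with predicted risk >= threshold.

    NB = TP/N - (FP/N) * threshold / (1 - threshold)
    """
    y_true = np.asarray(y_true)
    treat = np.asarray(y_prob) >= threshold   # who crosses the treatment threshold
    n = len(y_true)
    tp = np.sum(treat & (y_true == 1))        # true positives among the treated
    fp = np.sum(treat & (y_true == 0))        # false positives among the treated
    return tp / n - (fp / n) * threshold / (1 - threshold)
```

Sweeping the threshold and plotting net benefit against the treat-all and treat-none baselines yields the decision curve.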
### Threshold Metrics
- Sensitivity / specificity
- F1 score
- Balanced accuracy
- Confusion matrix
- Optimal threshold selection
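These discrimination, calibration, and threshold metrics map directly onto scikit-learn. The example below uses toy data, and Youden's J for threshold selection, which may differ from the app's actual rule:

```python
import numpy as np
from sklearn.metrics import (
    average_precision_score,
    balanced_accuracy_score,
    brier_score_loss,
    confusion_matrix,
    f1_score,
    roc_auc_score,
    roc_curve,
)

# Toy labels and predicted probabilities.
y_true = np.array([0, 0, 1, 1, 0, 1, 0, 1])
y_prob = np.array([0.1, 0.3, 0.7, 0.8, 0.4, 0.9, 0.2, 0.6])

auc = roc_auc_score(y_true, y_prob)            # discrimination (ROC AUC)
ap = average_precision_score(y_true, y_prob)   # PR-AUC
brier = brier_score_loss(y_true, y_prob)       # calibration (lower is better)

# Youden's J (sensitivity + specificity - 1) as one threshold-selection rule.
fpr, tpr, thresholds = roc_curve(y_true, y_prob)
best = thresholds[np.argmax(tpr - fpr)]
y_pred = (y_prob >= best).astype(int)

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
f1 = f1_score(y_true, y_pred)
balanced = balanced_accuracy_score(y_true, y_pred)
```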
## Internal Validation (Bootstrapping)
The platform supports bootstrap out-of-bag (OOB) internal validation, which is preferred over simple train/test splits for small clinical datasets.
For each bootstrap iteration:
- Resample patients with replacement
- Train on the bootstrap sample
- Evaluate on out-of-bag patients
- Aggregate performance
Outputs include:
- Mean metrics
- Median metrics
- 95% confidence intervals
- Per-iteration results (downloadable CSV)
This provides:
- Robust performance estimates
- Reduced optimism bias
- Statistically reliable uncertainty bounds
Suitable for peer-reviewed publication and clinical methodology studies.
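The bootstrap OOB loop can be sketched as follows. A bare logistic model stands in for the full pipeline, and `n_boot` and the percentile CI are standard choices rather than necessarily the app's defaults:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

def bootstrap_oob_auc(X, y, n_boot=200, seed=0):
    """Bootstrap out-of-bag internal validation for one metric (ROC AUC)."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    n = len(y)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)         # resample patients with replacement
        oob = np.setdiff1d(np.arange(n), idx)    # out-of-bag patients
        # Skip degenerate draws (no OOB cases, or a single class in either set).
        if len(oob) == 0 or len(np.unique(y[idx])) < 2 or len(np.unique(y[oob])) < 2:
            continue
        model = LogisticRegression(max_iter=1000).fit(X[idx], y[idx])
        aucs.append(roc_auc_score(y[oob], model.predict_proba(X[oob])[:, 1]))
    aucs = np.array(aucs)
    lo, hi = np.percentile(aucs, [2.5, 97.5])    # 95% percentile confidence interval
    return aucs.mean(), np.median(aucs), (lo, hi)
```

The per-iteration `aucs` array is what would be exported as the downloadable CSV of iteration-level results.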
## External Validation
Independent cohorts can be evaluated directly:
- Automatic probability generation
- Full metrics computation
- ROC / PR / calibration / decision curves
- Patient-level prediction export
Prediction CSVs can be used to generate publication-quality NEJM-style figure panels.
## Deployment & Versioning
- One-click publishing to Hugging Face Model Hub
- Timestamped immutable releases
- Automatic `latest` tracking
- Portable artifacts: `model.joblib` and `meta.json` (schema + metrics + bootstrap results)
Models can be reused on any Excel file with identical column names.
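Reusing a published artifact pair might look like the sketch below; the file names follow the artifact list above, while `check_schema` and the `meta["schema"]["columns"]` layout are assumptions about the metadata format:

```python
import pandas as pd

def check_schema(df: pd.DataFrame, expected_columns) -> None:
    """Fail fast when a new cohort lacks columns the model was trained on."""
    missing = set(expected_columns) - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")

# Illustrative reuse of a published artifact pair (joblib / json loading
# omitted from the runnable part):
#   model = joblib.load("model.joblib")
#   meta = json.load(open("meta.json"))
#   cohort = pd.read_excel("new_cohort.xlsx")
#   check_schema(cohort, meta["schema"]["columns"])  # assumed meta.json layout
#   probs = model.predict_proba(cohort)[:, 1]
```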
## Workflow

### Training
- Upload a labeled Excel file (with an `Outcome Event` column)
- Select variable types
- Train model
- Review discrimination + calibration metrics
- Run bootstrap internal validation (recommended)
- Publish versioned model
### Prediction / Validation
- Load trained model
- Upload new Excel
- Generate probabilities + risk bands
- Run external validation (if labels present)
- Export results and figures
## Intended Users
- Hematology–Oncology clinicians
- Clinical researchers
- Epidemiologists
- Outcomes researchers
- Medical AI investigators
No coding required.
## Intended Use
This platform supports:
- Clinical research
- Prognostic modeling
- Explainable AI development
- Educational and methodological purposes
Not a medical device. Not for autonomous clinical decision-making.
Clinical judgment must always prevail.
## Design Philosophy
This project prioritizes:
- Interpretability over black-box performance
- Statistical rigor over optimistic metrics
- Reproducibility over ad-hoc experimentation
- Clinical relevance over purely technical novelty
- Transparency over opacity
Every prediction must be explainable and defensible.
## Technical Stack
- Python
- Streamlit
- scikit-learn
- SHAP
- Matplotlib
- Hugging Face Spaces + Model Hub
## Author
Developed and maintained by
Dr. Syed Naveed
Hematology–Oncology Clinician & Researcher
Focus areas:
- Explainable AI in hematology
- Clinical machine learning validation
- Translational AI for real-world patient care
## License
Apache 2.0
For Space configuration details, see:
https://huggingface.co/docs/hub/spaces-config-reference