---
title: Explainable-Acute-Leukemia-Mortality-Predictor
emoji: 🧬
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
license: apache-2.0
---

Explainable Acute Leukemia Mortality Predictor

Explainable Acute Leukemia Mortality Predictor is an interactive, end-to-end clinical machine-learning platform for building, validating, and deploying transparent, interpretable mortality prediction models for patients with acute leukemia.

The system integrates:

  • Statistical modeling
  • Explainable AI (SHAP)
  • Bootstrap internal validation
  • External clinical validation

into a single workflow specifically designed for clinicians and clinical researchers.

This tool enables rapid development of clinically trustworthy, publication-grade models without requiring programming expertise.


⭐ Quick Start – Single Patient Mortality Prediction

To predict mortality probability for an individual patient:

  1. Open the Predict + SHAP (2️⃣ Predict) tab
  2. Enter patient details across:
    • Core
    • Clinical (Yes/No)
    • NGS
    • FISH
  3. Click Predict single patient

The system will automatically generate:

  • Predicted mortality probability (0–1)
  • Risk band (Low / Intermediate / High)
  • SHAP explanation showing which variables contributed most to the prediction
  • Downloadable results and plots

This enables transparent, patient-level, clinically interpretable risk estimation in seconds.
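The probability-to-risk-band mapping above can be sketched as a simple threshold function. The 0.30 / 0.60 cutoffs below are illustrative assumptions, not the app's actual thresholds:

```python
# Hypothetical risk-band mapping; the 0.30 / 0.60 cutoffs are illustrative
# assumptions, not the application's configured thresholds.
def risk_band(probability: float) -> str:
    """Map a predicted mortality probability (0-1) to a risk band."""
    if probability < 0.30:
        return "Low"
    if probability < 0.60:
        return "Intermediate"
    return "High"

print(risk_band(0.12))  # Low
print(risk_band(0.45))  # Intermediate
print(risk_band(0.81))  # High
```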

Core Capabilities

Model Development

  • Logistic regression–based pipelines (scikit-learn)
  • Automatic preprocessing:
    • Numeric → median imputation + scaling
    • Categorical → most-frequent imputation + one-hot encoding
  • Schema-aware training directly from Excel
  • Optional L1 feature selection
  • Optional dimensionality reduction (SVD)
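The preprocessing described above maps naturally onto a scikit-learn pipeline. This is a minimal sketch, not the project's exact code; the column names are illustrative placeholders:

```python
# Sketch of the preprocessing + logistic-regression pipeline described above.
# Column names ("age", "wbc", "fish_result") are illustrative assumptions.
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

numeric_cols = ["age", "wbc"]       # numeric -> median imputation + scaling
categorical_cols = ["fish_result"]  # categorical -> most-frequent + one-hot

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), numeric_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), categorical_cols),
])

model = Pipeline([
    ("preprocess", preprocess),
    # An L1 penalty provides the optional embedded feature selection
    ("clf", LogisticRegression(penalty="l1", solver="liblinear", max_iter=1000)),
])
```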

Explainability (Transparent AI)

  • SHAP-based local explanations for each patient
  • Global feature importance (bar + beeswarm)
  • Waterfall plots for single predictions
  • Variable-level contribution tracking
  • Fully auditable predictions from raw inputs → probability

Designed for clinical interpretability, not black-box modeling.
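For a linear model such as logistic regression, SHAP values have a closed form: each variable's contribution is its coefficient times its deviation from the background mean, on the log-odds scale. The sketch below mirrors what a linear SHAP explainer computes (toy data, not the app's pipeline):

```python
# For a linear model, interventional SHAP values reduce to
# coef * (x - background mean) on the log-odds scale.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

clf = LogisticRegression().fit(X, y)

# One row of per-variable contributions per patient
shap_values = clf.coef_[0] * (X - X.mean(axis=0))

# Base value (log-odds at the background mean) plus the contributions
# reconstructs each patient's log-odds exactly: fully auditable.
base_value = clf.decision_function(X.mean(axis=0, keepdims=True))[0]
logits = base_value + shap_values.sum(axis=1)
```

Because the contributions sum exactly to the model's log-odds, every prediction can be traced from raw inputs to probability.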


Validation Framework (Clinical-Grade)

Unlike typical ML demos, this framework implements rigorous statistical validation appropriate for clinical research.

Discrimination

  • ROC AUC
  • ROC curves
  • Precision–Recall curves
  • Average Precision (PR-AUC)

Calibration

  • Reliability (calibration) curves
  • Brier score

Clinical Utility

  • Decision Curve Analysis (net benefit)

Threshold Metrics

  • Sensitivity / specificity
  • F1 score
  • Balanced accuracy
  • Confusion matrix
  • Optimal threshold selection
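The metrics above can be computed with scikit-learn; this sketch uses toy labels and probabilities, and includes the standard decision-curve net-benefit formula:

```python
# Discrimination, calibration, and threshold metrics on toy data.
import numpy as np
from sklearn.metrics import (
    average_precision_score, balanced_accuracy_score, brier_score_loss,
    confusion_matrix, f1_score, roc_auc_score,
)

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])
y_prob = np.array([0.1, 0.3, 0.8, 0.6, 0.9, 0.2, 0.4, 0.5])

auc = roc_auc_score(y_true, y_prob)           # discrimination (ROC AUC)
ap = average_precision_score(y_true, y_prob)  # PR-AUC
brier = brier_score_loss(y_true, y_prob)      # calibration

threshold = 0.5
y_pred = (y_prob >= threshold).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
f1 = f1_score(y_true, y_pred)
bal_acc = balanced_accuracy_score(y_true, y_pred)

# Decision-curve net benefit at probability threshold p_t:
#   net_benefit = TP/N - FP/N * p_t / (1 - p_t)
n = len(y_true)
net_benefit = tp / n - fp / n * threshold / (1 - threshold)
```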

Internal Validation (Bootstrapping)

The platform supports bootstrap out-of-bag (OOB) internal validation, which is preferred over simple train/test splits for small clinical datasets.

For each bootstrap iteration:

  1. Resample patients with replacement
  2. Train on the bootstrap sample
  3. Evaluate on out-of-bag patients
  4. Aggregate performance

Outputs include:

  • Mean metrics
  • Median metrics
  • 95% confidence intervals
  • Per-iteration results (downloadable CSV)
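The bootstrap loop and its aggregated outputs can be sketched as follows (toy data and a plain logistic regression stand in for the real pipeline):

```python
# Bootstrap out-of-bag (OOB) validation sketch, following steps 1-4 above.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = (X[:, 0] + rng.normal(scale=0.5, size=200) > 0).astype(int)

n = len(y)
aucs = []
for _ in range(200):
    boot = rng.integers(0, n, size=n)        # 1. resample with replacement
    oob = np.setdiff1d(np.arange(n), boot)   #    patients never drawn = OOB set
    if len(np.unique(y[oob])) < 2:
        continue                             # AUC needs both classes present
    clf = LogisticRegression().fit(X[boot], y[boot])   # 2. train on sample
    oob_prob = clf.predict_proba(X[oob])[:, 1]         # 3. evaluate on OOB
    aucs.append(roc_auc_score(y[oob], oob_prob))

# 4. aggregate: mean, median, and percentile 95% confidence interval
mean_auc = np.mean(aucs)
median_auc = np.median(aucs)
ci_low, ci_high = np.percentile(aucs, [2.5, 97.5])
```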

This provides:

  • Robust performance estimates
  • Reduced optimism bias
  • Statistically reliable uncertainty bounds

Suitable for peer-reviewed publication and clinical methodology studies.


External Validation

Independent cohorts can be evaluated directly:

  • Automatic probability generation
  • Full metrics computation
  • ROC / PR / calibration / decision curves
  • Patient-level prediction export

Prediction CSVs can be used to generate publication-quality NEJM-style figure panels.


Deployment & Versioning

  • One-click publishing to Hugging Face Model Hub
  • Timestamped immutable releases
  • Automatic latest/ tracking
  • Portable artifacts:
    • model.joblib
    • meta.json (schema + metrics + bootstrap results)

Models can be reused on any Excel file with identical column names.
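Reusing the portable artifacts might look like the sketch below. The file names come from the artifact list above, but the `meta.json` key (`"columns"`) is an assumed schema, and a toy model stands in for a published one:

```python
# Sketch of saving and reloading portable artifacts; the "columns" key in
# meta.json is an assumption about its schema, not the documented format.
import json
import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Toy stand-in for a published model and its metadata
train = pd.DataFrame({"age": [50, 60, 70, 40], "wbc": [5.0, 12.0, 3.0, 8.0]})
clf = LogisticRegression().fit(train, [0, 1, 1, 0])
joblib.dump(clf, "model.joblib")
with open("meta.json", "w") as f:
    json.dump({"columns": list(train.columns)}, f)

# Reuse on a new cohort with identical column names (order may differ)
model = joblib.load("model.joblib")
with open("meta.json") as f:
    meta = json.load(f)

new_cohort = pd.DataFrame({"wbc": [6.0, 10.0], "age": [55, 65]})
X = new_cohort[meta["columns"]]        # align to the training schema
probs = model.predict_proba(X)[:, 1]   # predicted mortality probabilities
```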


Workflow

Training

  1. Upload a labeled Excel file containing the Outcome Event column
  2. Select variable types
  3. Train model
  4. Review discrimination + calibration metrics
  5. Run bootstrap internal validation (recommended)
  6. Publish versioned model

Prediction / Validation

  1. Load trained model
  2. Upload new Excel
  3. Generate probabilities + risk bands
  4. Run external validation (if labels present)
  5. Export results and figures

Intended Users

  • Hematology–Oncology clinicians
  • Clinical researchers
  • Epidemiologists
  • Outcomes researchers
  • Medical AI investigators

No coding required.


Intended Use

This platform supports:

  • Clinical research
  • Prognostic modeling
  • Explainable AI development
  • Educational and methodological purposes

Not a medical device. Not for autonomous clinical decision-making.

Clinical judgment must always prevail.


Design Philosophy

This project prioritizes:

  • Interpretability over black-box performance
  • Statistical rigor over optimistic metrics
  • Reproducibility over ad-hoc experimentation
  • Clinical relevance over purely technical novelty
  • Transparency over opacity

Every prediction must be explainable and defensible.


Technical Stack

  • Python
  • Streamlit
  • scikit-learn
  • SHAP
  • Matplotlib
  • Hugging Face Spaces + Model Hub

Author

Developed and maintained by
Dr. Syed Naveed
Hematology–Oncology Clinician & Researcher

Focus areas:

  • Explainable AI in hematology
  • Clinical machine learning validation
  • Translational AI for real-world patient care

License

Apache 2.0


For configuration details:
https://huggingface.co/docs/hub/spaces-config-reference