---
license: apache-2.0
---

# Explainable Acute Leukemia Mortality Predictor – Model Repository

This repository contains the **trained machine learning model artifacts** generated by the **Explainable Acute Leukemia Mortality Predictor** Hugging Face Space.

It serves exclusively as a **persistent storage and versioning registry** for models developed for:

**Mortality risk prediction in patients with acute leukemia using structured clinical data.**

This repository does **not** provide training or an interactive interface.

---

## Relationship to the Application

Model development, validation, and prediction occur in the companion Space:

**Synav/Explainable-Acute-Leukemia-Mortality-Predictor**

Because Hugging Face Spaces use temporary storage, trained models are automatically:

1. Saved
2. Versioned
3. Uploaded here
4. Preserved as permanent releases

This ensures:

* reproducibility
* auditability
* long-term persistence
* external validation capability

---

## Model Description

Each stored model is:

* **Task:** Binary mortality prediction (Yes/No)
* **Algorithm:** Logistic Regression (scikit-learn)
* **Output:** Probability of mortality (0–1)
* **Explainability:** SHAP feature attribution

### Embedded preprocessing

Numeric variables

* median imputation
* standard scaling

Categorical variables

* most-frequent imputation
* one-hot encoding

All preprocessing steps are embedded within the pipeline to guarantee:

* identical inference behavior
* schema consistency
* zero manual preprocessing

---

## Files Included per Release

Each version folder contains:

### model.joblib

Complete scikit-learn pipeline including preprocessing, feature encoding, and the trained classifier. Ready for immediate inference.

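
As a rough illustration of what such a pipeline can look like, the following is a minimal sketch that assembles the preprocessing listed under "Embedded preprocessing" together with a logistic regression classifier. The column names, the toy data, and the settings `handle_unknown="ignore"` and `max_iter=1000` are assumptions made for this example only; they are not taken from the stored models, whose actual feature schema is recorded in `meta.json`.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical column names, for illustration only.
numeric_cols = ["age", "wbc"]
categorical_cols = ["sex"]

# Numeric variables: median imputation, then standard scaling.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical variables: most-frequent imputation, then one-hot encoding.
# handle_unknown="ignore" is an assumed setting, not confirmed by the source.
categorical_steps = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

# Preprocessing and classifier combined into a single estimator.
pipeline = Pipeline([
    ("preprocess", ColumnTransformer([
        ("num", numeric_steps, numeric_cols),
        ("cat", categorical_steps, categorical_cols),
    ])),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Synthetic toy rows with missing values (not patient data).
X = pd.DataFrame({
    "age": [34.0, 61.0, np.nan, 47.0, 72.0, 55.0],
    "wbc": [12.1, 88.0, 5.4, np.nan, 150.2, 30.0],
    "sex": ["F", "M", "M", np.nan, "F", "M"],
})
y = np.array([0, 1, 0, 0, 1, 1])

pipeline.fit(X, y)

# Probability of the positive class for each row.
proba = pipeline.predict_proba(X)[:, 1]
```

Because imputation, scaling, and encoding are part of the fitted object, raw rows with missing values can be passed directly to `predict_proba`, and saving the object with `joblib.dump` captures every step needed for inference.
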
### meta.json

Structured metadata including:

* feature schema
* variable types
* evaluation metrics
* ROC/PR curve data
* calibration statistics
* confusion matrix
* decision curve analysis
* validation configuration

These artifacts enable full reproducibility and downstream analysis.

---

## Evaluation Metrics Captured

Models are evaluated on held-out test data using the following performance criteria.

### Discrimination

* ROC AUC
* ROC curve
* Precision–Recall curve
* Average Precision

### Classification

* Sensitivity (Recall)
* Specificity
* Precision
* F1 score
* Accuracy
* Balanced accuracy
* Confusion matrix

### Calibration

* Calibration (reliability) curve
* Brier score

### Clinical Utility

* Decision Curve Analysis (net benefit)

---

## Repository Structure

```
releases/
└── /
    ├── model.joblib
    └── meta.json
latest/
├── model.joblib
└── meta.json
README.md
```

* **releases//** → immutable historical snapshots
* **latest/** → most recent validated model

---

## Intended Use

These artifacts are intended for:

* Clinical research
* Risk stratification studies
* Independent external validation
* Multi-center reproducibility testing
* Educational and exploratory analysis

---

## Not Intended For

These models:

* are not regulatory-approved medical devices
* do not replace clinician judgment
* should not be used for autonomous decision-making
* require local validation prior to clinical deployment

Clinical oversight is mandatory.

---

## Loading a Model

```python
import joblib

# Load the full pipeline (preprocessing + classifier) from a downloaded release.
model = joblib.load("model.joblib")

# X: input rows whose columns match the feature schema in meta.json.
proba = model.predict_proba(X)[:, 1]  # probability of the positive class
```

No additional preprocessing is required.

---

## Author

Dr. Syed Naveed
Hematology & Oncology
Sheikh Shakhbout Medical City
Abu Dhabi, UAE

---

## License

Apache 2.0

---