Synav committed on
Commit cb14f09 · verified · 1 Parent(s): 1c1b492

Update app.py

Files changed (1)
  1. app.py +112 -67
app.py CHANGED
@@ -1012,74 +1012,119 @@ FIGSIZE = (plot_width, plot_height)
  st.title("Explainable-Acute-Leukemia-Mortality-Predictor")
  st.caption("Explainable clinical AI for mortality and outcome prediction in acute leukemia using SHAP-interpretable models")
 
- with st.expander("About this framework, who can use it, and required Excel format", expanded=True):
+ with st.expander("About this AI model, who can use it, and required Excel format", expanded=True):
  st.markdown("""
- ## What is this framework?
- This is a **no-code, explainable machine-learning MODEL** that allows you to upload an Excel sheet, train a predictive model instantly, and obtain **transparent, variable-level explanations** of the prediction.
-
- You provide:
- - Predictor variables as Excel columns
- - A binary outcome in column **`Outcome Event`**
-
- The system automatically:
- - Trains a validated logistic regression–based model
- - Performs internal and external validation (if labels are present)
- - Generates **explainability (SHAP)** showing which variables contribute most to predictions
- - Allows reuse of the trained model on new Excel sheets with identical column names
-
- ---
-
- ## Who can use this?
- This framework is designed for:
- - **Clinicians and physician-scientists**
- - **Clinical researchers**
- - **Epidemiologists and outcomes researchers**
- - **Health-data and AI researchers**
-
- No programming or machine-learning expertise is required.
- All modeling, validation, and explainability are handled automatically.
-
- ---
-
- ## Training Excel (with labels)
- - **First row must contain column names**
- - **All columns except `Outcome Event`** → model input features (predictors)
- - **`Outcome Event`** binary outcome label to be predicted
- - Accepted formats: `0/1`, `Yes/No`, `True/False`
-
- ---
-
- ## Variable type selection (during training)
- - You will explicitly choose which predictors are:
- - **Numeric** (median imputation + scaling)
- - **Categorical** (most-frequent imputation + one-hot encoding)
- - This variable-type schema is **saved with the trained model** and enforced during all future predictions.
-
- ---
-
- ## Prediction / External validation Excel
- - Must contain the **same predictor columns** with **identical names** as the trained model
- - **Do NOT include `Outcome Event`** if you only want predictions
- - **Include `Outcome Event`** if you want **external validation metrics**, including:
- - ROC AUC
- - Sensitivity / specificity
- - Precision–recall
- - Calibration
- - Decision curve analysis
- - Confusion matrix
-
- ---
-
- ## Explainability and downloads
- - For both training and validation, the framework provides **SHAP-based explanations** indicating:
- - Which variables contribute most to each prediction
- - Direction and magnitude of influence
- - You can download:
- - **Prediction output sheets** (probabilities, classes, risk bands)
- - **All plots individually** (ROC, PR, calibration, DCA, SHAP)
- - Plots are exportable as **high-resolution PNG (≥600 DPI)** for publications
-
- """)
+ ## What is this framework?
+
+ This is a **clinically oriented, explainable AI platform** for developing and validating **mortality and outcome prediction models in acute leukemia** using structured Excel data.
+
+ The system integrates:
+
+ • Statistical modeling (logistic regression)
+ Explainable AI (SHAP)
+ Bootstrap internal validation
+ External clinical validation
+ • Publication-ready performance reporting
+
+ into a **single no-code workflow** designed for clinicians and researchers.
+
+ The goal is to produce **transparent, trustworthy, and clinically interpretable predictions**, rather than black-box outputs.
+
+ ---
+
+ ## What does it do automatically?
+
+ After uploading your Excel file, the platform will:
+
+ ### Model development
+ Train a logistic-regression–based clinical prediction model
+ • Handle preprocessing automatically
+  – numeric → imputation + scaling
+  – categorical → imputation + one-hot encoding
+ Save the full schema to ensure reproducibility
+
+ ### Validation
+ ROC AUC and ROC curves
+ Precision–Recall curves
+ • Calibration curves + Brier score
+ • Decision Curve Analysis (clinical net benefit)
+ • Sensitivity / specificity / F1 / balanced accuracy
+ Threshold optimisation
+
+ ### Internal validation (recommended)
+ Bootstrap out-of-bag validation (multiple resamples)
+ 95% confidence intervals for metrics
+ • Reduced optimism bias for small clinical datasets
+
+ ### Explainability
+ SHAP feature importance
+ • Patient-level waterfall plots
+ Global and local explanations
+
+ ### Deployment
+ • One-click publishing of trained models
+ Reuse the same model on future Excel sheets
+ Download predictions, plots, and reports
+
+ All plots are exportable as **high-resolution (≥600 DPI) publication-ready figures**.
+
+ ---
+
+ ## Who can use this?
+
+ This framework is intended for:
+
+ Hematology–Oncology clinicians
+ Clinical researchers
+ Epidemiologists
+ Outcomes researchers
+ • Students learning explainable AI
+
+ No programming or machine-learning expertise is required.
+
+ ---
+
+ ## Required Excel format
+
+ ### Training file (with labels)
+ • First row must contain column names
+ • All columns except **Outcome Event** → predictor variables
+ • **Outcome Event** → binary label
+  Accepted formats: 0/1, Yes/No, True/False
+
+ ### Variable type selection
+ During training you explicitly choose:
+ • Numeric variables
+ • Categorical variables
+
+ This schema is saved with the model and **must match future files exactly**.
+
+ ---
+
+ ### Prediction / External validation file
+ Must contain:
+ • Same predictor column names as the trained model
+
+ Optional:
+ • Include **Outcome Event** to compute full external validation metrics
+
+ If labels are included, the system will automatically generate:
+ • ROC
+ • Calibration
+ • Decision curves
+ • Confusion matrix
+ • Clinical performance metrics
+
+ ---
+
+ ## Important note
+
+ This tool is for **research and decision-support only**.
+ It is **not a medical device** and must not replace clinical judgment.
+ """)
+
+
+
 
  st.warning(
  "Prediction will fail if feature names or variable types "
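
Note on the input rules in the help text above: it says `Outcome Event` accepts 0/1, Yes/No, or True/False, and that a prediction file must carry the same predictor column names as the trained model. This diff does not show how app.py enforces either rule, so the following is only a minimal standard-library sketch of what such checks could look like. `OUTCOME_COLUMN`, `parse_outcome`, and `missing_predictors` are hypothetical names chosen for illustration, not identifiers taken from app.py.

```python
from typing import Iterable, List, Sequence

# Hypothetical helpers: app.py's real validation code is not part of this diff.
OUTCOME_COLUMN = "Outcome Event"
_POSITIVE = {"1", "yes", "true"}
_NEGATIVE = {"0", "no", "false"}


def parse_outcome(value: object) -> int:
    """Map one 'Outcome Event' cell (0/1, Yes/No, True/False) to 0 or 1."""
    if isinstance(value, bool):
        return int(value)
    if isinstance(value, float) and value.is_integer():
        # Spreadsheet readers often return 1.0 / 0.0 for integer cells.
        value = int(value)
    text = str(value).strip().lower()
    if text in _POSITIVE:
        return 1
    if text in _NEGATIVE:
        return 0
    raise ValueError(f"Unsupported {OUTCOME_COLUMN} value: {value!r}")


def missing_predictors(trained: Sequence[str], uploaded: Iterable[str]) -> List[str]:
    """Return trained predictor names that are absent from an uploaded sheet.

    The outcome column is ignored, since the help text makes it optional
    in prediction files.
    """
    present = {name for name in uploaded if name != OUTCOME_COLUMN}
    return [name for name in trained if name not in present]
```

Under these assumptions, `parse_outcome("Yes")` gives 1 and `missing_predictors(["Age", "WBC"], ["Age", "Outcome Event"])` gives `["WBC"]`. The name comparison is exact (case and whitespace included), which mirrors the "must match future files exactly" wording but may be stricter or looser than what app.py actually does.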