EspressoPro ADT Cell Type Models
Model Summary
This repository provides pre-trained EspressoPro models for cell type annotation from single-cell surface protein (ADT) data, designed for blood and bone marrow mononuclear cells in protein-only settings, including Mission Bio Tapestri DNA+ADT workflows.
The pipeline is available at: https://github.com/uom-eoh-lab-published/2026__EspressoPro
The release contains one-vs-rest (OvR) binary classifiers for each cell type.
Each binary classifier is Platt-calibrated independently on the CAL split, provided that split contains both positive and negative examples for the class.
The resulting OvR probabilities are assembled into a multiclass predictor and further calibrated using temperature scaling.
Models are provided for three annotation resolutions of increasing biological detail.
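The final multiclass calibration step described above can be illustrated with a minimal NumPy sketch. It assumes the per-class OvR probabilities are turned into log-scores and passed through a softmax with a single temperature T, chosen by grid search to minimise negative log-likelihood on a calibration split. The function names, the log-score conversion, and the grid search are illustrative assumptions, not the EspressoPro implementation.

```python
import numpy as np

def ovr_to_log_scores(ovr_probs, eps=1e-12):
    """Turn an (n_cells, n_classes) matrix of OvR probabilities into log-scores.
    Assumption: the real pipeline may derive its logits differently."""
    p = np.clip(np.asarray(ovr_probs, dtype=float), eps, 1.0)
    return np.log(p)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(probs, y):
    """Mean negative log-likelihood of integer labels y."""
    return float(-np.mean(np.log(probs[np.arange(len(y)), y] + 1e-12)))

def fit_temperature(log_scores, y, grid=np.linspace(0.05, 5.0, 100)):
    """Pick the temperature that minimises NLL on the calibration split."""
    losses = [nll(softmax(log_scores / t), y) for t in grid]
    return float(grid[int(np.argmin(losses))])

# Toy calibration split with three classes (synthetic values, not real data)
rng = np.random.default_rng(0)
y_cal = rng.integers(0, 3, size=500)
ovr_cal = rng.uniform(0.05, 0.4, size=(500, 3))
ovr_cal[np.arange(500), y_cal] += 0.5  # the true class gets a higher OvR score

log_scores = ovr_to_log_scores(ovr_cal)
T = fit_temperature(log_scores, y_cal)
probs_cal = softmax(log_scores / T)  # temperature-scaled multiclass probabilities
```

With T = 1 the softmax simply renormalises the OvR probabilities so each row sums to one; temperature scaling only sharpens or flattens that distribution and never changes the predicted (argmax) class.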
Model Details
- Developed by: Kristian Gurashi
- Model type: Stacked ensemble OvR classifiers with per-head Platt calibration and multiclass temperature scaling (logistic regression stacker over XGB, NB, KNN, and MLP prediction probabilities)
- Input: Per-cell ADT feature vectors (CLR-normalised surface protein expression)
- Output: Per-cell class probabilities and predicted cell type labels
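For readers unfamiliar with the input format, here is a minimal sketch of a centred log-ratio (CLR) transform on raw ADT counts. It assumes a per-cell CLR with a pseudocount of 1; this card does not state which CLR variant (per-cell or per-feature, or which pseudocount) EspressoPro expects, so treat those choices as assumptions and check the pipeline before use.

```python
import numpy as np

def clr_normalise(counts, pseudocount=1.0):
    """Per-cell CLR: log counts minus the cell's mean log count.
    Rows are cells, columns are antibodies (assumed layout)."""
    x = np.asarray(counts, dtype=float)
    log_x = np.log(x + pseudocount)
    return log_x - log_x.mean(axis=1, keepdims=True)

# Two toy cells measured on four antibodies (made-up counts)
raw_counts = np.array([[10, 0, 250, 3],
                       [0, 5, 40, 120]])
adt_clr = clr_normalise(raw_counts)
```

After this transform each cell's values sum to zero, so a value reflects how strongly a protein is expressed relative to the other proteins in the same cell.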
Included Files
The repository is organised by reference atlas (Hao, Luecken, Triana, Zhang) and by label resolution (Broad, Simplified, Detailed).
Each atlas/resolution folder contains (i) the trained models, (ii) evaluation reports, and (iii) figures.
Models (<Atlas>/Release/<Resolution>/Models/)
- Multiclass_models.joblib: main file for inference. Loads the components needed to run predictions for that atlas/resolution:
  - all per-class OvR heads, Platt-calibrated where calibration was possible
  - class_names, defining the trained/predictable classes and probability column order
  - multiclass temperature-scaling calibrator
- class_names.csv: ordered list of class labels corresponding to the probability columns output by Multiclass_models.joblib.
- Temperature_scaler.joblib: multiclass temperature-scaling calibrator fitted on the CAL split.
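As a rough sketch, the model files can be located from the folder layout above and opened with `joblib.load`. The local root directory `EspressoPro_models` and both helper names are hypothetical. The structure of the loaded object is not documented in this card, so inspect it (for example with `type(bundle)`) rather than assuming particular attributes. Because joblib files are pickle-based, only load files from a source you trust.

```python
from pathlib import Path

def models_dir(root, atlas, resolution):
    """Build <root>/<Atlas>/Release/<Resolution>/Models from the layout above."""
    return Path(root) / atlas / "Release" / resolution / "Models"

def load_bundle(root, atlas, resolution):
    """Load Multiclass_models.joblib for one atlas/resolution pair."""
    import joblib  # imported lazily so the path helper works without joblib
    return joblib.load(models_dir(root, atlas, resolution) / "Multiclass_models.joblib")

# "EspressoPro_models" is a hypothetical local copy of this repository
bundle_path = models_dir("EspressoPro_models", "Hao", "Simplified") / "Multiclass_models.joblib"
print(bundle_path.as_posix())
```

In practice the EspressoPro pipeline is expected to handle this loading for you; the sketch is only useful for inspecting the files directly.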
Intermediate Per-Class Models (<Atlas>/Release/<Resolution>/Tmp_models/<ClassName>/)
Each class folder contains the individual base learners and stacking models used to build the OvR head:
- Scaler.joblib: feature scaler
- XGB.joblib: XGBoost base learner
- NB.joblib: Naive Bayes base learner
- KNN.joblib: K-Nearest Neighbours base learner
- MLP.joblib: Multi-layer Perceptron base learner
- Stacker_raw.joblib: logistic regression stacked OvR classifier/head before Platt calibration
- Stacker_platt.joblib: Platt-calibrated stacked OvR classifier/head, where calibration was possible
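To show how these pieces fit together, the following scikit-learn sketch builds one OvR head on synthetic data: a scaler, four base learners, a logistic regression stacker over their predicted probabilities, and a Platt step fitted on a separate calibration split. Several things here are assumptions made for illustration: `GradientBoostingClassifier` stands in for XGBoost so the example needs only scikit-learn, all hyperparameters are arbitrary, and Platt scaling is approximated by a near-unregularised logistic regression on the logit of the raw stacker probability. It is not the EspressoPro training code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary task standing in for "one class vs. rest"
X, y = make_classification(n_samples=600, n_features=20, n_informative=8, random_state=0)
X_train, X_cal, y_train, y_cal = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

base_learners = [
    ("xgb", GradientBoostingClassifier(n_estimators=50, random_state=0)),  # stand-in for XGBoost
    ("nb", GaussianNB()),
    ("knn", KNeighborsClassifier(n_neighbors=15)),
    ("mlp", MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)),
]

# Scaler followed by a logistic regression stacker over base-learner probabilities
stacker_raw = make_pipeline(
    StandardScaler(),
    StackingClassifier(base_learners, final_estimator=LogisticRegression(),
                       stack_method="predict_proba", cv=3),
)
stacker_raw.fit(X_train, y_train)

def logit(p, eps=1e-6):
    p = np.clip(p, eps, 1 - eps)
    return np.log(p / (1 - p))

# Platt step: fit a sigmoid (slope and intercept) on the calibration split only
raw_cal = stacker_raw.predict_proba(X_cal)[:, 1]
platt = LogisticRegression(C=1e6).fit(logit(raw_cal).reshape(-1, 1), y_cal)
platt_cal = platt.predict_proba(logit(raw_cal).reshape(-1, 1))[:, 1]
```

Keeping the Platt fit on data the stacker has not seen mirrors the TRAIN/CAL separation described in this card. The Platt step needs both positive and negative examples in the calibration split, which is why calibrated heads exist only "where calibration was possible".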
Reports (<Atlas>/Release/<Resolution>/Reports/)
Metrics/
- Multiclass_models_confusion_matrix_on_test.csv: multiclass confusion matrix on the held-out test split
- Multiclass_models_metrics_on_test.csv: multiclass precision, recall, F1-score, support, and accuracy on the held-out test split
- Single_classes_metrics_and_confusion_matrix_on_test.csv: per-class TP/FP/TN/FN and precision/recall/F1/AUC on the held-out test split
- Single_classes_metrics_pre_and_post_platt_calibration.csv: per-class LogLoss and Brier score before vs. after Platt calibration
Probabilities/
- Multiclass_models_probabilities_on_test.csv: per-cell final multiclass predicted probabilities on the test set after temperature scaling
Importances/
- All_classes_hyperparameters.csv: hyperparameters used across all classes
- <ClassName>_hyperparameters.csv: per-class hyperparameters for each trained class
- Base_learner_agreement.csv: pairwise agreement between base learners
- Base_learner_test_performance.csv: individual base learner performance on the test set
- CV_fold_scores_per_base_learner.csv: cross-validation fold scores per base learner
- KNN_Permutation_Feature_importances.csv: permutation feature importances from KNN
- MLP_Permutation_Feature_importances.csv: permutation feature importances from MLP
- NB_EffectSize_Feature_importances.csv: effect-size feature importances from Naive Bayes
- SHAP_XGB_Feature_importances.csv: SHAP feature importances from XGBoost
- LR_MetaLearner_BaseLearner_contributions.csv: logistic regression stacker weights over base learners
- Training_and_inference_runtime.csv: wall-clock time for training and inference
Figures (<Atlas>/Release/<Resolution>/Figures/)
- Multiclass_models_confusion_matrix_on_test.png: multiclass confusion matrix on the held-out test split.
- Multiclass_models_confusion_matrix_on_test_with_percentage_agreement.png (Simplified and Detailed only): multiclass confusion matrix with percentage agreement between true and predicted labels.
- Multiclass_models_confusion_matrix_on_test_with_percentage_agreement.pdf (Simplified and Detailed only): PDF version of the above.
- Single_classes/: per-class diagnostic plots:
  - <Class>_RAW_confusion_matrix_on_test.png: binary confusion matrix before Platt calibration
  - <Class>_RAW_ROC_on_test.png: ROC curve and AUC before Platt calibration
  - <Class>_CAL_confusion_matrix_on_test.png: binary confusion matrix after final calibration
  - <Class>_CAL_ROC_on_test.png: ROC curve and AUC after final calibration
  - <Class>_Platt_calibration_evaluation_on_test.png: calibration curve comparing RAW vs PLATT probabilities
  - <Class>_SHAP_beeswarm_TRAIN.png: SHAP beeswarm plot on the training split
  - <Class>_Class_Train_data.png (Hao Broad and Simplified only): UMAP of the training split coloured by class
  - <Class>_Class_Train_data_legend.png (Hao Broad and Simplified only): legend for the UMAP
Cell Types by Atlas and Resolution
Broad (all atlases)
Two classes: Immature, Mature
Simplified
| Atlas | Classes |
|---|---|
| Hao | B, CD4_T, CD8_T, cDC, Erythroid, HSPC, Monocyte, NK, Other_T, pDC, Plasma |
| Luecken | B, CD4_T, CD8_T, cDC, Erythroid, HSPC, Monocyte, Myeloid, NK, Other_T, pDC, Plasma |
| Triana | B, CD4_T, CD8_T, cDC, Erythroid, HSPC, Monocyte, Myeloid, NK, Other_T, pDC, Plasma |
| Zhang | B, CD4_T, CD8_T, cDC, Erythroid, HSPC, Monocyte, Myeloid, NK, pDC, Plasma |
Detailed
| Atlas | Classes |
|---|---|
| Hao | B_Memory, B_Naive, CD14_Mono, CD16_Mono, CD4_CTL, CD4_T_Memory, CD4_T_Naive, CD8_T_Memory, CD8_T_Naive, cDC1, cDC2, Erythroblast, GdT, HSC_MPP, Immature_B, MAIT, NK_CD56_bright, NK_CD56_dim, pDC, Plasma, Treg |
| Luecken | B_Memory, B_Naive, CD14_Mono, CD16_Mono, CD4_CTL, CD4_T_Memory, CD4_T_Naive, CD8_T_Memory, CD8_T_Naive, cDC1, cDC2, ErP, Erythroblast, GdT, GMP, HSC_MPP, Immature_B, MAIT, MEP, Myeloid_precursor, NK_CD56_bright, NK_CD56_dim, pDC, Plasma, Pre-Pro-B, Pro-B, Treg |
| Triana | B_Memory, B_Naive, CD14_Mono, CD16_Mono, CD4_CTL, CD4_T_Memory, CD4_T_Naive, CD8_T_Memory, CD8_T_Naive, cDC1, cDC2, EoBaMaP, ErP, Erythroblast, GdT, GMP, HSC_MPP, Immature_B, LMPP, MAIT, MEP, MkP, Myeloid_precursor, NK_CD56_bright, NK_CD56_dim, pDC, Plasma, Pre-B, Pre-Pro-B, Pro-B, Treg |
| Zhang | B_Naive, CD14_Mono, CD16_Mono, CD4_T_Memory, CD4_T_Naive, CD8_T_Memory, CD8_T_Naive, cDC1, cDC2, EoBaMaP, ErP, Erythroblast, GMP, HSC_MPP, Immature_B, LMPP, MAIT, MEP, MkP, Myeloid_precursor, NK_CD56_bright, NK_CD56_dim, pDC, Plasma, Pre-B, Pre-Pro-B, Pro-B |
Uses
Direct Use
These models are used by the EspressoPro pipeline to annotate cell types in ADT-only single-cell data from blood or bone marrow mononuclear cells, including Mission Bio Tapestri DNA+ADT datasets.
Bias, Risks, and Limitations
- Reference bias: Models were trained on human healthy donor PBMC/BMMC-derived references; performance may differ in disease or heavily perturbed samples. The models are not expected to work well in other tissues.
- Panel dependence: The models require feature alignment to the expected ADT columns; missing or mismatched antibodies can reduce accuracy.
- Class coverage: Only classes that led to effective predictions from at least one of the four atlases were trained for prediction. Class availability varies by atlas and resolution (see the tables above).
- Interpretation: Probabilities are model-derived and should be validated with marker checks and expected biology.
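The panel-dependence point can be made concrete with a small sketch that reorders a query matrix to a model's expected ADT columns and reports which antibodies are missing. The antibody names, the function name, and the choice to fill missing columns with 0 are all hypothetical; how EspressoPro itself handles missing features is not described in this card.

```python
import numpy as np

def align_features(matrix, observed, expected, fill_value=0.0):
    """Reorder columns of `matrix` (cells x observed) to match `expected`.
    Missing antibodies are filled with `fill_value` and returned by name."""
    matrix = np.asarray(matrix, dtype=float)
    index = {name: i for i, name in enumerate(observed)}
    aligned = np.full((matrix.shape[0], len(expected)), fill_value)
    missing = []
    for j, name in enumerate(expected):
        if name in index:
            aligned[:, j] = matrix[:, index[name]]
        else:
            missing.append(name)
    return aligned, missing

observed = ["CD3", "CD19", "CD56"]   # antibodies in the query panel (made up)
expected = ["CD19", "CD3", "CD4"]    # hypothetical model feature order
X_query = np.array([[1.2, -0.3, 0.1],
                    [0.4, 2.0, -0.5]])

X_aligned, missing = align_features(X_query, observed, expected)
print(missing)  # ['CD4']
```

A non-empty `missing` list is a signal to treat the resulting predictions with extra caution, particularly for classes whose defining markers are absent from the panel.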
Testing Data, Factors & Metrics
Testing Data
- TRAIN: used to train one-vs-rest (OvR) classifiers.
- CAL: used only for probability calibration, including per-class Platt calibration and multiclass temperature scaling.
- TEST: used only for evaluation.
Note: CAL and TEST include only the classes learned from TRAIN; excluded or unknown labels are removed.
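The filtering described in the note can be sketched as a simple mask over labels. The label values "Unknown" and "Doublet" and the function name are illustrative assumptions.

```python
import numpy as np

def keep_known_classes(X, labels, class_names):
    """Drop cells whose label is not among the classes learned from TRAIN."""
    labels = np.asarray(labels)
    mask = np.isin(labels, class_names)
    return np.asarray(X)[mask], labels[mask]

class_names = ["B", "CD4_T", "NK"]          # classes learned from TRAIN (toy subset)
X_test = np.arange(10).reshape(5, 2)        # five toy cells, two features
y_test = np.array(["B", "Unknown", "NK", "Doublet", "CD4_T"])

X_kept, y_kept = keep_known_classes(X_test, y_test, class_names)
```

Because excluded labels are removed before evaluation, the reported TEST metrics describe performance among known classes only and say nothing about how the models behave on cell types outside the trained label set.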
Factors
- RAW: OvR probabilities before Platt calibration.
- PLATT: OvR probabilities after Platt calibration on CAL, where calibration was possible.
- CAL: final multiclass probabilities after temperature scaling, fitted on CAL and applied to TEST.
Metrics
Multiclass prediction metrics (TEST, using final CAL probabilities):
- Accuracy
- Precision / Recall / F1-score
- Support
- Confusion matrix
Per-class prediction metrics (TEST, RAW vs final CAL):
- Confusion matrix (TP, FP, TN, FN)
- Precision, recall, F1-score
- ROC curve and AUC
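As an illustration of the per-class counts listed above, the sketch below derives TP, FP, TN, FN and precision, recall, and F1 for one class treated as positive against all others. It assumes hard predicted labels are already available; whether the reports derive them by argmax or by a per-class threshold is not stated here.

```python
import numpy as np

def ovr_confusion(y_true, y_pred, positive):
    """One-vs-rest confusion counts and derived metrics for a single class."""
    t = np.asarray(y_true) == positive
    p = np.asarray(y_pred) == positive
    tp = int(np.sum(t & p))
    fp = int(np.sum(~t & p))
    tn = int(np.sum(~t & ~p))
    fn = int(np.sum(t & ~p))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"TP": tp, "FP": fp, "TN": tn, "FN": fn,
            "precision": precision, "recall": recall, "f1": f1}

# Toy labels (made up)
y_true = ["B", "B", "NK", "NK", "CD4_T", "B"]
y_pred = ["B", "NK", "NK", "NK", "B", "B"]
stats_b = ovr_confusion(y_true, y_pred, "B")
```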
Per-class calibration metrics (TEST, RAW vs PLATT):
- LogLoss and Brier score before vs. after Platt calibration
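For reference, both calibration metrics can be computed in a few lines. The probabilities below are made-up toy values chosen so that the "PLATT" column is better calibrated than the "RAW" one; they are not results from these models.

```python
import numpy as np

def log_loss_binary(y, p, eps=1e-15):
    """Mean binary cross-entropy; lower is better."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1 - eps)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def brier_score(y, p):
    """Mean squared difference between probability and outcome; lower is better."""
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((p - y) ** 2))

y = [1, 0, 1, 0]
p_raw = [0.99, 0.60, 0.55, 0.01]    # toy "before calibration" probabilities
p_platt = [0.90, 0.40, 0.70, 0.10]  # toy "after calibration" probabilities

print(brier_score(y, p_raw), brier_score(y, p_platt))
print(log_loss_binary(y, p_raw), log_loss_binary(y, p_platt))
```

LogLoss penalises confident mistakes much more heavily than the Brier score does, so reporting both gives a fuller picture of how calibration changed the probabilities.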