EspressoPro ADT Cell Type Models

Model Summary

This repository provides pre-trained EspressoPro models for cell type annotation from single-cell surface protein (ADT) data, designed for blood and bone marrow mononuclear cells in protein-only settings, including Mission Bio Tapestri DNA+ADT workflows.

The pipeline is available at: https://github.com/uom-eoh-lab-published/2026__EspressoPro

The release contains one-vs-rest (OvR) binary classifiers for each cell type.
Each binary classifier is Platt-calibrated independently on the CAL split, provided that split contains both positive and negative examples for the class.
The resulting OvR probabilities are assembled into a multiclass predictor and further calibrated using temperature scaling.
Models are provided for three annotation resolutions of increasing biological detail.
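
The two calibration stages described above can be illustrated with a minimal sketch. It applies a Platt sigmoid to each OvR score and then a temperature-scaled renormalisation across classes. How EspressoPro assembles the OvR probabilities into a multiclass predictor is not specified in this card, so treating log-probabilities as logits is an assumption, and the scores, Platt parameters, and temperature are made-up values.

```python
import math

def platt(score, a, b):
    """Platt scaling: map a raw classifier score to a probability with a sigmoid."""
    return 1.0 / (1.0 + math.exp(-(a * score + b)))

def temperature_softmax(ovr_probs, temperature, eps=1e-12):
    """Use log OvR probabilities as logits (assumption), divide by T, renormalise."""
    logits = [math.log(max(p, eps)) / temperature for p in ovr_probs]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores from three OvR heads for one cell,
# each with its own (made-up) Platt parameters (a, b).
raw_scores = [2.0, -1.0, -3.0]
platt_params = [(1.5, -0.2), (1.2, 0.1), (0.9, 0.0)]
ovr = [platt(s, a, b) for s, (a, b) in zip(raw_scores, platt_params)]

probs_t1 = temperature_softmax(ovr, temperature=1.0)  # plain renormalisation
probs = temperature_softmax(ovr, temperature=0.5)     # T < 1 sharpens the distribution
predicted_index = probs.index(max(probs))
```

With T = 1 the second stage reduces to dividing each OvR probability by their sum. A temperature below 1 sharpens the distribution and a temperature above 1 flattens it.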

Model Details

Developed by: Kristian Gurashi
Model type: Stacked ensemble OvR classifiers (logistic regression stacker over XGB, NB, KNN, and MLP prediction probabilities) with per-head Platt calibration and multiclass temperature scaling
Input: Per-cell ADT feature vectors (CLR-normalised surface protein expression)
Output: Per-cell class probabilities and predicted cell type labels
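
For readers unfamiliar with the input format, here is a minimal sketch of a centred log-ratio (CLR) transform applied to the ADT counts of a single cell. The exact CLR variant used by EspressoPro (pseudocount, per-cell versus per-feature centring) is not stated in this card, so the per-cell form with a pseudocount of 1 is an assumption, and the counts are made up.

```python
import math

def clr(counts, pseudocount=1.0):
    """Centred log-ratio: log counts minus the mean log count of the same cell."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

# Hypothetical raw ADT counts for one cell across four antibodies.
cell_counts = [120, 3, 0, 45]
clr_values = clr(cell_counts)
```

By construction the CLR values of a cell sum to zero, so each value expresses an antibody's abundance relative to that cell's overall signal.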

Included Files

The repository is organised by reference atlas (Hao, Luecken, Triana, Zhang) and by label resolution (Broad, Simplified, Detailed).
Each atlas/resolution folder contains (i) the trained models, (ii) evaluation reports, and (iii) figures.

Models (`<Atlas>/Release/<Resolution>/Models/`)

Multiclass_models.joblib
Main file for inference. Loads the components needed to run predictions for that atlas/resolution:
- all per-class OvR heads, Platt-calibrated where calibration was possible
- class_names defining the trained/predictable classes and probability column order
- multiclass temperature-scaling calibrator
class_names.csv
Ordered list of class labels corresponding to the probability columns output by Multiclass_models.joblib.
Temperature_scaler.joblib
Multiclass temperature-scaling calibrator fitted on the CAL split.
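
The folder layout above can be turned into a small path helper. This is only a sketch: the function name `model_paths` and the root folder `EspressoPro_models` are hypothetical, and the internal structure of the `.joblib` files is not documented here, so the sketch stops at building paths.

```python
from pathlib import PurePosixPath

ATLASES = ("Hao", "Luecken", "Triana", "Zhang")
RESOLUTIONS = ("Broad", "Simplified", "Detailed")

def model_paths(root, atlas, resolution):
    """Build the paths of the inference files for one atlas/resolution pair."""
    if atlas not in ATLASES:
        raise ValueError(f"unknown atlas: {atlas!r}")
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    models = PurePosixPath(root) / atlas / "Release" / resolution / "Models"
    return {
        "bundle": models / "Multiclass_models.joblib",
        "class_names": models / "class_names.csv",
        "temperature_scaler": models / "Temperature_scaler.joblib",
    }

# "EspressoPro_models" is a placeholder for wherever the repository was downloaded.
paths = model_paths("EspressoPro_models", "Hao", "Detailed")
# The bundle could then be opened with joblib.load(paths["bundle"]),
# assuming joblib and the libraries used at training time are installed.
```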

Intermediate Per-Class Models (`<Atlas>/Release/<Resolution>/Tmp_models/<ClassName>/`)

Each class folder contains the individual base learners and stacking models used to build the OvR head:

Scaler.joblib — feature scaler
XGB.joblib — XGBoost base learner
NB.joblib — Naive Bayes base learner
KNN.joblib — K-Nearest Neighbours base learner
MLP.joblib — Multi-layer Perceptron base learner
Stacker_raw.joblib — logistic regression stacked OvR classifier/head before Platt calibration
Stacker_platt.joblib — Platt-calibrated stacked OvR classifier/head, where calibration was possible
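
To make the role of the stacker concrete, the sketch below shows how a logistic regression meta-learner could combine the four base-learner probabilities into one raw OvR probability. The weights, intercept, and base-learner probabilities are made-up numbers, not values taken from the released models.

```python
import math

def stacker_probability(base_probs, weights, intercept):
    """Logistic regression over base-learner probabilities: sigmoid(w . p + b)."""
    z = intercept + sum(weights[name] * base_probs[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical positive-class probabilities from the four base learners for one cell.
base_probs = {"XGB": 0.92, "NB": 0.70, "KNN": 0.85, "MLP": 0.88}
# Hypothetical stacker coefficients (one per base learner) and intercept.
weights = {"XGB": 2.1, "NB": 0.4, "KNN": 1.3, "MLP": 1.7}
intercept = -2.5

p_raw = stacker_probability(base_probs, weights, intercept)
```

The output corresponds conceptually to the RAW probability of one OvR head. Platt calibration would then be applied on top of it.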

Reports (`<Atlas>/Release/<Resolution>/Reports/`)

Metrics/

Multiclass_models_confusion_matrix_on_test.csv — multiclass confusion matrix on the held-out test split
Multiclass_models_metrics_on_test.csv — multiclass precision, recall, F1-score, support, and accuracy on the held-out test split
Single_classes_metrics_and_confusion_matrix_on_test.csv — per-class TP/FP/TN/FN and precision/recall/F1/AUC on the held-out test split
Single_classes_metrics_pre_and_post_platt_calibration.csv — per-class LogLoss and Brier score before vs. after Platt calibration

Probabilities/

Multiclass_models_probabilities_on_test.csv — per-cell final multiclass predicted probabilities on the test set after temperature scaling

Importances/

All_classes_hyperparameters.csv — hyperparameters used across all classes
<ClassName>_hyperparameters.csv — per-class hyperparameters for each trained class
Base_learner_agreement.csv — pairwise agreement between base learners
Base_learner_test_performance.csv — individual base learner performance on the test set
CV_fold_scores_per_base_learner.csv — cross-validation fold scores per base learner
KNN_Permutation_Feature_importances.csv — permutation feature importances from KNN
MLP_Permutation_Feature_importances.csv — permutation feature importances from MLP
NB_EffectSize_Feature_importances.csv — effect-size feature importances from Naive Bayes
SHAP_XGB_Feature_importances.csv — SHAP feature importances from XGBoost
LR_MetaLearner_BaseLearner_contributions.csv — logistic regression stacker weights over base learners
Training_and_inference_runtime.csv — wall-clock time for training and inference

Figures (`<Atlas>/Release/<Resolution>/Figures/`)

Multiclass_models_confusion_matrix_on_test.png
Multiclass confusion matrix on the held-out test split.
Multiclass_models_confusion_matrix_on_test_with_percentage_agreement.png (Simplified and Detailed only)
Multiclass confusion matrix with percentage agreement between true and predicted labels.
Multiclass_models_confusion_matrix_on_test_with_percentage_agreement.pdf (Simplified and Detailed only)
PDF version of the above.
Single_classes/
Per-class diagnostic plots:
- <Class>_RAW_confusion_matrix_on_test.png — binary confusion matrix before Platt calibration
- <Class>_RAW_ROC_on_test.png — ROC curve and AUC before Platt calibration
- <Class>_CAL_confusion_matrix_on_test.png — binary confusion matrix after final calibration
- <Class>_CAL_ROC_on_test.png — ROC curve and AUC after final calibration
- <Class>_Platt_calibration_evaluation_on_test.png — calibration curve comparing RAW vs PLATT probabilities
- <Class>_SHAP_beeswarm_TRAIN.png — SHAP beeswarm plot on the training split
- <Class>_Class_Train_data.png (Hao Broad and Simplified only) — UMAP of the training split coloured by class
- <Class>_Class_Train_data_legend.png (Hao Broad and Simplified only) — legend for the UMAP

Cell Types by Atlas and Resolution

Broad (all atlases)

Two classes: Immature, Mature

Simplified

Atlas	Classes
Hao	B, CD4_T, CD8_T, cDC, Erythroid, HSPC, Monocyte, NK, Other_T, pDC, Plasma
Luecken	B, CD4_T, CD8_T, cDC, Erythroid, HSPC, Monocyte, Myeloid, NK, Other_T, pDC, Plasma
Triana	B, CD4_T, CD8_T, cDC, Erythroid, HSPC, Monocyte, Myeloid, NK, Other_T, pDC, Plasma
Zhang	B, CD4_T, CD8_T, cDC, Erythroid, HSPC, Monocyte, Myeloid, NK, pDC, Plasma

Detailed

Atlas	Classes
Hao	B_Memory, B_Naive, CD14_Mono, CD16_Mono, CD4_CTL, CD4_T_Memory, CD4_T_Naive, CD8_T_Memory, CD8_T_Naive, cDC1, cDC2, Erythroblast, GdT, HSC_MPP, Immature_B, MAIT, NK_CD56_bright, NK_CD56_dim, pDC, Plasma, Treg
Luecken	B_Memory, B_Naive, CD14_Mono, CD16_Mono, CD4_CTL, CD4_T_Memory, CD4_T_Naive, CD8_T_Memory, CD8_T_Naive, cDC1, cDC2, ErP, Erythroblast, GdT, GMP, HSC_MPP, Immature_B, MAIT, MEP, Myeloid_precursor, NK_CD56_bright, NK_CD56_dim, pDC, Plasma, Pre-Pro-B, Pro-B, Treg
Triana	B_Memory, B_Naive, CD14_Mono, CD16_Mono, CD4_CTL, CD4_T_Memory, CD4_T_Naive, CD8_T_Memory, CD8_T_Naive, cDC1, cDC2, EoBaMaP, ErP, Erythroblast, GdT, GMP, HSC_MPP, Immature_B, LMPP, MAIT, MEP, MkP, Myeloid_precursor, NK_CD56_bright, NK_CD56_dim, pDC, Plasma, Pre-B, Pre-Pro-B, Pro-B, Treg
Zhang	B_Naive, CD14_Mono, CD16_Mono, CD4_T_Memory, CD4_T_Naive, CD8_T_Memory, CD8_T_Naive, cDC1, cDC2, EoBaMaP, ErP, Erythroblast, GMP, HSC_MPP, Immature_B, LMPP, MAIT, MEP, MkP, Myeloid_precursor, NK_CD56_bright, NK_CD56_dim, pDC, Plasma, Pre-B, Pre-Pro-B, Pro-B

Uses

Direct Use

These models are used by the EspressoPro pipeline to annotate cell types in ADT-only single-cell data from blood and bone marrow mononuclear cells, including Mission Bio Tapestri DNA+ADT datasets.

Bias, Risks, and Limitations

Reference bias: Models were trained on PBMC/BMMC-derived references from healthy human donors; performance may differ in disease or heavily perturbed samples. The models are not expected to work well in other tissues.
Panel dependence: The models require feature alignment to the expected ADT columns; missing or mismatched antibodies can reduce accuracy.
Class coverage: Only classes that could be predicted effectively in at least one of the four atlases were trained. Class availability varies by atlas and resolution (see the tables above).
Interpretation: Probabilities are model-derived and should be validated with marker checks and expected biology.
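
The panel-dependence point can be checked before inference with a small alignment step. The sketch below reorders a cell's features to an expected column order and reports missing and unexpected antibodies. The antibody names and the zero fill value are assumptions for illustration; this card does not describe how EspressoPro itself handles missing antibodies.

```python
def align_features(cell, expected, fill_value=0.0):
    """Reorder one cell's ADT values to the expected columns and report mismatches."""
    expected_set = set(expected)
    missing = [name for name in expected if name not in cell]
    extra = [name for name in cell if name not in expected_set]
    vector = [cell.get(name, fill_value) for name in expected]
    return vector, missing, extra

# Hypothetical expected panel and one cell's CLR-normalised values.
expected_columns = ["CD3", "CD19", "CD14", "CD56"]
cell = {"CD3": 1.2, "CD19": -0.4, "CD56": 0.1, "CD45": 0.9}

vector, missing, extra = align_features(cell, expected_columns)
```

A non-empty `missing` list is a signal that predictions for classes relying on those antibodies deserve extra scrutiny.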

Testing Data, Factors & Metrics

Testing Data

TRAIN: used to train one-vs-rest (OvR) classifiers.
CAL: used only for probability calibration, including per-class Platt calibration and multiclass temperature scaling.
TEST: used only for evaluation.

Note: CAL and TEST include only the classes learned from TRAIN; excluded or unknown labels are removed.
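
The filtering rule in the note can be sketched as follows, using the Hao Simplified class list from this card as the set of trained classes. The example labels, including `Unknown`, are made up, and the helper name `keep_known` is hypothetical.

```python
def keep_known(labels, trained_classes):
    """Return the indices of cells whose label is among the trained classes."""
    known = set(trained_classes)
    return [i for i, label in enumerate(labels) if label in known]

# Hao Simplified classes as listed in this card.
hao_simplified = ["B", "CD4_T", "CD8_T", "cDC", "Erythroid", "HSPC",
                  "Monocyte", "NK", "Other_T", "pDC", "Plasma"]

# Hypothetical labels in a CAL or TEST split.
split_labels = ["B", "NK", "Unknown", "Myeloid", "HSPC"]
kept = keep_known(split_labels, hao_simplified)
```

Here `Unknown` and `Myeloid` are dropped because neither is a trained class for Hao at the Simplified resolution.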

Factors

RAW: OvR probabilities before Platt calibration.
PLATT: OvR probabilities after Platt calibration on CAL, where calibration was possible.
CAL: final multiclass probabilities after temperature scaling, fitted on the CAL split and applied to TEST. As a factor, CAL denotes these final calibrated probabilities, not the CAL split itself.

Metrics

Multiclass prediction metrics (TEST, using final CAL probabilities):

Accuracy
Precision / Recall / F1-score
Support
Confusion matrix

Per-class prediction metrics (TEST, RAW vs final CAL):

Confusion matrix (TP, FP, TN, FN)
Precision, recall, F1-score
ROC curve and AUC
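
For reference, the per-class precision, recall, and F1-score follow directly from the four confusion-matrix counts. The sketch below uses made-up counts, and the zero-division convention (returning 0.0) is an assumption rather than the documented behaviour of the reports.

```python
def binary_metrics(tp, fp, tn, fn):
    """Precision, recall, and F1 for one OvR class from its confusion counts.

    tn is accepted to mirror the four reported counts, but it does not
    enter precision, recall, or F1.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}

# Hypothetical counts for one class on the TEST split.
metrics = binary_metrics(tp=80, fp=20, tn=880, fn=20)
```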

Per-class calibration metrics (TEST, RAW vs PLATT):

LogLoss and Brier score before vs. after Platt calibration
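
Both calibration metrics are simple to compute for a binary OvR head. The sketch below evaluates them on made-up labels and probabilities; the values do not come from the released reports, and the clipping constant `eps` is an assumption.

```python
import math

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

# Hypothetical labels for one class and probabilities before and after Platt calibration.
y = [1, 0, 1, 0]
p_raw = [0.99, 0.60, 0.55, 0.02]
p_platt = [0.90, 0.35, 0.70, 0.05]

raw_scores = (log_loss(y, p_raw), brier_score(y, p_raw))
platt_scores = (log_loss(y, p_platt), brier_score(y, p_platt))
```

Lower is better for both metrics, so a successful calibration should reduce them relative to the RAW probabilities.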
