MINT Stage 2 pMHC-I Stability Model
Model Description
This model predicts the binding stability (half-life) of peptide-MHC class I (pMHC-I) complexes from the peptide sequence and the full MHC-I sequence.
It is a fine-tuned variant of the MINT model introduced in Ullanat et al. 2026,
an ESM2-650M backbone with cross-chain multimer attention, pretrained on protein-protein interactions (PPIs).
This model was produced by transfer learning from the Stage 1 binding affinity model onto pMHC-I stability
(half-life) data (~21.6K training samples), and returns the predicted half-life on a log1p(hours) scale.
Intended uses & limitations
This is a research checkpoint for predicting peptide–MHC class I binding stability (half-life) from sequence alone. It serves as the initialization for the Stage 3 (SPEARMINT) assay-conditioned model and is not intended to be a standalone model. This model was trained exclusively on SPA-assay half-life measurements collected at 37 °C, across roughly 72 HLA class I alleles. As a result, it is calibrated to that single measurement condition. Predictions for other assay modalities (for example fluorescence-based purified or cellular assays) or other temperatures carry systematic shifts that this model does not correct. Likewise, predictions for alleles outside the training distribution, or for complexes far longer-lived than the observed range, are extrapolations and should be treated with caution.
Usage
import math
import torch
from transformers import AutoModel
# Load model
model = AutoModel.from_pretrained("dkarthikeyan1/mint-stage2-stability", trust_remote_code=True)
model.eval()
# Tokenize a peptide-MHC pair
from transformers.dynamic_module_utils import get_class_from_dynamic_module
MintTokenizer = get_class_from_dynamic_module(
    "modeling_mint_stability.MintTokenizer",
    "dkarthikeyan1/mint-stage2-stability",
    trust_remote_code=True,
)
tokenizer = MintTokenizer()
peptide = "GILGFVFTL"
mhc_sequence = "MAVMAPRTLLLLLSGALALTQTWAG..." # full MHC-I heavy chain sequence
chains, chain_ids = tokenizer.prepare_input(peptide, mhc_sequence)
chains = chains.unsqueeze(0) # add batch dim
chain_ids = chain_ids.unsqueeze(0)
# Predict
with torch.no_grad():
    output = model(chains, chain_ids)
log_pred = output["logits"].item() # model outputs log1p(half-life in hours)
predicted_halflife_hours = math.expm1(log_pred)
print(f"Predicted half-life: {predicted_halflife_hours:.2f} hours")
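Because the output is an unbounded scalar on the log1p scale, a slightly negative prediction inverts to a negative number of hours. The helpers below are a minimal sketch of how the inversion might be wrapped. The function names are hypothetical and not part of the released code, the clamp at zero is a convenience choice, and `fraction_remaining` assumes simple first-order dissociation, which the model itself does not guarantee.

```python
import math


def log1p_to_hours(log_pred: float) -> float:
    """Invert the log1p label transform; clamp at 0 because the raw output is unbounded."""
    return max(0.0, math.expm1(log_pred))


def hours_to_log1p(half_life_hours: float) -> float:
    """Forward transform used for the training labels: log(1 + hours)."""
    return math.log1p(half_life_hours)


def fraction_remaining(half_life_hours: float, elapsed_hours: float) -> float:
    """Fraction of complexes still intact after elapsed_hours.

    Assumption: first-order (single-exponential) decay.
    """
    if half_life_hours <= 0:
        return 0.0
    return 0.5 ** (elapsed_hours / half_life_hours)


# Example: a raw model output of 1.5 on the log1p scale
hours = log1p_to_hours(1.5)
print(f"{hours:.2f} h half-life, {fraction_remaining(hours, 4.0):.0%} left after 4 h")
```

Comparing predictions on the log1p scale and converting to hours only for reporting keeps the numbers in the space the model was trained in.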
Batch inference
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
peptides = ["GILGFVFTL", "NLVPMVATV"]
mhc_sequences = ["MAVMAPRTL...", "MAVMAPRTL..."] # full sequences
chains, chain_ids = tokenizer.prepare_batch(peptides, mhc_sequences)
with torch.no_grad():
    output = model(chains.to(device), chain_ids.to(device))
predictions_hours = torch.expm1(output["logits"].squeeze(-1)) # half-life in hours, shape (batch,)
Model Details
Input Format
- Peptide: Standard amino acid sequence (8-15 residues)
- MHC sequence: Full MHC class I heavy chain sequence (~365 residues), NOT pseudo-sequences
- The tokenizer handles concatenation, special tokens (<cls>, <eos>), and chain ID assignment (peptide=0, MHC=1)
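The input constraints above can be checked before tokenization. The following is a minimal sketch; `validate_pmhc_input` and the `MIN_FULL_MHC_LENGTH` threshold are hypothetical and not part of the released tokenizer. The threshold is only a rough heuristic for catching short pseudo-sequences (the NetMHCpan-style pseudo-sequence is 34 residues) passed in place of a full heavy chain.

```python
STANDARD_AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")
MIN_FULL_MHC_LENGTH = 100  # assumption: anything shorter is unlikely to be a full heavy chain


def validate_pmhc_input(peptide: str, mhc_sequence: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means the pair looks usable."""
    problems = []
    if not 8 <= len(peptide) <= 15:
        problems.append(f"peptide length {len(peptide)} is outside 8-15 residues")
    for name, seq in (("peptide", peptide), ("MHC sequence", mhc_sequence)):
        unknown = sorted(set(seq.upper()) - STANDARD_AMINO_ACIDS)
        if unknown:
            problems.append(f"{name} contains non-standard residues: {''.join(unknown)}")
    if len(mhc_sequence) < MIN_FULL_MHC_LENGTH:
        problems.append(
            f"MHC sequence has {len(mhc_sequence)} residues; "
            "expected a full heavy chain (~365), not a pseudo-sequence"
        )
    return problems


# Example: a 3-residue peptide paired with a 34-residue pseudo-sequence is rejected twice
for problem in validate_pmhc_input("GIL", "A" * 34):
    print(problem)
```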
Architecture Details
| Parameter | Value |
|---|---|
| Backbone | ESM2-650M (33 layers, 1280 dim, 20 heads) |
| Multimer attention | Yes (cross-chain) |
| Projection hidden dim | 512 |
| Projection dropout | 0.2 |
| Freeze percent (training) | 0.7 (layers 0-23 frozen) |
| Label transform | log1p(half_life_hours) — apply expm1() to invert |
| Output | Scalar (log1p scale, unbounded) |
| Total parameters | ~814M |
Training Procedure
Preprocessing
Amino acid sequences were standardized to fit the ESM-2 tokenizer. MHC allele names were standardized using mhcgnomes and then mapped to the consensus HLA sequences found in IMGT.
Pre-training
MINT (Ullanat et al. 2026) was pretrained on 96 million physical protein–protein interactions from the STRING database (v12.0), clustered at 50% sequence identity to reduce redundancy, using a masked language modeling objective (15% token masking) with interaction-aware supervision. Operationally, the model receives concatenated protein pair sequences with chain ID labels, enabling the cross-chain attention heads to learn interaction-specific representations. The resulting checkpoint (mint.ckpt) serves as the initialization for all downstream fine-tuning stages.
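To make the pre-training input concrete, here is a toy sketch of a concatenated protein pair with chain ID labels and 15% random masking. It is illustrative only: special tokens are omitted, every selected position is simply replaced by `<mask>`, and the function names are hypothetical. The actual MINT masking scheme and interaction-aware supervision are described in Ullanat et al. 2026 and may differ.

```python
import random

MASK_TOKEN = "<mask>"


def build_pair(seq_a: str, seq_b: str):
    """Concatenate two chains and label each residue with its chain ID (0 or 1)."""
    tokens = list(seq_a) + list(seq_b)
    chain_ids = [0] * len(seq_a) + [1] * len(seq_b)
    return tokens, chain_ids


def mask_tokens(tokens, mask_rate=0.15, seed=0):
    """Replace a random subset of positions (about mask_rate of them) with MASK_TOKEN."""
    if not tokens:
        return [], []
    rng = random.Random(seed)
    n_mask = min(len(tokens), max(1, round(mask_rate * len(tokens))))
    positions = sorted(rng.sample(range(len(tokens)), n_mask))
    masked = list(tokens)
    for i in positions:
        masked[i] = MASK_TOKEN
    return masked, positions


tokens, chain_ids = build_pair("GILGFVFTLA", "MAVMAPRTLL")  # toy 10-residue chains
masked, positions = mask_tokens(tokens)
print(masked)
print(chain_ids)
```

The training objective is then to recover the original residues at the masked positions, with attention free to use context from both chains.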
Finetuning
Stage 2 starts from the Stage 1 binding-affinity checkpoint and fine-tunes it on peptide–MHC stability data. The task is framed as regression with a mean-squared-error loss against log1p(half-life hours) labels (--log_transform). The mean-pooled pMHC representation is passed through a small projection head (Linear → ReLU → Dropout → Linear) to produce a scalar. Optimization uses AdamW with gradient-norm clipping (1.0) and a ReduceLROnPlateau schedule. The lower 70% of backbone layers are frozen (freeze_percent=0.7), leaving about 222M of the 814M parameters trainable. Full hyperparameters are in the manuscript and its accompanying Supplementary Information (SI).
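The head and optimization setup described above can be sketched in PyTorch as follows. This is not the released training code: the `ProjectionHead` class name, the learning rate, the scheduler settings, and the choice to pool over all non-padding tokens are assumptions, and random tensors stand in for backbone embeddings. Only the layer sizes (1280 → 512 → 1), dropout (0.2), MSE loss, AdamW, clipping at 1.0, and ReduceLROnPlateau come from this card.

```python
import torch
from torch import nn


class ProjectionHead(nn.Module):
    """Masked mean-pool over tokens, then Linear -> ReLU -> Dropout -> Linear to a scalar."""

    def __init__(self, embed_dim: int = 1280, hidden_dim: int = 512, dropout: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, token_embeddings: torch.Tensor, token_mask: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, seq, dim); token_mask: (batch, seq), True for real tokens
        mask = token_mask.unsqueeze(-1).to(token_embeddings.dtype)
        pooled = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.net(pooled).squeeze(-1)  # (batch,)


torch.manual_seed(0)
head = ProjectionHead()
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)  # lr is an assumption
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=2  # assumed settings
)
loss_fn = nn.MSELoss()

# Random stand-ins for backbone outputs: 4 complexes, 12 tokens each, last 2 are padding
embeddings = torch.randn(4, 12, 1280)
token_mask = torch.ones(4, 12, dtype=torch.bool)
token_mask[:, 10:] = False
targets = torch.log1p(torch.tensor([0.5, 2.0, 8.0, 24.0]))  # log1p(half-life in hours)

# One training step
head.train()
optimizer.zero_grad()
preds = head(embeddings, token_mask)
loss = loss_fn(preds, targets)
loss.backward()
torch.nn.utils.clip_grad_norm_(head.parameters(), max_norm=1.0)
optimizer.step()
scheduler.step(loss.item())  # a real run would pass the validation loss here
```

In a full setup the frozen backbone layers would have `requires_grad` set to `False` so that only the upper layers and the head receive updates.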
Citation
@article{dkarthikeyan2026stability,
author = {Karthikeyan, Dhuvarakesh and Vincent, Benjamin and Rubinsteyn, Alexander},
title = {Peptide:MHC Binding Stability Prediction Using Protein Language Models},
elocation-id = {2026.06.28.735023},
year = {2026},
doi = {10.64898/2026.06.28.735023},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Peptide:MHC class I (pMHC-I) binding stability governs the persistence of antigenic complexes at the cell surface and plays a key role in facilitating downstream immunological signals such as antigen presentation, T-cell activation, and immunodominance. However, methods for in silico stability prediction remain underexplored relative to binding affinity prediction, in part because available half-life datasets are sparse and expensive to collect. Here, we perform a systematic reassessment of pMHC-I stability prediction using controlled, similarity-aware data splits and apply a recently introduced supervised transfer-learning strategy to MINT, an interaction-aware protein language model, pretrained on binding affinity and fine-tuned for quantitative half-life prediction. We show that MINT improves stability prediction over standard ESM-2 representations and existing predictors, and that assay-conditioned recalibration corrects systematic shifts across experimental measurement modalities. Across eluted ligand, immunogenicity, and personalized neoantigen prioritization benchmarks, predicted stability provides signal beyond binding affinity, enriching for naturally presented and immunogenic peptides within affinity-filtered candidate sets. These results establish pMHC-I half-life as an orthogonal and transferable biophysical signal connecting peptide binding, surface presentation, and T-cell recognition, and provide a leakage-aware, assay-aware framework for future antigen-presentation modeling. Competing Interest Statement: The authors have declared no competing interest.},
URL = {https://www.biorxiv.org/content/early/2026/06/29/2026.06.28.735023},
eprint = {https://www.biorxiv.org/content/early/2026/06/29/2026.06.28.735023.full.pdf},
journal = {bioRxiv}
}
License
MIT License. See the MINT repository for the original codebase.