MINT Stage 2 pMHC-I Stability Model

Model Description

This model predicts the binding stability (half-life) of peptide-MHC class I (pMHC-I) complexes from the peptide and the full MHC-I sequence. It is a fine-tuned variant of MINT (Ullanat et al. 2026), an ESM2-650M model with cross-chain multimer attention that was pretrained on protein–protein interactions (PPIs). It was produced by transfer learning from the Stage 1 binding affinity model onto pMHC-I stability (half-life) data (~21.6K training samples) and returns the predicted half-life on a log1p(hours) scale.

Intended uses & limitations

This is a research checkpoint for predicting peptide–MHC class I binding stability (half-life) from sequence alone. It serves as the initialization for the Stage 3 (SPEARMINT) assay-conditioned model and is not intended to be a standalone model. This model was trained exclusively on SPA-assay half-life measurements collected at 37 °C, across roughly 72 HLA class I alleles. As a result, it is calibrated to that single measurement condition. Predictions for other assay modalities (for example fluorescence-based purified or cellular assays) or other temperatures carry systematic shifts that this model does not correct. Likewise, predictions for alleles outside the training distribution, or for complexes far longer-lived than the observed range, are extrapolations and should be treated with caution.

Usage

import math
import torch
from transformers import AutoModel

# Load model
model = AutoModel.from_pretrained("dkarthikeyan1/mint-stage2-stability", trust_remote_code=True)
model.eval()

# Tokenize a peptide-MHC pair
from transformers.dynamic_module_utils import get_class_from_dynamic_module
MintTokenizer = get_class_from_dynamic_module(
    "modeling_mint_stability.MintTokenizer",
    "dkarthikeyan1/mint-stage2-stability",
    trust_remote_code=True,
)
tokenizer = MintTokenizer()
peptide = "GILGFVFTL"
mhc_sequence = "MAVMAPRTLLLLLSGALALTQTWAG..."  # truncated here; replace with the full MHC-I heavy chain sequence

chains, chain_ids = tokenizer.prepare_input(peptide, mhc_sequence)
chains = chains.unsqueeze(0)        # add batch dim
chain_ids = chain_ids.unsqueeze(0)

# Predict
with torch.no_grad():
    output = model(chains, chain_ids)
    log_pred = output["logits"].item()              # model outputs log1p(half-life in hours)
    predicted_halflife_hours = math.expm1(log_pred)

print(f"Predicted half-life: {predicted_halflife_hours:.2f} hours")

Batch inference

import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

peptides = ["GILGFVFTL", "NLVPMVATV"]
mhc_sequences = ["MAVMAPRTL...", "MAVMAPRTL..."]  # truncated here; replace with the full sequences

chains, chain_ids = tokenizer.prepare_batch(peptides, mhc_sequences)
with torch.no_grad():
    output = model(chains.to(device), chain_ids.to(device))
    predictions_hours = torch.expm1(output["logits"].squeeze(-1))   # half-life in hours, shape (batch,)

Model Details

Input Format

Peptide: Standard amino acid sequence (8-15 residues)
MHC sequence: Full MHC class I heavy chain sequence (~365 residues), NOT pseudo-sequences
The tokenizer handles concatenation, special tokens (<cls>, <eos>), and chain ID assignment (peptide=0, MHC=1)
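The input requirements above can be checked before tokenization. The following is a minimal sketch, not part of the released code: validate_pmhc_input and the MIN_FULL_MHC_LENGTH threshold of 200 residues are hypothetical, chosen only to catch pseudo-sequences, which are much shorter than a full heavy chain.

```python
from typing import Tuple

STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")
MIN_FULL_MHC_LENGTH = 200  # assumed heuristic, not a value from the model card


def validate_pmhc_input(peptide: str, mhc_sequence: str) -> Tuple[str, str]:
    """Normalize and sanity-check a peptide/MHC pair (hypothetical helper)."""
    peptide = peptide.strip().upper()
    mhc_sequence = mhc_sequence.strip().upper()

    # The model card states peptides of 8-15 residues.
    if not 8 <= len(peptide) <= 15:
        raise ValueError(f"peptide length {len(peptide)} is outside 8-15 residues")

    # Reject anything outside the 20 standard amino acids, including "..." placeholders.
    for name, seq in (("peptide", peptide), ("MHC sequence", mhc_sequence)):
        unexpected = sorted(set(seq) - STANDARD_AMINO_ACIDS)
        if unexpected:
            raise ValueError(f"{name} contains non-standard characters: {unexpected}")

    # A very short MHC string is probably a pseudo-sequence, not the full heavy chain.
    if len(mhc_sequence) < MIN_FULL_MHC_LENGTH:
        raise ValueError(
            f"MHC sequence has {len(mhc_sequence)} residues; a full heavy chain is expected"
        )
    return peptide, mhc_sequence
```

Running this check first surfaces truncated placeholder sequences, such as those shown in the usage snippets, before they reach the tokenizer.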

Architecture Details

Parameter	Value
Backbone	ESM2-650M (33 layers, 1280 dim, 20 heads)
Multimer attention	Yes (cross-chain)
Projection hidden dim	512
Projection dropout	0.2
Freeze percent (training)	0.7 (layers 0-23 frozen)
Label transform	`log1p(half_life_hours)` — apply `expm1()` to invert
Output	Scalar (log1p scale, unbounded)
Total parameters	~814M
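The label transform in the table can be illustrated with a short round trip using only the standard library. This is a sketch with hypothetical helper names. Because the output is unbounded, a negative prediction inverts to a negative half-life; clamping at zero, as shown in to_hours_clamped, is an optional post-processing choice assumed here and is not something the model card prescribes.

```python
import math


def to_model_scale(half_life_hours: float) -> float:
    """Map a half-life in hours to the log1p scale used for training labels."""
    return math.log1p(half_life_hours)


def to_hours(log_pred: float) -> float:
    """Invert the label transform: log1p scale back to hours."""
    return math.expm1(log_pred)


def to_hours_clamped(log_pred: float) -> float:
    """Same inversion, with negative half-lives clamped to zero (assumed choice)."""
    return max(0.0, math.expm1(log_pred))


label = to_model_scale(2.5)           # log(1 + 2.5)
recovered = to_hours(label)           # back to 2.5 hours, up to float error
negative = to_hours(-0.3)             # a negative output gives a negative half-life
clamped = to_hours_clamped(-0.3)      # 0.0 after clamping
```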

Training Procedure

Preprocessing

Amino acid sequences were standardized to match the ESM-2 tokenizer vocabulary. MHC allele names were standardized using mhcgnomes and then mapped to the consensus HLA as found in IMGT.

Pre-training

MINT (Ullanat et al. 2026) was pretrained on 96 million physical protein–protein interactions from the STRING database (v12.0), clustered at 50% sequence identity to reduce redundancy, using a masked language modeling objective (15% token masking) with interaction-aware supervision. Operationally, the model receives concatenated protein pair sequences with chain ID labels, enabling the cross-chain attention heads to learn interaction-specific representations. The resulting checkpoint (mint.ckpt) serves as the initialization for all downstream fine-tuning stages.
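The input layout described above (two concatenated sequences, per-token chain IDs, 15% of tokens masked) can be sketched in plain Python. This is an illustration under stated assumptions, not MINT's implementation: build_masked_pair and the "<mask>" symbol are hypothetical, special tokens are omitted, and every selected position is replaced by the mask symbol, which is a simplification of typical masked language modeling recipes.

```python
import random
from typing import List, Tuple

MASK_TOKEN = "<mask>"  # assumed mask symbol, for illustration only


def build_masked_pair(
    seq_a: str, seq_b: str, mask_fraction: float = 0.15, seed: int = 0
) -> Tuple[List[str], List[int], List[int]]:
    """Concatenate two chains, label each token with its chain, and mask a fraction."""
    tokens = list(seq_a) + list(seq_b)
    chain_ids = [0] * len(seq_a) + [1] * len(seq_b)

    # Pick the positions to mask; a seeded generator keeps the example reproducible.
    rng = random.Random(seed)
    n_mask = max(1, round(mask_fraction * len(tokens)))
    masked_positions = sorted(rng.sample(range(len(tokens)), n_mask))

    masked_tokens = tokens.copy()
    for pos in masked_positions:
        masked_tokens[pos] = MASK_TOKEN
    return masked_tokens, chain_ids, masked_positions


masked_tokens, chain_ids, masked_positions = build_masked_pair("ACDEFGHIKL", "MNPQRSTVWY")
```

The training target would be the original residue at each masked position. The chain IDs are what allow attention to distinguish within-chain from cross-chain token pairs.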

Fine-tuning

Stage 2 transfer-learns from the Stage 1 binding-affinity checkpoint and fine-tunes on peptide–MHC stability, framed as a regression task with mean-squared-error loss against log1p(half-life hours) labels (--log_transform). The mean-pooled pMHC representation is passed through a small projection head (Linear → ReLU → Dropout → Linear) to a scalar. Optimization uses AdamW with gradient-norm clipping (1.0) and a ReduceLROnPlateau schedule; the lower 70% of backbone layers are frozen (freeze_percent=0.7), leaving 222M of 814M parameters trainable. Full hyperparameters are in the manuscript and its accompanying Supplementary Information (SI).
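The head and optimization step described above might look roughly like the following PyTorch sketch. It is not the authors' training code: the StabilityHead class name, the masked mean pooling over non-padding tokens, the learning rate, and the random stand-in embeddings are all assumptions. Only the layer order, the 1280/512 dimensions, the 0.2 dropout, the MSE loss on log1p labels, AdamW, clipping at 1.0, and ReduceLROnPlateau come from this card.

```python
import torch
from torch import nn


class StabilityHead(nn.Module):
    """Mean-pool token embeddings, then Linear -> ReLU -> Dropout -> Linear to a scalar."""

    def __init__(self, embed_dim: int = 1280, hidden_dim: int = 512, dropout: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, token_embeddings: torch.Tensor, padding_mask: torch.Tensor) -> torch.Tensor:
        # token_embeddings: (batch, length, embed_dim); padding_mask: (batch, length), True = real token
        mask = padding_mask.unsqueeze(-1).to(token_embeddings.dtype)
        pooled = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.net(pooled).squeeze(-1)  # (batch,), on the log1p(hours) scale


head = StabilityHead()
optimizer = torch.optim.AdamW(head.parameters(), lr=1e-4)  # learning rate is an assumption
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min")
loss_fn = nn.MSELoss()

# Random tensors stand in for backbone outputs; real embeddings come from the MINT encoder.
embeddings = torch.randn(4, 20, 1280)
padding_mask = torch.ones(4, 20, dtype=torch.bool)
half_life_hours = torch.tensor([0.5, 2.0, 8.0, 24.0])

predictions = head(embeddings, padding_mask)
loss = loss_fn(predictions, torch.log1p(half_life_hours))  # regress against log1p labels

optimizer.zero_grad()
loss.backward()
torch.nn.utils.clip_grad_norm_(head.parameters(), max_norm=1.0)  # gradient-norm clipping
optimizer.step()
scheduler.step(loss.item())  # in practice this would be a validation loss
```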

Citation

@article{dkarthikeyan2026stability,
    author = {Karthikeyan, Dhuvarakesh and Vincent, Benjamin and Rubinsteyn, Alexander},
    title = {Peptide:MHC Binding Stability Prediction Using Protein Language Models},
    elocation-id = {2026.06.28.735023},
    year = {2026},
    doi = {10.64898/2026.06.28.735023},
    publisher = {Cold Spring Harbor Laboratory},
    abstract = {Peptide:MHC class I (pMHC-I) binding stability governs the persistence of antigenic complexes at the cell surface and plays a key role in facilitating downstream immunological signals such as antigen presentation, T-cell activation, and immunodominance. However, methods for in silico stability prediction remain underexplored relative to binding affinity prediction, in part because available half-life datasets are sparse and expensive to collect. Here, we perform a systematic reassessment of pMHC-I stability prediction using controlled, similarity-aware data splits and apply a recently introduced supervised transfer-learning strategy to MINT, an interaction-aware protein language model, pretrained on binding affinity and fine-tuned for quantitative half-life prediction. We show that MINT improves stability prediction over standard ESM-2 representations and existing predictors, and that assay-conditioned recalibration corrects systematic shifts across experimental measurement modalities. Across eluted ligand, immunogenicity, and personalized neoantigen prioritization benchmarks, predicted stability provides signal beyond binding affinity, enriching for naturally presented and immunogenic peptides within affinity-filtered candidate sets. These results establish pMHC-I half-life as an orthogonal and transferable biophysical signal connecting peptide binding, surface presentation, and T-cell recognition, and provide a leakage-aware, assay-aware framework for future antigen-presentation modeling. Competing Interest Statement: The authors have declared no competing interest.},
    URL = {https://www.biorxiv.org/content/early/2026/06/29/2026.06.28.735023},
    eprint = {https://www.biorxiv.org/content/early/2026/06/29/2026.06.28.735023.full.pdf},
    journal = {bioRxiv}
}

License

MIT License. See the MINT repository for the original codebase.
