MINT Stage 1 pMHC-I Binding Affinity Model

Model Description

This model predicts the binding affinity of peptide-MHC class I (pMHC-I) complexes from the peptide sequence and the full MHC-I sequence. It is a fine-tuned variant of MINT (Ullanat et al. 2026), an ESM2-650M model extended with cross-chain multimer attention and pretrained on protein-protein interactions (PPIs). It was fine-tuned on NetMHCpan 4.1 binding affinity data (~126K samples) and returns a predicted binding affinity score in [0, 1] (log50k-normalized; higher = stronger binding). It is released alongside the paper listed under Citation.

Intended uses & limitations

This model is intended to demonstrate the potential of transfer learning: it is the affinity-trained model from which stability prediction is subsequently learned. It assumes that a plausible pMHC is provided as input; we have not tested it on inputs where either the peptide or the MHC sequence is substantially corrupted or out of distribution. The model has NOT been vetted for calibration beyond minimizing validation loss, so it should ONLY be used for academic purposes and NOT in a clinical setting.

Usage

import torch
from transformers import AutoModel

# Load model
model = AutoModel.from_pretrained("dkarthikeyan1/mint-stage1-affinity", trust_remote_code=True)
model.eval()

# Load the custom tokenizer class that ships with the model repository
from transformers.dynamic_module_utils import get_class_from_dynamic_module
MintTokenizer = get_class_from_dynamic_module(
    "modeling_mint_stability.MintTokenizer",
    "dkarthikeyan1/mint-stage1-affinity",
    trust_remote_code=True,
)
tokenizer = MintTokenizer()

# Tokenize a peptide-MHC pair
peptide = "GILGFVFTL"
mhc_sequence = "MAVMAPRTLLLLLSGALALTQTWAG..."  # truncated here; pass the full MHC-I heavy chain sequence

chains, chain_ids = tokenizer.prepare_input(peptide, mhc_sequence)
chains = chains.unsqueeze(0)        # add batch dim
chain_ids = chain_ids.unsqueeze(0)

# Predict
with torch.no_grad():
    output = model(chains, chain_ids)
    predicted_affinity = output["logits"].item() # normalized BA score in [0,1], higher = stronger binder

print(f"Predicted binding affinity: {predicted_affinity:.4f}")

Batch inference

import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

peptides = ["GILGFVFTL", "NLVPMVATV"]
mhc_sequences = ["MAVMAPRTL...", "MAVMAPRTL..."]  # full sequences

chains, chain_ids = tokenizer.prepare_batch(peptides, mhc_sequences)
with torch.no_grad():
    output = model(chains.to(device), chain_ids.to(device))
    predictions = output["logits"].squeeze(-1)     # shape (batch,)

Model Details

Input Format

Peptide: Standard amino acid sequence (8-15 residues typical for MHC-I)
MHC sequence: Full MHC class I heavy chain sequence (~365 residues), NOT pseudo-sequences
The tokenizer handles concatenation, special tokens (<cls>, <eos>), and chain ID assignment (peptide=0, MHC=1)
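
Because the model assumes a plausible pMHC pair, it can help to screen inputs before tokenization. The following is a minimal sketch, not part of the released code: the helper name `check_pmhc_input`, the length bounds, and the `min_mhc_len` cutoff are illustrative assumptions based on the input description above.

```python
# Hypothetical pre-flight check; not part of the MINT codebase.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def check_pmhc_input(peptide, mhc_sequence, min_len=8, max_len=15, min_mhc_len=100):
    """Return a list of problems found; an empty list means the pair looks plausible.

    min_mhc_len is an arbitrary illustrative cutoff meant to catch
    pseudo-sequences passed in place of a full heavy chain (~365 residues).
    """
    problems = []
    for name, seq in (("peptide", peptide), ("MHC sequence", mhc_sequence)):
        unexpected = sorted(set(seq.upper()) - STANDARD_AMINO_ACIDS)
        if unexpected:
            problems.append(f"{name} contains non-standard residues: {unexpected}")
    if not min_len <= len(peptide) <= max_len:
        problems.append(f"peptide length {len(peptide)} is outside {min_len}-{max_len}")
    if len(mhc_sequence) < min_mhc_len:
        problems.append("MHC sequence is short; was a pseudo-sequence passed instead?")
    return problems
```

An empty return value only means the pair passed these surface checks; it says nothing about whether the pair lies within the training distribution.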

Architecture

Parameter	Value
Backbone	ESM2-650M (33 layers, 1280 dim, 20 heads)
Multimer attention	Yes (cross-chain)
Projection hidden dim	512
Projection dropout	0.2
Freeze percent (training)	0.5 (layers 0-16 frozen)
Output	Scalar binding affinity score
Total parameters	~814M
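
To make the table concrete, here is a rough PyTorch sketch of a regression head with these dimensions (1280-dimensional embeddings, hidden size 512, dropout 0.2, scalar output), using the Linear, ReLU, Dropout, Linear layout described under Finetuning. The class name `ProjectionHead`, the `padding_mask` argument, and the choice to average over all non-padding tokens are assumptions for illustration; the released implementation may pool differently.

```python
import torch
from torch import nn


class ProjectionHead(nn.Module):
    """Illustrative mean-pool + MLP head; not the released implementation."""

    def __init__(self, embed_dim=1280, hidden_dim=512, dropout=0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, token_embeddings, padding_mask):
        # token_embeddings: (batch, seq_len, embed_dim)
        # padding_mask: bool (batch, seq_len), True for real (non-padding) tokens
        mask = padding_mask.unsqueeze(-1).to(token_embeddings.dtype)
        summed = (token_embeddings * mask).sum(dim=1)
        counts = mask.sum(dim=1).clamp(min=1.0)  # avoid division by zero
        pooled = summed / counts                 # (batch, embed_dim)
        return self.net(pooled).squeeze(-1)      # (batch,)
```

Masking before averaging keeps padded positions from diluting the pooled representation when sequences in a batch differ in length.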

Training procedure

Preprocessing

Amino acids were standardized to fit the ESM-2 tokenizer. MHC allele names were standardized using mhcgnomes before each allele was mapped to its consensus HLA as found in IMGT.

Pre-training

MINT (Ullanat et al. 2026) was pretrained on 96 million physical protein–protein interactions from the STRING database (v12.0), clustered at 50% sequence identity to reduce redundancy, using a masked language modeling objective (15% token masking) with interaction-aware supervision. Operationally, the model receives concatenated protein pair sequences with chain ID labels, enabling the cross-chain attention heads to learn interaction-specific representations. The resulting checkpoint (mint.ckpt) serves as the initialization for all downstream fine-tuning stages.

Finetuning

Stage 1 fine-tunes the MINT backbone on peptide–MHC binding affinity, framed as a regression task with mean-squared-error loss against the normalized affinity score 1 − log(IC50_nM) / log(50000) ∈ [0, 1]. The mean-pooled pMHC representation is passed through a small projection head (Linear → ReLU → Dropout → Linear) to a scalar. Optimization uses AdamW with gradient-norm clipping (1.0) and a ReduceLROnPlateau schedule; the lower 50% of backbone layers were frozen (freeze_percent=0.5), leaving 394 M / 814 M parameters trainable. Full hyperparameters are in the manuscript and its accompanying Supplementary Information (SI).
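
The affinity normalization above can be written as a pair of helper functions, which is also handy for converting a model output back to an approximate IC50 in nM. This is a small sketch: the function names are hypothetical, and clipping to [0, 1] is an assumption made here so that IC50 values below 1 nM or above 50,000 nM stay in range.

```python
import math

MAX_IC50_NM = 50000.0  # normalization constant from the formula above


def ic50_to_score(ic50_nm):
    """Map IC50 in nM to 1 - log(IC50) / log(50000), clipped to [0, 1]."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    score = 1.0 - math.log(ic50_nm) / math.log(MAX_IC50_NM)
    return min(1.0, max(0.0, score))  # clipping is an assumption of this sketch


def score_to_ic50(score):
    """Invert the normalization: IC50 = 50000 ** (1 - score)."""
    return MAX_IC50_NM ** (1.0 - score)
```

Under this mapping, an IC50 of 500 nM corresponds to a score of about 0.426 and 50 nM to about 0.638, so a higher score means a stronger binder.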

Citation

@article{dkarthikeyan2026stability,
    author = {Karthikeyan, Dhuvarakesh and Vincent, Benjamin and Rubinsteyn, Alexander},
    title = {Peptide:MHC Binding Stability Prediction Using Protein Language Models},
    elocation-id = {2026.06.28.735023},
    year = {2026},
    doi = {10.64898/2026.06.28.735023},
    publisher = {Cold Spring Harbor Laboratory},
    abstract = {Peptide:MHC class I (pMHC-I) binding stability governs the persistence of antigenic complexes at the cell surface and plays a key role in facilitating downstream immunological signals such as antigen presentation, T-cell activation, and immunodominance. However, methods for in silico stability prediction remain underexplored relative to binding affinity prediction, in part because available half-life datasets are sparse and expensive to collect. Here, we perform a systematic reassessment of pMHC-I stability prediction using controlled, similarity-aware data splits and apply a recently introduced supervised transfer-learning strategy to MINT, an interaction-aware protein language model, pretrained on binding affinity and fine-tuned for quantitative half-life prediction. We show that MINT improves stability prediction over standard ESM-2 representations and existing predictors, and that assay-conditioned recalibration corrects systematic shifts across experimental measurement modalities. Across eluted ligand, immunogenicity, and personalized neoantigen prioritization benchmarks, predicted stability provides signal beyond binding affinity, enriching for naturally presented and immunogenic peptides within affinity-filtered candidate sets. These results establish pMHC-I half-life as an orthogonal and transferable biophysical signal connecting peptide binding, surface presentation, and T-cell recognition, and provide a leakage-aware, assay-aware framework for future antigen-presentation modeling.},
    URL = {https://www.biorxiv.org/content/early/2026/06/29/2026.06.28.735023},
    eprint = {https://www.biorxiv.org/content/early/2026/06/29/2026.06.28.735023.full.pdf},
    journal = {bioRxiv}
}

License

MIT License. See the MINT repository for the original codebase.
