MINT Stage 1 pMHC-I Binding Affinity Model
Model Description
This model predicts the binding affinity of peptide-MHC class I (pMHC-I) complexes from the peptide and the full MHC-I sequence. It is a fine-tuned variant of the MINT model introduced in Ullanat et al. 2026, an ESM2-650M backbone with cross-chain multimer attention that was pretrained on protein–protein interactions (PPIs). The model was fine-tuned on NetMHCpan 4.1 binding affinity data (~126K samples) and returns predicted binding affinity scores on a [0, 1] scale (log50k-normalized; higher = stronger binding). It is released along with this paper.
Intended uses & limitations
This model is intended to demonstrate the potential of transfer learning from an affinity-trained model to stability prediction. It assumes that a plausible pMHC is provided as input; we have not tested it on inputs where the peptide or the MHC sequence is substantially corrupted or out of distribution. The model is NOT vetted for calibration beyond minimized validation loss, so it should ONLY be used for academic purposes and should NOT be used in a clinical setting.
Usage
import torch
from transformers import AutoModel
# Load model
model = AutoModel.from_pretrained("dkarthikeyan1/mint-stage1-affinity", trust_remote_code=True)
model.eval()
# Tokenize a peptide-MHC pair
from transformers.dynamic_module_utils import get_class_from_dynamic_module
MintTokenizer = get_class_from_dynamic_module(
    "modeling_mint_stability.MintTokenizer",
    "dkarthikeyan1/mint-stage1-affinity",
    trust_remote_code=True,
)
tokenizer = MintTokenizer()
peptide = "GILGFVFTL"
mhc_sequence = "MAVMAPRTLLLLLSGALALTQTWAG..." # full MHC-I heavy chain sequence
chains, chain_ids = tokenizer.prepare_input(peptide, mhc_sequence)
chains = chains.unsqueeze(0) # add batch dim
chain_ids = chain_ids.unsqueeze(0)
# Predict
with torch.no_grad():
    output = model(chains, chain_ids)
predicted_affinity = output["logits"].item()  # normalized BA score in [0, 1], higher = stronger binder
print(f"Predicted binding affinity: {predicted_affinity:.4f}")
Batch inference
import torch
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
peptides = ["GILGFVFTL", "NLVPMVATV"]
mhc_sequences = ["MAVMAPRTL...", "MAVMAPRTL..."] # full sequences
chains, chain_ids = tokenizer.prepare_batch(peptides, mhc_sequences)
with torch.no_grad():
    output = model(chains.to(device), chain_ids.to(device))
predictions = output["logits"].squeeze(-1)  # shape (batch,)
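For lists too large to fit in a single forward pass, the batch call above can be wrapped in a simple chunking loop. The sketch below is an illustration rather than part of the released code: `predict_in_chunks`, `predict_fn`, and `chunk_size` are hypothetical names, and the commented wrapper only shows how the `tokenizer`, `model`, and `device` objects from the snippet above might be plugged in.

```python
from typing import Callable, List, Sequence


def predict_in_chunks(
    peptides: Sequence[str],
    mhc_sequences: Sequence[str],
    predict_fn: Callable[[Sequence[str], Sequence[str]], List[float]],
    chunk_size: int = 64,  # hypothetical default; tune to available memory
) -> List[float]:
    """Run predict_fn over fixed-size chunks and concatenate the scores."""
    if len(peptides) != len(mhc_sequences):
        raise ValueError("peptides and mhc_sequences must have the same length")
    scores: List[float] = []
    for start in range(0, len(peptides), chunk_size):
        stop = start + chunk_size
        scores.extend(predict_fn(peptides[start:stop], mhc_sequences[start:stop]))
    return scores


# A wrapper around the model could look like this (assumes the tokenizer,
# model, and device defined in the batch inference snippet):
#
# def mint_predict(peps, mhcs):
#     chains, chain_ids = tokenizer.prepare_batch(peps, mhcs)
#     with torch.no_grad():
#         out = model(chains.to(device), chain_ids.to(device))
#     return out["logits"].squeeze(-1).tolist()

# Stand-in predictor so the sketch runs without downloading the model.
def dummy_predict(peps, mhcs):
    return [float(len(p)) for p in peps]


chunked_scores = predict_in_chunks(
    ["GILGFVFTL", "NLVPMVATV", "AAAAAAAAAA"],
    ["MHC1", "MHC2", "MHC3"],  # placeholders, not real MHC sequences
    dummy_predict,
    chunk_size=2,
)
```

Passing the predictor as a callable keeps the loop independent of the model, so it can be exercised with a stand-in function first.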
Model Details
Input Format
- Peptide: Standard amino acid sequence (8-15 residues typical for MHC-I)
- MHC sequence: Full MHC class I heavy chain sequence (~365 residues), NOT pseudo-sequences
- The tokenizer handles concatenation, special tokens (<cls>, <eos>), and chain ID assignment (peptide=0, MHC=1)
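Because the model expects a plausible peptide and a full heavy chain rather than a pseudo-sequence, a lightweight pre-check before tokenization can catch obvious input mistakes. The following is a minimal sketch, not part of the released tokenizer: `check_pmhc_input` and the `min_mhc_len` cutoff are hypothetical, and the length bounds simply mirror the 8-15 residue range listed above.

```python
from typing import List

STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard amino acids


def check_pmhc_input(peptide: str, mhc_sequence: str, min_mhc_len: int = 100) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if not 8 <= len(peptide) <= 15:
        problems.append(f"peptide length {len(peptide)} is outside the typical 8-15 range")
    bad_pep = sorted(set(peptide) - STANDARD_AA)
    if bad_pep:
        problems.append(f"peptide contains non-standard residues: {bad_pep}")
    bad_mhc = sorted(set(mhc_sequence) - STANDARD_AA)
    if bad_mhc:
        problems.append(f"MHC sequence contains non-standard residues: {bad_mhc}")
    # min_mhc_len is an assumed cutoff: full heavy chains are ~365 residues,
    # so a much shorter input is probably a pseudo-sequence or a fragment.
    if len(mhc_sequence) < min_mhc_len:
        problems.append(
            f"MHC sequence has only {len(mhc_sequence)} residues; expected a full heavy chain"
        )
    return problems


# "A" * 365 is a placeholder of realistic length, not a real MHC sequence.
issues = check_pmhc_input("GILGFVFTL", "A" * 365)
```

A check like this only screens for formatting problems; it says nothing about whether the pair is biologically plausible.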
Architecture
| Parameter | Value |
|---|---|
| Backbone | ESM2-650M (33 layers, 1280 dim, 20 heads) |
| Multimer attention | Yes (cross-chain) |
| Projection hidden dim | 512 |
| Projection dropout | 0.2 |
| Freeze percent (training) | 0.5 (layers 0-16 frozen) |
| Output | Scalar binding affinity score |
| Total parameters | ~814M |
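To make the projection settings in the table concrete, the sketch below builds a head with the listed dimensions (1280 to 512 to 1, dropout 0.2) on top of masked mean pooling. It is an illustrative reconstruction, not the released implementation: the class name `AffinityHead`, the masking details, and the absence of an output activation are assumptions.

```python
import torch
from torch import nn


class AffinityHead(nn.Module):
    """Hypothetical head: masked mean pooling, then Linear -> ReLU -> Dropout -> Linear."""

    def __init__(self, embed_dim: int = 1280, hidden_dim: int = 512, dropout: float = 0.2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, hidden: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, length, embed_dim); mask: (batch, length), 1 for real tokens.
        weights = mask.unsqueeze(-1).to(hidden.dtype)
        pooled = (hidden * weights).sum(dim=1) / weights.sum(dim=1).clamp(min=1.0)
        return self.net(pooled)  # (batch, 1)


head = AffinityHead().eval()
hidden = torch.randn(2, 12, 1280)  # random stand-in for backbone embeddings
mask = torch.ones(2, 12)
mask[1, 9:] = 0  # pretend the second sequence is padded after 9 tokens
with torch.no_grad():
    scores = head(hidden, mask)

n_params = sum(p.numel() for p in head.parameters())
```

With these dimensions the head itself holds well under one million parameters, so nearly all of the ~814M total sits in the backbone.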
Training procedure
Preprocessing
Amino acids were standardized to fit the ESM-2 tokenizer. MHC allele information was standardized using mhcgnomes (available here) before mapping allele information to the consensus HLA as found in IMGT.
Pre-training
MINT (Ullanat et al. 2026) was pretrained on 96 million physical protein–protein interactions from the STRING database (v12.0), clustered at 50% sequence identity to reduce redundancy, using a masked language modeling objective (15% token masking) with interaction-aware supervision. Operationally, the model receives concatenated protein pair sequences with chain ID labels, enabling the cross-chain attention heads to learn interaction-specific representations. The resulting checkpoint (mint.ckpt) serves as the initialization for all downstream fine-tuning stages.
Finetuning
Stage 1 fine-tunes the MINT backbone on peptide–MHC binding affinity, framed as a regression task with
mean-squared-error loss against the normalized affinity score 1 − log(IC50_nM) / log(50000) ∈ [0, 1].
The mean-pooled pMHC representation is passed through a small projection head (Linear → ReLU → Dropout → Linear)
to a scalar. Optimization uses AdamW with gradient-norm clipping (1.0) and a ReduceLROnPlateau schedule;
the lower 50% of backbone layers were frozen (freeze_percent=0.5), leaving 394 M / 814 M parameters trainable.
Full hyperparameters are in the manuscript and its accompanying Supplementary Information (SI).
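The normalized target above can be inverted to recover an IC50 in nM from a predicted score. The helpers below are a small sketch of that arithmetic; the function names are hypothetical, the natural logarithm is used (the base cancels in the ratio), and clipping to [0, 1] is an assumption about how out-of-range measurements are handled.

```python
import math

MAX_IC50_NM = 50_000.0  # upper bound used in the log50k normalization


def ic50_to_score(ic50_nm: float) -> float:
    """Map IC50 (nM) to 1 - log(IC50) / log(50000), clipped to [0, 1] (clipping is assumed)."""
    score = 1.0 - math.log(ic50_nm) / math.log(MAX_IC50_NM)
    return min(1.0, max(0.0, score))


def score_to_ic50(score: float) -> float:
    """Invert the normalization: IC50 = 50000 ** (1 - score)."""
    return MAX_IC50_NM ** (1.0 - score)


score_500 = ic50_to_score(500.0)     # about 0.426
ic50_back = score_to_ic50(score_500)  # about 500 nM
```

Under this mapping, the 500 nM cutoff commonly used as a binder threshold corresponds to a score of roughly 0.426, so scores above that value indicate a predicted IC50 below 500 nM.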
Citation
@article{dkarthikeyan2026stability,
author = {Karthikeyan, Dhuvarakesh and Vincent, Benjamin and Rubinsteyn, Alexander},
title = {Peptide:MHC Binding Stability Prediction Using Protein Language Models},
elocation-id = {2026.06.28.735023},
year = {2026},
doi = {10.64898/2026.06.28.735023},
publisher = {Cold Spring Harbor Laboratory},
abstract = {Peptide:MHC class I (pMHC-I) binding stability governs the persistence of antigenic complexes at the cell surface and plays a key role in facilitating downstream immunological signals such as antigen presentation, T-cell activation, and immunodominance. However, methods for in silico stability prediction remain underexplored relative to binding affinity prediction, in part because available half-life datasets are sparse and expensive to collect. Here, we perform a systematic reassessment of pMHC-I stability prediction using controlled, similarity-aware data splits and apply a recently introduced supervised transfer-learning strategy to MINT, an interaction-aware protein language model, pretrained on binding affinity and fine-tuned for quantitative half-life prediction. We show that MINT improves stability prediction over standard ESM-2 representations and existing predictors, and that assay-conditioned recalibration corrects systematic shifts across experimental measurement modalities. Across eluted ligand, immunogenicity, and personalized neoantigen prioritization benchmarks, predicted stability provides signal beyond binding affinity, enriching for naturally presented and immunogenic peptides within affinity-filtered candidate sets. These results establish pMHC-I half-life as an orthogonal and transferable biophysical signal connecting peptide binding, surface presentation, and T-cell recognition, and provide a leakage-aware, assay-aware framework for future antigen-presentation modeling. Competing Interest Statement: The authors have declared no competing interest.},
URL = {https://www.biorxiv.org/content/early/2026/06/29/2026.06.28.735023},
eprint = {https://www.biorxiv.org/content/early/2026/06/29/2026.06.28.735023.full.pdf},
journal = {bioRxiv}
}
License
MIT License. See the MINT repository for the original codebase.