Peptide Stability Predictor
Predict thermal stability of peptide/protein sequences using ESM-2 embeddings.
Model Description
This model predicts the thermal stability (melting temperature proxy) of peptide and protein sequences using frozen ESM-2 embeddings passed through a trained MLP regression head. It was trained on the FLIP Meltome benchmark dataset.
Architecture
| Component | Details |
|---|---|
| Backbone | ESM-2 (esm2_t6_8M_UR50D, 8M parameters, frozen) |
| Embedding dim | 320 |
| MLP Head | Linear(320→256) → ReLU → Dropout(0.1) → Linear(256→128) → ReLU → Dropout(0.1) → Linear(128→1) |
| Output | Normalized stability score |
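As a quick sanity check on the table above, the size of the trainable head can be worked out from the layer widths alone. This is a minimal sketch that assumes each `Linear` layer carries a bias term (the PyTorch default); the card does not state this, and the helper name `linear_param_count` is illustrative only. ReLU and Dropout contribute no parameters.

```python
# Layer widths taken from the architecture table: 320 -> 256 -> 128 -> 1.
layer_dims = [320, 256, 128, 1]


def linear_param_count(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer (bias assumed present)."""
    return in_features * out_features + out_features


# Sum over consecutive (input, output) width pairs.
head_params = sum(
    linear_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_dims, layer_dims[1:])
)
print(head_params)  # 115201
```

Under that assumption the head has roughly 0.12M trainable parameters, small next to the 8M frozen parameters of the backbone.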
Training Details
| Property | Value |
|---|---|
| Dataset | FLIP Meltome benchmark |
| Validation R² | 0.616 |
| Epochs | 16 (stopped early; maximum 30) |
| Learning rate | 1e-3 |
| Batch size | 8 |
| Dropout | 0.1 |
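The training script is not part of this card, so the exact stopping rule is unknown. The sketch below shows one common patience-based early stopping rule for a higher-is-better validation metric such as R². The function name, the toy score list, and the rule itself are assumptions for illustration, not the author's implementation.

```python
def early_stopping_epoch(val_scores, patience):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops once `patience` epochs have passed without the
    validation score improving on the best value seen so far.
    """
    best_score = float("-inf")
    best_epoch = 0
    for epoch, score in enumerate(val_scores, start=1):
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    # Ran out of epochs before the patience budget was exhausted.
    return len(val_scores), best_epoch


# Hypothetical validation scores, one per epoch.
toy_scores = [0.10, 0.30, 0.20, 0.25, 0.28]
print(early_stopping_epoch(toy_scores, patience=2))
```

With this rule the checkpoint worth keeping is the one from `best_epoch`, not from the epoch at which training stopped.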
Quick Start
Requirements
pip install torch fair-esm huggingface_hub
Usage
import torch
from huggingface_hub import hf_hub_download
# Download model checkpoint
checkpoint_path = hf_hub_download(
    repo_id="littleworth/peptide-stability-predictor",
    filename="stability_predictor.pt",
)
# Load checkpoint
checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)
# Download model class
model_file = hf_hub_download(
    repo_id="littleworth/peptide-stability-predictor",
    filename="stability_predictor.py",
)
# Import model class
import importlib.util
spec = importlib.util.spec_from_file_location("stability_predictor", model_file)
sp_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(sp_module)
StabilityPredictor = sp_module.StabilityPredictor
# Initialize model (this will download ESM-2 on first run)
model = StabilityPredictor(esm_model="esm2_t6_8M_UR50D")
# Load trained weights (only the MLP head, ESM-2 is frozen)
# Filter to only load head weights
head_state_dict = {k: v for k, v in checkpoint['model_state_dict'].items()
                   if k.startswith('head.')}
model.head.load_state_dict({k.replace('head.', ''): v for k, v in head_state_dict.items()})
model.eval()
# Predict stability
sequences = [
    "MKTLYFLGASV",
    "AEITVKLSPGMNCF",
    "GFLWKASTDERIPMNCVYH",
]
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)
with torch.no_grad():
    scores = model(sequences)
print("Stability predictions:")
for seq, score in zip(sequences, scores.tolist()):
    print(f" {seq}: {score:.4f}")
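The head-weight filtering step in the script above can be isolated into a small helper. The sketch below slices the prefix off the front of each key instead of calling `str.replace`, which would also rewrite any later occurrence of `head.` inside a key. The helper name and the toy key names are hypothetical; the real key names depend on how `StabilityPredictor` defines its layers.

```python
def extract_head_state(state_dict, prefix="head."):
    """Keep entries under `prefix` and strip it from the front of each key only."""
    return {
        key[len(prefix):]: value
        for key, value in state_dict.items()
        if key.startswith(prefix)
    }


# Toy stand-in for checkpoint['model_state_dict']; real values would be tensors.
toy_state = {
    "head.0.weight": "w0",
    "head.0.bias": "b0",
    "esm.embed_tokens.weight": "skipped",
}
print(extract_head_state(toy_state))
```

In the script above, the result would be passed on as `model.head.load_state_dict(extract_head_state(checkpoint['model_state_dict']))`.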
Alternative: Using predict() method
# Using the convenience method (returns Python list)
scores = model.predict(sequences)
print(scores) # [0.7234, 0.6521, 0.5892]
Example Output
Stability predictions:
MKTLYFLGASV: 0.7234
AEITVKLSPGMNCF: 0.6521
GFLWKASTDERIPMNCVYH: 0.5892
Files in This Repository
| File | Description |
|---|---|
| stability_predictor.pt | Model checkpoint (MLP head weights) |
| stability_predictor.py | Model architecture definition |
| config.json | Model configuration |
Checkpoint Contents
{
    'epoch': 16,
    'model_state_dict': {...},  # MLP head weights
    'optimizer_state_dict': {...},
    'val_r2': 0.616,
    'config': {
        'esm_model': 'esm2_t6_8M_UR50D',
        'hidden_dims': [256, 128],
        'dropout': 0.1
    }
}
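Before loading weights, it can help to confirm that a downloaded checkpoint has the layout listed above. This is a minimal sketch using plain dictionaries; the helper name `missing_checkpoint_keys` is illustrative, and the expected keys are copied from the listing rather than from the repository code.

```python
# Keys copied from the checkpoint layout shown above.
EXPECTED_KEYS = {"epoch", "model_state_dict", "optimizer_state_dict", "val_r2", "config"}
EXPECTED_CONFIG_KEYS = {"esm_model", "hidden_dims", "dropout"}


def missing_checkpoint_keys(checkpoint):
    """Return a sorted list of expected keys that the checkpoint lacks."""
    missing = sorted(EXPECTED_KEYS - set(checkpoint))
    config = checkpoint.get("config", {})
    missing += sorted(f"config.{key}" for key in EXPECTED_CONFIG_KEYS - set(config))
    return missing


# Toy checkpoint with one top-level key and one config key left out.
toy_checkpoint = {
    "epoch": 16,
    "model_state_dict": {},
    "val_r2": 0.616,
    "config": {"esm_model": "esm2_t6_8M_UR50D", "hidden_dims": [256, 128]},
}
print(missing_checkpoint_keys(toy_checkpoint))
```

An empty list means every expected key is present.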
Intended Use
- Primary use: Scoring peptide/protein stability for drug discovery
- Secondary uses:
  - Filtering generated peptide candidates
  - Research on protein thermostability
  - Feature engineering for downstream ML models
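For the candidate-filtering use case, predicted scores can be used to rank and threshold a batch of sequences. The sketch below is one way to do that in plain Python; the function name `rank_candidates` and the 0.6 cutoff are arbitrary choices for illustration, and the scores are the example values shown earlier in this card.

```python
def rank_candidates(sequences, scores, min_score=None, top_k=None):
    """Pair sequences with scores, sort high to low, then filter and truncate."""
    if len(sequences) != len(scores):
        raise ValueError("sequences and scores must have the same length")
    ranked = sorted(zip(sequences, scores), key=lambda pair: pair[1], reverse=True)
    if min_score is not None:
        ranked = [pair for pair in ranked if pair[1] >= min_score]
    return ranked if top_k is None else ranked[:top_k]


sequences = ["MKTLYFLGASV", "AEITVKLSPGMNCF", "GFLWKASTDERIPMNCVYH"]
scores = [0.7234, 0.6521, 0.5892]  # example values from this card

# Keep candidates scoring at least 0.6 (an arbitrary cutoff).
for seq, score in rank_candidates(sequences, scores, min_score=0.6):
    print(f"{seq}: {score:.4f}")
```

Because the scores are normalized, a cutoff is only meaningful relative to other sequences scored by the same model.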
Limitations
- Trained on FLIP Meltome data, which may not generalize to all protein families
- Outputs are normalized scores, not absolute melting temperatures
- Predictions are computational estimates and require experimental validation
- Accuracy is expected to be best for sequences similar to the training distribution
Performance
| Metric | Value |
|---|---|
| Validation R² | 0.616 |
| Training epochs | 16 |
| Early stopping patience | 15 |
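For reference, R² is conventionally the coefficient of determination, 1 − SS_res / SS_tot. The card does not say which implementation produced the 0.616 figure, so the sketch below shows only the standard definition on toy values.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Toy values: one prediction is off by 1, so SS_res = 1 and SS_tot = 2.
print(r2_score([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # 0.5
```

A score of 1.0 means perfect predictions, and 0.0 matches always predicting the mean of the targets.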
Dependencies
- PyTorch >= 2.0
- fair-esm (Facebook's ESM library)
- huggingface_hub
Ethical Considerations
This model provides computational predictions of protein stability. Predictions should be validated experimentally before making decisions about therapeutic development. The model does not guarantee accuracy for sequences outside its training distribution.
Training Data
- FLIP Meltome benchmark: A dataset of protein sequences with measured thermal stability values
- Training/validation split following FLIP benchmark protocols
Citation
@software{peptide_stability_2025,
  author = {Wijaya, Edward},
  title = {Peptide Stability Predictor},
  year = {2025},
  url = {https://huggingface.co/littleworth/peptide-stability-predictor},
  note = {ESM-2 based thermal stability prediction}
}
References
- FLIP Benchmark - Dallago et al., 2021
- ESM-2 - Lin et al., 2022
- ESM-2 Paper - Lin et al., Science 2023
License
MIT License