Peptide Stability Predictor

Predict thermal stability of peptide/protein sequences using ESM-2 embeddings.

Model Description

This model predicts the thermal stability (melting temperature proxy) of peptide and protein sequences using frozen ESM-2 embeddings passed through a trained MLP regression head. It was trained on the FLIP Meltome benchmark dataset.

Architecture

Component	Details
Backbone	ESM-2 (esm2_t6_8M_UR50D, 8M parameters, frozen)
Embedding dim	320
MLP Head	Linear(320→256) → ReLU → Dropout(0.1) → Linear(256→128) → ReLU → Dropout(0.1) → Linear(128→1)
Output	Normalized stability score
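
As a quick sanity check on the size of the trainable head, the layer widths in the table above can be turned into a parameter count. This is a minimal sketch in plain Python; it assumes every Linear layer carries a bias term (the PyTorch default), which the table does not state. ReLU and Dropout add no parameters.

```python
# Layer widths taken from the Architecture table: 320 -> 256 -> 128 -> 1.
layer_dims = [320, 256, 128, 1]


def linear_param_count(in_features: int, out_features: int) -> int:
    """Weights plus biases of one fully connected layer (bias assumed present)."""
    return in_features * out_features + out_features


# Sum over consecutive (input width, output width) pairs.
head_params = sum(
    linear_param_count(n_in, n_out)
    for n_in, n_out in zip(layer_dims, layer_dims[1:])
)
print(head_params)  # 115201
```

Under that assumption the head has roughly 115k trainable parameters, small next to the 8M frozen backbone parameters.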

Training Details

Property	Value
Dataset	FLIP Meltome benchmark
Validation R²	0.616
Epochs	16 (stopped early; maximum 30)
Learning rate	1e-3
Batch size	8
Dropout	0.1
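
For readers unfamiliar with the metric, the validation R² above is the coefficient of determination. The sketch below shows the standard definition in plain Python; the function name and the toy values are illustrative, and the card does not state which implementation was used during training.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean target.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Toy values for illustration only, not taken from the Meltome data.
y_true = [0.2, 0.4, 0.6, 0.8]
y_pred = [0.25, 0.35, 0.65, 0.75]
print(round(r2_score(y_true, y_pred), 3))  # 0.95
```

An R² of 1 means perfect predictions, and 0 means the model does no better than predicting the mean target, so 0.616 indicates the model explains about 62% of the variance in the validation targets.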

Quick Start

Requirements

pip install torch fair-esm huggingface_hub

Usage

import torch
from huggingface_hub import hf_hub_download

# Download model checkpoint
checkpoint_path = hf_hub_download(
    repo_id="littleworth/peptide-stability-predictor",
    filename="stability_predictor.pt"
)

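# Load checkpoint. weights_only=False unpickles arbitrary Python objects,
# so only load checkpoints from a source you trust.
checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)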

# Download model class
model_file = hf_hub_download(
    repo_id="littleworth/peptide-stability-predictor",
    filename="stability_predictor.py"
)

# Import model class
import importlib.util
spec = importlib.util.spec_from_file_location("stability_predictor", model_file)
sp_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(sp_module)
StabilityPredictor = sp_module.StabilityPredictor

# Initialize model (this will download ESM-2 on first run)
model = StabilityPredictor(esm_model="esm2_t6_8M_UR50D")

# Load trained weights. Only the MLP head is loaded; the ESM-2 backbone stays frozen.
# Keep the "head." entries and strip that prefix so the keys match model.head.
prefix = "head."
head_state_dict = {
    k[len(prefix):]: v
    for k, v in checkpoint["model_state_dict"].items()
    if k.startswith(prefix)
}
model.head.load_state_dict(head_state_dict)
model.eval()

# Predict stability
sequences = [
    "MKTLYFLGASV",
    "AEITVKLSPGMNCF",
    "GFLWKASTDERIPMNCVYH",
]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

with torch.no_grad():
    scores = model(sequences)

print("Stability predictions:")
for seq, score in zip(sequences, scores.tolist()):
    print(f"  {seq}: {score:.4f}")

Alternative: Using predict() method

# Using the convenience method (returns Python list)
scores = model.predict(sequences)
print(scores)  # [0.7234, 0.6521, 0.5892]

Example Output

Stability predictions:
  MKTLYFLGASV: 0.7234
  AEITVKLSPGMNCF: 0.6521
  GFLWKASTDERIPMNCVYH: 0.5892
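
Before scoring user-supplied sequences, it can help to normalise and screen them. The helper below is a hypothetical sketch, not part of this repository. Restricting input to the 20 standard amino acids is an assumption made here for simplicity; the ESM-2 tokenizer also has tokens for some ambiguity codes such as X.

```python
# The 20 standard amino acids in one-letter code.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequences(raw_sequences):
    """Return (valid, rejected) lists after trimming whitespace and upper-casing."""
    valid, rejected = [], []
    for raw in raw_sequences:
        seq = raw.strip().upper()
        # Accept only non-empty sequences made entirely of standard residues.
        if seq and set(seq) <= STANDARD_AMINO_ACIDS:
            valid.append(seq)
        else:
            rejected.append(raw)
    return valid, rejected


valid, rejected = clean_sequences(["MKTLYFLGASV", " aeitvklspgmncf ", "MKT1LY", ""])
print(valid)     # ['MKTLYFLGASV', 'AEITVKLSPGMNCF']
print(rejected)  # ['MKT1LY', '']
```

The `valid` list can then be passed to `model.predict(...)` as in the usage example.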

Files in This Repository

File	Description
`stability_predictor.pt`	Model checkpoint (MLP head weights)
`stability_predictor.py`	Model architecture definition
`config.json`	Model configuration

Checkpoint Contents

{
    'epoch': 16,
    'model_state_dict': {...},  # MLP head weights
    'optimizer_state_dict': {...},
    'val_r2': 0.616,
    'config': {
        'esm_model': 'esm2_t6_8M_UR50D',
        'hidden_dims': [256, 128],
        'dropout': 0.1
    }
}
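
The layout above can be checked before loading weights. The following is a minimal sketch in plain Python that operates on a stand-in dictionary mirroring the documented keys; the helper names are hypothetical, and the input width of 320 and output width of 1 come from the Architecture table rather than from the checkpoint itself.

```python
EXPECTED_KEYS = {"epoch", "model_state_dict", "optimizer_state_dict", "val_r2", "config"}


def missing_checkpoint_keys(checkpoint: dict) -> list:
    """Return the documented top-level keys that are absent, sorted by name."""
    return sorted(EXPECTED_KEYS - checkpoint.keys())


def head_layer_dims(config: dict, embedding_dim: int = 320, output_dim: int = 1) -> list:
    """Layer widths of the MLP head implied by the stored config."""
    return [embedding_dim, *config["hidden_dims"], output_dim]


# Stand-in for the dict returned by torch.load; the state dicts are left empty here.
example_checkpoint = {
    "epoch": 16,
    "model_state_dict": {},
    "optimizer_state_dict": {},
    "val_r2": 0.616,
    "config": {"esm_model": "esm2_t6_8M_UR50D", "hidden_dims": [256, 128], "dropout": 0.1},
}

print(missing_checkpoint_keys(example_checkpoint))     # []
print(head_layer_dims(example_checkpoint["config"]))   # [320, 256, 128, 1]
```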

Intended Use

Primary use: Scoring peptide/protein stability for drug discovery
Secondary uses:
- Filtering generated peptide candidates
- Research on protein thermostability
- Feature engineering for downstream ML models
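
Filtering generated candidates, one of the secondary uses listed above, can be sketched as a threshold plus ranking step over the predicted scores. The function name, the 0.6 cutoff, and the `top_k` parameter are illustrative assumptions; the scores are those from the Example Output section.

```python
def filter_candidates(sequences, scores, min_score, top_k=None):
    """Keep (sequence, score) pairs with score >= min_score, highest score first."""
    ranked = sorted(zip(sequences, scores), key=lambda pair: pair[1], reverse=True)
    kept = [(seq, score) for seq, score in ranked if score >= min_score]
    # Optionally cap the number of candidates returned.
    return kept if top_k is None else kept[:top_k]


sequences = ["MKTLYFLGASV", "AEITVKLSPGMNCF", "GFLWKASTDERIPMNCVYH"]
scores = [0.7234, 0.6521, 0.5892]  # e.g. the list returned by model.predict(sequences)

print(filter_candidates(sequences, scores, min_score=0.6))
# [('MKTLYFLGASV', 0.7234), ('AEITVKLSPGMNCF', 0.6521)]
```

Because the scores are normalized rather than absolute melting temperatures, a cutoff is only meaningful relative to other scores from the same model, so any threshold should be calibrated on sequences with known behaviour.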

Limitations

- Trained on FLIP Meltome data, so predictions may not generalize to all protein families
- Outputs normalized scores, not absolute melting temperatures
- Predictions are computational estimates that require experimental validation
- Accuracy is best for sequences similar to the training distribution

Performance

Metric	Value
Validation R²	0.616
Training epochs	16
Early stopping patience	15
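
For context on the patience setting, the sketch below shows one common form of patience-based early stopping on a validation metric where higher is better. It is a generic illustration with hypothetical names and toy scores; how the original training loop applied patience is not stated in this card.

```python
class EarlyStopping:
    """Stop when the validation score has not improved for `patience` epochs in a row."""

    def __init__(self, patience: int, min_delta: float = 0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best_score = None
        self.best_epoch = None
        self.epochs_without_improvement = 0

    def step(self, epoch: int, score: float) -> bool:
        """Record one validation score; return True when training should stop."""
        if self.best_score is None or score > self.best_score + self.min_delta:
            self.best_score = score
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Toy validation scores with a small patience so the behaviour is easy to follow.
stopper = EarlyStopping(patience=2)
toy_val_r2 = [0.50, 0.60, 0.59, 0.58, 0.61]
stopped_at = None
for epoch, score in enumerate(toy_val_r2, start=1):
    if stopper.step(epoch, score):
        stopped_at = epoch
        break

print(stopped_at, stopper.best_epoch)  # 4 2
```

In this toy run the best score occurs at epoch 2, and training stops at epoch 4 after two epochs without improvement, so the final 0.61 is never seen.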

Dependencies

- PyTorch >= 2.0
- fair-esm (Facebook's ESM library)
- huggingface_hub

Ethical Considerations

This model provides computational predictions of protein stability. Predictions should be validated experimentally before making decisions about therapeutic development. The model does not guarantee accuracy for sequences outside its training distribution.

Training Data

- FLIP Meltome benchmark: a dataset of protein sequences with measured thermal stability values
- Training/validation split follows the FLIP benchmark protocols

Citation

@software{peptide_stability_2025,
  author = {Wijaya, Edward},
  title = {Peptide Stability Predictor},
  year = {2025},
  url = {https://huggingface.co/littleworth/peptide-stability-predictor},
  note = {ESM-2 based thermal stability prediction}
}

References

FLIP Benchmark - Dallago et al., 2021
ESM-2 - Lin et al., 2022
ESM-2 Paper - Lin et al., Science 2023

License

MIT License
