Peptide Stability Predictor

Predict thermal stability of peptide/protein sequences using ESM-2 embeddings.

Model Description

This model predicts the thermal stability (melting temperature proxy) of peptide and protein sequences using frozen ESM-2 embeddings passed through a trained MLP regression head. It was trained on the FLIP Meltome benchmark dataset.

Architecture

| Component | Details |
| --- | --- |
| Backbone | ESM-2 (esm2_t6_8M_UR50D, 8M parameters, frozen) |
| Embedding dim | 320 |
| MLP head | Linear(320→256) → ReLU → Dropout(0.1) → Linear(256→128) → ReLU → Dropout(0.1) → Linear(128→1) |
| Output | Normalized stability score |
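
For orientation, the head listed above can be written as a small PyTorch module. This is a sketch reconstructed from the layer sizes alone; the actual definition lives in stability_predictor.py and may differ in naming or layout. The helper name `build_head` is hypothetical, and the random input only stands in for one 320-dimensional ESM-2 vector per sequence (how the backbone output is reduced to that vector is not stated here).

```python
import torch
from torch import nn


def build_head(embed_dim: int = 320, dropout: float = 0.1) -> nn.Sequential:
    """Hypothetical helper: an MLP with the layer sizes from the architecture table."""
    return nn.Sequential(
        nn.Linear(embed_dim, 256),
        nn.ReLU(),
        nn.Dropout(dropout),
        nn.Linear(256, 128),
        nn.ReLU(),
        nn.Dropout(dropout),
        nn.Linear(128, 1),
    )


head = build_head().eval()        # eval() disables dropout for inference
embeddings = torch.randn(4, 320)  # stand-in for four per-sequence embeddings
with torch.no_grad():
    out = head(embeddings)        # one scalar score per sequence, shape (4, 1)
```

Only this head is trained; the backbone stays frozen, so the checkpoint can stay small.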

Training Details

| Property | Value |
| --- | --- |
| Dataset | FLIP Meltome benchmark |
| Validation R² | 0.616 |
| Epochs | 16 (stopped early; maximum 30) |
| Learning rate | 1e-3 |
| Batch size | 8 |
| Dropout | 0.1 |
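
The validation metric is the coefficient of determination. As a reminder of what a value such as 0.616 measures, here is a minimal pure-Python version of the standard definition, R² = 1 - SS_res / SS_tot. Whether the training script computed it this way or through a library is not stated, so treat the function below as illustrative.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Toy values, not taken from the Meltome data.
toy_r2 = r2_score([1.0, 2.0, 3.0, 4.0], [1.5, 2.0, 2.5, 4.0])
```

A score of 1.0 means perfect prediction, and 0.0 means the model does no better than always predicting the mean of the targets.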

Quick Start

Requirements

pip install torch fair-esm huggingface_hub

Usage

import torch
from huggingface_hub import hf_hub_download

# Download model checkpoint
checkpoint_path = hf_hub_download(
    repo_id="littleworth/peptide-stability-predictor",
    filename="stability_predictor.pt"
)

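# Load checkpoint. weights_only=False unpickles arbitrary Python objects,
# so only load checkpoints from a source you trust.
checkpoint = torch.load(checkpoint_path, map_location="cpu", weights_only=False)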

# Download model class
model_file = hf_hub_download(
    repo_id="littleworth/peptide-stability-predictor",
    filename="stability_predictor.py"
)

# Import model class
import importlib.util
spec = importlib.util.spec_from_file_location("stability_predictor", model_file)
sp_module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(sp_module)
StabilityPredictor = sp_module.StabilityPredictor

# Initialize model (this will download ESM-2 on first run)
model = StabilityPredictor(esm_model="esm2_t6_8M_UR50D")

# Load trained weights. Only the MLP head is restored; ESM-2 stays frozen.
# Keep the "head." entries and strip that prefix so the keys match model.head.
prefix = "head."
head_state_dict = {
    k[len(prefix):]: v
    for k, v in checkpoint["model_state_dict"].items()
    if k.startswith(prefix)
}
model.head.load_state_dict(head_state_dict)
model.eval()

# Predict stability
sequences = [
    "MKTLYFLGASV",
    "AEITVKLSPGMNCF",
    "GFLWKASTDERIPMNCVYH",
]

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = model.to(device)

with torch.no_grad():
    scores = model(sequences)

print("Stability predictions:")
for seq, score in zip(sequences, scores.tolist()):
    print(f"  {seq}: {score:.4f}")

Alternative: using the predict() method

# Using the convenience method (returns Python list)
scores = model.predict(sequences)
print(scores)  # [0.7234, 0.6521, 0.5892]
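
For longer candidate lists it may help to score in chunks so that memory use stays bounded. The sketch below wraps any callable that maps a list of sequences to a list of floats, as predict() is described to do above. The helper name `predict_in_batches`, the default chunk size of 8 (mirroring the training batch size), and the stand-in `fake_predict` are all illustrative and not part of the repository.

```python
from typing import Callable, List, Sequence


def predict_in_batches(
    predict_fn: Callable[[List[str]], List[float]],
    sequences: Sequence[str],
    batch_size: int = 8,
) -> List[float]:
    """Score sequences chunk by chunk, preserving input order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    scores: List[float] = []
    for start in range(0, len(sequences), batch_size):
        batch = list(sequences[start:start + batch_size])
        scores.extend(predict_fn(batch))
    return scores


def fake_predict(batch: List[str]) -> List[float]:
    """Stand-in predictor so the sketch runs without downloading the model."""
    return [len(seq) / 100 for seq in batch]


demo_scores = predict_in_batches(fake_predict, ["MKTLYFLGASV", "AEITVKLSPGMNCF"], batch_size=1)
```

With the real model, passing `model.predict` as `predict_fn` should work under the assumption that it returns one float per input sequence.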

Example Output

Stability predictions:
  MKTLYFLGASV: 0.7234
  AEITVKLSPGMNCF: 0.6521
  GFLWKASTDERIPMNCVYH: 0.5892

Files in This Repository

| File | Description |
| --- | --- |
| stability_predictor.pt | Model checkpoint (MLP head weights) |
| stability_predictor.py | Model architecture definition |
| config.json | Model configuration |

Checkpoint Contents

{
    'epoch': 16,
    'model_state_dict': {...},  # MLP head weights
    'optimizer_state_dict': {...},
    'val_r2': 0.616,
    'config': {
        'esm_model': 'esm2_t6_8M_UR50D',
        'hidden_dims': [256, 128],
        'dropout': 0.1
    }
}
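
The `config` entry carries enough information to recover the head's layer shapes. The sketch below derives them from `hidden_dims`, assuming an input width of 320 (the ESM-2 embedding dimension from the architecture table, which is not stored in the config shown above) and a single output unit. Both function names are hypothetical.

```python
from typing import Dict, List, Tuple


def head_layer_shapes(
    config: Dict, embed_dim: int = 320, out_dim: int = 1
) -> List[Tuple[int, int]]:
    """Return (in_features, out_features) for each Linear layer in the head."""
    dims = [embed_dim, *config["hidden_dims"], out_dim]
    return list(zip(dims[:-1], dims[1:]))


def count_head_parameters(shapes: List[Tuple[int, int]]) -> int:
    """Weights plus biases across all Linear layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in shapes)


config = {"esm_model": "esm2_t6_8M_UR50D", "hidden_dims": [256, 128], "dropout": 0.1}
shapes = head_layer_shapes(config)
n_params = count_head_parameters(shapes)
```

Comparing these shapes against the tensors in `model_state_dict` is a quick sanity check before loading weights.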

Intended Use

  • Primary use: Scoring peptide/protein stability for drug discovery
  • Secondary uses:
    • Filtering generated peptide candidates
    • Research on protein thermostability
    • Feature engineering for downstream ML models
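
As an illustration of the candidate-filtering use case, the sketch below keeps sequences whose score clears a cutoff and ranks them from highest to lowest. The function name `filter_candidates` and the cutoff of 0.6 are arbitrary choices for the example; because the scores are normalized, a sensible threshold has to be chosen empirically for each project.

```python
from typing import List, Optional, Sequence, Tuple


def filter_candidates(
    sequences: Sequence[str],
    scores: Sequence[float],
    min_score: float,
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Keep (sequence, score) pairs with score >= min_score, best first."""
    if len(sequences) != len(scores):
        raise ValueError("sequences and scores must have the same length")
    kept = [(seq, s) for seq, s in zip(sequences, scores) if s >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept if top_k is None else kept[:top_k]


# Scores copied from the example output in this card.
candidates = ["MKTLYFLGASV", "AEITVKLSPGMNCF", "GFLWKASTDERIPMNCVYH"]
candidate_scores = [0.7234, 0.6521, 0.5892]
kept = filter_candidates(candidates, candidate_scores, min_score=0.6)
```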

Limitations

  • Trained only on FLIP Meltome data, so predictions may not generalize to all protein families
  • Outputs normalized scores, not absolute melting temperatures
  • Predictions are computational estimates and require experimental validation
  • Accuracy is expected to be best for sequences similar to the training distribution
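
One inexpensive guard against out-of-distribution input is to screen sequences before scoring. The sketch below flags anything outside the 20 standard amino-acid letters. This is a deliberately conservative choice made for the example, not a documented requirement of the model, and the names `find_invalid_residues` and `split_valid` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

STANDARD_AA = frozenset("ACDEFGHIKLMNPQRSTVWY")


def find_invalid_residues(sequence: str) -> List[str]:
    """Return the sorted set of characters that are not standard amino acids."""
    return sorted({c for c in sequence.upper() if c not in STANDARD_AA})


def split_valid(sequences: Sequence[str]) -> Tuple[List[str], Dict[str, List[str]]]:
    """Separate clean sequences from those with unexpected characters."""
    valid: List[str] = []
    rejected: Dict[str, List[str]] = {}
    for seq in sequences:
        bad = find_invalid_residues(seq)
        if bad or not seq:
            rejected[seq] = bad
        else:
            valid.append(seq)
    return valid, rejected


valid, rejected = split_valid(["MKTLYFLGASV", "MKXB1"])
```

Passing this check says nothing about how close a sequence is to the training data; it only catches malformed input early.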

Performance

| Metric | Value |
| --- | --- |
| Validation R² | 0.616 |
| Training epochs | 16 |
| Early stopping patience | 15 |
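
To make the patience setting concrete, here is a minimal sketch of patience-based early stopping. Only the patience value comes from this card. Monitoring validation R² with higher-is-better semantics and the class name `EarlyStopping` are assumptions, and the real training loop may differ.

```python
class EarlyStopping:
    """Signal a stop after `patience` consecutive epochs without improvement."""

    def __init__(self, patience: int = 15) -> None:
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, val_score: float) -> bool:
        """Record one epoch's validation score; return True when training should stop."""
        if val_score > self.best:
            self.best = val_score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Toy run with a short patience so the behaviour is easy to follow.
stopper = EarlyStopping(patience=2)
decisions = [stopper.step(score) for score in [0.10, 0.20, 0.15, 0.18]]
```

In the toy run the score peaks at the second epoch, and the stop signal fires two epochs later.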

Dependencies

  • PyTorch >= 2.0
  • fair-esm (Facebook's ESM library)
  • huggingface_hub

Ethical Considerations

This model provides computational predictions of protein stability. Predictions should be validated experimentally before making decisions about therapeutic development. The model does not guarantee accuracy for sequences outside its training distribution.

Training Data

  • FLIP Meltome benchmark: A dataset of protein sequences with measured thermal stability values
  • Training/validation split following FLIP benchmark protocols

Citation

@software{peptide_stability_2025,
  author = {Wijaya, Edward},
  title = {Peptide Stability Predictor},
  year = {2025},
  url = {https://huggingface.co/littleworth/peptide-stability-predictor},
  note = {ESM-2 based thermal stability prediction}
}

License

MIT License
