---
license: mit
tags:
- METL
- biology
- protein
---

# METL

Mutational Effect Transfer Learning (METL) is a framework for pretraining and finetuning biophysics-informed protein language models.

## Model Details

This repository contains a wrapper that simplifies the use of METL models.
Usage instructions are provided below.
Models are hosted on [Zenodo](https://zenodo.org/doi/10.5281/zenodo.11051644) and are downloaded by the wrapper when used.

### Model Description

The [paper](https://doi.org/10.1038/s41592-025-02776-2) discusses METL in further detail.
The GitHub [repo](https://github.com/gitter-lab/metl) contains more documentation and includes scripts for training and predicting with METL.
Google Colab notebooks for finetuning and predicting with publicly available METL models are available [here](https://github.com/gitter-lab/metl/tree/main/notebooks).

### Model Sources

- **Repository:** [METL repo](https://github.com/gitter-lab/metl)
- **Paper:** [METL publication](https://doi.org/10.1038/s41592-025-02776-2)
- **Demo:** [Hugging Face Spaces demo](https://huggingface.co/spaces/gitter-lab/METL_demo)

## How to Get Started with the Model

Use the code below to get started with the model.

Running METL requires the following packages, plus PyTorch, which the example below imports as `torch`:
```
transformers==4.42.4
numpy>=1.23.2
networkx>=2.6.3
scipy>=1.9.1
biopandas>=0.2.7
```

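
Before running the example, it can help to confirm which of these packages are present in the current environment. The following is a minimal sketch using only the Python standard library; `installed_versions` is a hypothetical helper written for this illustration and is not part of METL. It only reports installed versions and does not compare them against the constraints listed above.

```python
from importlib import metadata

# Package names copied from the requirements list above.
# "torch" is added because the usage example imports it.
REQUIRED = ["transformers", "numpy", "networkx", "scipy", "biopandas", "torch"]


def installed_versions(packages):
    """Return {package name: installed version string, or None if missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```
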

To run the example, download a PDB file of the GB1 protein structure.
It is available [here](https://github.com/gitter-lab/metl-pretrained/blob/main/pdbs/2qmt_p.pdb) and in raw format [here](https://raw.githubusercontent.com/gitter-lab/metl-pretrained/main/pdbs/2qmt_p.pdb).
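
The download can also be scripted. The sketch below fetches the raw file with the standard library and skips the download if the file already exists; `fetch_pdb` and its default destination are hypothetical names chosen for this illustration, and only the URL comes from the link above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Raw-format URL of the GB1 structure, taken from the link above.
PDB_URL = "https://raw.githubusercontent.com/gitter-lab/metl-pretrained/main/pdbs/2qmt_p.pdb"


def fetch_pdb(url=PDB_URL, dest="2qmt_p.pdb"):
    """Download `url` to `dest` unless `dest` already exists; return the path."""
    dest = Path(dest)
    if not dest.exists():
        urlretrieve(url, str(dest))
    return dest
```

Calling `fetch_pdb()` from the directory that holds the script places `2qmt_p.pdb` next to it.
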

After installing the packages and downloading the PDB file, run METL with the following example, which assumes the PDB file is in the same directory as the script:

```python
from transformers import AutoModel
import torch

# trust_remote_code is required because the wrapper code lives in the model repository
metl = AutoModel.from_pretrained('gitter-lab/METL', trust_remote_code=True)

model_id = "metl-l-2m-3d-gb1"  # identifier of the METL model to load
wt = "MQYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTE"  # wild-type GB1 sequence
variants = ["T17P,T54F", "V28L,F51A"]  # each variant is a comma-separated list of substitutions
pdb_path = './2qmt_p.pdb'  # GB1 structure file downloaded above

# Load the selected model into the wrapper and switch to inference mode
metl.load_from_ident(model_id)
metl.eval()

# Encode the variants relative to the wild-type sequence
encoded_variants = metl.encoder.encode_variants(wt, variants)

with torch.no_grad():
    predictions = metl(torch.tensor(encoded_variants), pdb_fn=pdb_path)
```

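
A mistyped variant string is easy to miss, so a quick check that each substitution's wild-type residue matches the sequence can be useful before encoding. The sketch below is an illustration only: `parse_variant` and `variant_matches_wt` are hypothetical helpers, not part of the METL wrapper. It assumes each substitution is written as wild-type residue, position, mutant residue, and that positions are 0-based. The 0-based assumption is inferred from the example, where `T54` and `F51` match the GB1 sequence only when counting from zero.

```python
import re

# One substitution: wild-type residue, position, mutant residue (e.g. "T17P").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_variant(variant):
    """Split a string like "T17P,T54F" into (wt residue, position, mutant) tuples."""
    parsed = []
    for token in variant.split(","):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"Unrecognized substitution: {token!r}")
        wt_aa, pos, mut_aa = match.groups()
        parsed.append((wt_aa, int(pos), mut_aa))
    return parsed


def variant_matches_wt(wt, variant, offset=0):
    """Return True if every substitution's wild-type residue agrees with `wt`.

    `offset` is the number given to the first residue (0 by assumption here).
    """
    for wt_aa, pos, _ in parse_variant(variant):
        idx = pos - offset
        if not 0 <= idx < len(wt) or wt[idx] != wt_aa:
            return False
    return True
```

For the example above, `all(variant_matches_wt(wt, v) for v in variants)` can be evaluated before calling `encode_variants`.
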

## Citation

[Biophysics-based protein language models for protein engineering](https://doi.org/10.1038/s41592-025-02776-2).
Sam Gelman, Bryce Johnson, Chase R Freschlin, Arnav Sharma, Sameer D'Costa, John Peters, Anthony Gitter<sup>+</sup>, Philip A Romero<sup>+</sup>.
*Nature Methods* 22, 2025.
<sup>+</sup> denotes equal contribution.

## Model Card Contact

For questions and comments about METL, the best way to get in touch is to open a GitHub issue in the [METL repository](https://github.com/gitter-lab/metl/issues).