---
license: apache-2.0
datasets:
- Oxer11/Protein-Function-Annotation
language:
- en
tags:
- Protein Language Model
- AI for Drug Discovery
- AI for Science
---
# ESM-S
ESM-S (https://arxiv.org/abs/2402.05856) is a series of structure-informed protein language models trained on remote homology detection tasks to distill structural information.
The corresponding datasets can be downloaded at https://huggingface.co/datasets/Oxer11/Protein-Function-Annotation.
The codebase can be found at https://github.com/DeepGraphLearning/esm-s.

# Evaluation Performance
Freeze the model weights and train a 2-layer MLP on downstream function prediction tasks.
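
The probing setup above can be sketched roughly as follows. This is a minimal illustration, not the authors' training script: the random `embeddings` array stands in for ESM-S representations that would be computed once with the encoder frozen, and the hidden size, learning rate, epoch count, and multi-label binary cross-entropy loss are all assumptions chosen for the example. The helper names `train_mlp_probe` and `predict_proba` are hypothetical.

```python
import numpy as np


def train_mlp_probe(features, labels, hidden_dim=32, lr=0.05, epochs=200, seed=0):
    """Fit a 2-layer MLP (Linear -> ReLU -> Linear) on fixed features.

    The features are never updated, which mirrors training a probe on top
    of a frozen encoder. Loss: multi-label binary cross-entropy (assumed).
    """
    rng = np.random.default_rng(seed)
    n, d = features.shape
    k = labels.shape[1]
    W1 = rng.normal(0.0, np.sqrt(2.0 / d), size=(d, hidden_dim))
    b1 = np.zeros(hidden_dim)
    W2 = rng.normal(0.0, np.sqrt(2.0 / hidden_dim), size=(hidden_dim, k))
    b2 = np.zeros(k)
    losses = []
    eps = 1e-9
    for _ in range(epochs):
        # Forward pass
        h_pre = features @ W1 + b1
        h = np.maximum(h_pre, 0.0)
        logits = np.clip(h @ W2 + b2, -30.0, 30.0)
        probs = 1.0 / (1.0 + np.exp(-logits))
        # Per-sample loss summed over labels, averaged over samples
        bce = labels * np.log(probs + eps) + (1.0 - labels) * np.log(1.0 - probs + eps)
        losses.append(float(-bce.sum(axis=1).mean()))
        # Backward pass (full-batch gradient descent)
        d_logits = (probs - labels) / n
        dW2 = h.T @ d_logits
        db2 = d_logits.sum(axis=0)
        d_h_pre = (d_logits @ W2.T) * (h_pre > 0)
        dW1 = features.T @ d_h_pre
        db1 = d_h_pre.sum(axis=0)
        W1 -= lr * dW1
        b1 -= lr * db1
        W2 -= lr * dW2
        b2 -= lr * db2
    return (W1, b1, W2, b2), losses


def predict_proba(params, features):
    """Return per-label probabilities from the trained probe."""
    W1, b1, W2, b2 = params
    h = np.maximum(features @ W1 + b1, 0.0)
    logits = np.clip(h @ W2 + b2, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-logits))


# Stand-in data: random vectors in place of frozen ESM-S embeddings,
# with synthetic multi-label targets (both are assumptions for the demo).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(200, 16))
true_w = rng.normal(size=(16, 3))
labels = (embeddings @ true_w > 0).astype(float)

params, losses = train_mlp_probe(embeddings, labels)
probs = predict_proba(params, embeddings)
```

Because the encoder is frozen, the embeddings can be computed once and cached, so only the small MLP is trained for each downstream task.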

Use ESM-S representations to retrieve similar proteins for function annotation.
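
A retrieval-based annotator along these lines might look like the sketch below. It assumes cosine similarity over per-protein embeddings and similarity-weighted label voting among the top-k neighbours; the retriever described in the paper may differ in both respects. The function names, the toy 3-dimensional vectors, and the label names are hypothetical placeholders.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def annotate_by_retrieval(query, database, k=2):
    """Score labels for `query` from its k most similar database entries.

    `database` is a list of (embedding, labels) pairs, where `labels` is a
    set of function annotations. Each neighbour votes for its labels with a
    weight equal to its (non-negative) similarity; scores are normalised by
    the total weight so they fall in [0, 1].
    """
    scored = sorted(
        ((cosine_similarity(query, emb), labels) for emb, labels in database),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(max(sim, 0.0) for sim, _ in scored)
    if total == 0.0:
        return {}
    scores = defaultdict(float)
    for sim, labels in scored:
        for label in labels:
            scores[label] += max(sim, 0.0) / total
    return dict(scores)


# Toy database: short vectors stand in for ESM-S embeddings of annotated proteins.
database = [
    ([1.0, 0.0, 0.0], {"label_A"}),
    ([0.9, 0.1, 0.0], {"label_A", "label_B"}),
    ([0.0, 1.0, 0.0], {"label_C"}),
]
query = [1.0, 0.05, 0.0]
scores = annotate_by_retrieval(query, database, k=2)
```

For a large annotated database, the linear scan here would normally be replaced by an approximate nearest-neighbour index, but the voting step stays the same.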

# BibTeX
```
@article{zhang2024structureplm,
title={Structure-Informed Protein Language Model},
author={Zhang, Zuobai and Lu, Jiarui and Chenthamarakshan, Vijil and Lozano, Aurelie and Das, Payel and Tang, Jian},
journal={arXiv preprint arXiv:2402.05856},
year={2024}
}
```