---
license: cc-by-nc-nd-4.0
---

# PepDoRA: A Unified Peptide-Specific Language Model via Weight-Decomposed Low-Rank Adaptation

![image/png](https://cdn-uploads.huggingface.co/production/uploads/64cd5b3f0494187a9e8b7c69/fzsxEjCdBJfKa6T44Tjc8.png)

In this work, we introduce **PepDoRA**, a SMILES-based peptide language model obtained by fine-tuning the state-of-the-art [ChemBERTa-77M-MLM](https://huggingface.co/DeepChem/ChemBERTa-77M-MLM) transformer on modified peptide SMILES via [DoRA](https://nbasyl.github.io/DoRA-project-page/), a parameter-efficient fine-tuning (PEFT) method that decomposes pre-trained weights into magnitude and direction components. The resulting representations can be leveraged for numerous downstream tasks, including membrane permeability prediction and target binding assessment, for both unmodified and modified peptide sequences.

Here's how to extract PepDoRA embeddings for your input peptide:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Load the model and tokenizer
model_name = "ChatterjeeLab/PepDoRA"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)

# Input peptide as a SMILES string
peptide = "CC(C)C[C@H]1NC(=O)[C@@H](C)NCCCCCCNC(=O)[C@H](CO)NC1=O"

# Tokenize the peptide
inputs = tokenizer(peptide, return_tensors="pt")

# Run the model without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Extract the per-token embeddings from the last hidden layer
embedding = outputs.last_hidden_state

# Print the embedding shape (batch, tokens, hidden_dim) and the embedding itself
print(embedding.shape)
print(embedding)
```
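The snippet above yields one embedding per token. For downstream predictors such as permeability models, a single fixed-length vector per peptide is often more convenient; one common choice (our assumption, not prescribed by the authors) is attention-mask-weighted mean pooling, sketched here on dummy tensors shaped like the model's outputs:

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    """Average token embeddings, ignoring padding positions."""
    mask = attention_mask.unsqueeze(-1).float()     # (batch, seq_len, 1)
    summed = (last_hidden_state * mask).sum(dim=1)  # (batch, hidden)
    counts = mask.sum(dim=1).clamp(min=1e-9)        # avoid division by zero
    return summed / counts                          # (batch, hidden)

# Demo with dummy tensors; sequence length and hidden size are illustrative
hidden = torch.randn(1, 42, 384)            # stand-in for outputs.last_hidden_state
mask = torch.ones(1, 42, dtype=torch.long)  # stand-in for inputs["attention_mask"]
vec = mean_pool(hidden, mask)
print(vec.shape)                            # torch.Size([1, 384])
```

In practice you would pass `outputs.last_hidden_state` and `inputs["attention_mask"]` from the snippet above, then feed the pooled vector to your downstream model.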

## Repository Authors

[Leyao Wang](mailto:leyao.wang@vanderbilt.edu), Undergraduate Intern in the Chatterjee Lab <br>
[Pranam Chatterjee](mailto:pranam.chatterjee@duke.edu), Assistant Professor at Duke University 

Reach out to us with any questions!