Update architecture description and Cyclosporine A example
````diff
--- a/README.md
+++ b/README.md
@@ -11,7 +11,7 @@ tags:
 - peptide-language-model
 pipeline_tag: fill-mask
 widget:
-- text: "PEPTIDE1{A.
+- text: "PEPTIDE1{[Abu].[Sar].[meL].V.[meL].A.[dA].[meL].[meL].[meV].[Me_Bmt(E)]}$PEPTIDE1,PEPTIDE1,1:R1-11:R2$$$"
 ---
 
 # HELM-BERT
@@ -20,12 +20,12 @@ A language model for peptide representation learning using **HELM (Hierarchical
 
 ## Model Description
 
-HELM-BERT is
+HELM-BERT is built upon the DeBERTa architecture, designed for peptide sequences in HELM notation:
 
-- **Disentangled Attention**:
-- **Enhanced Mask Decoder (EMD)**:
-- **Span Masking**: Contiguous token masking
-- **nGiE**: n-gram Induced Encoding layer
+- **Disentangled Attention**: Decomposes attention into content-content and content-position terms
+- **Enhanced Mask Decoder (EMD)**: Injects absolute position embeddings at the decoder stage
+- **Span Masking**: Contiguous token masking with geometric distribution
+- **nGiE**: n-gram Induced Encoding layer (1D convolution, kernel size 3)
 
 Please check the [official repository](https://github.com/clinfo/HELM-BERT) for more implementation details and updates.
 
@@ -48,7 +48,8 @@ from transformers import AutoModel, AutoTokenizer
 model = AutoModel.from_pretrained("Flansma/helm-bert", trust_remote_code=True)
 tokenizer = AutoTokenizer.from_pretrained("Flansma/helm-bert", trust_remote_code=True)
 
-
+# Cyclosporine A
+inputs = tokenizer("PEPTIDE1{[Abu].[Sar].[meL].V.[meL].A.[dA].[meL].[meL].[meV].[Me_Bmt(E)]}$PEPTIDE1,PEPTIDE1,1:R1-11:R2$$$", return_tensors="pt")
 outputs = model(**inputs)
 embeddings = outputs.last_hidden_state
 ```
````
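For readers unfamiliar with the DeBERTa terms in the updated Model Description, here is a minimal single-head sketch of the score decomposition the **Disentangled Attention** bullet refers to: a content-to-content term plus content-to-position and position-to-content terms computed against clamped relative-distance buckets. This illustrates the standard DeBERTa mechanism, not HELM-BERT's actual implementation; all names below are illustrative.

```python
# Illustrative single-head DeBERTa-style disentangled attention scores.
# Standard mechanism only -- not HELM-BERT's actual code.
import torch

def disentangled_scores(H, rel_emb, Wq, Wk, Wq_r, Wk_r, max_rel):
    # H:       (L, d)          content hidden states
    # rel_emb: (2*max_rel, d)  relative-position embeddings
    L, d = H.shape
    q, k = H @ Wq, H @ Wk

    # Bucket index of the clamped relative distance (i - j), shifted into [0, 2*max_rel)
    pos = torch.arange(L)
    rel_idx = (pos[:, None] - pos[None, :]).clamp(-max_rel, max_rel - 1) + max_rel

    c2c = q @ k.T                                                 # content -> content
    c2p = torch.gather(q @ (rel_emb @ Wk_r).T, 1, rel_idx)        # content -> position
    p2c = torch.gather(((rel_emb @ Wq_r) @ k.T).T, 1, rel_idx).T  # position -> content

    return (c2c + c2p + p2c) / (3 * d) ** 0.5  # DeBERTa scales by sqrt(3d)

H = torch.randn(16, 64)
Ws = [torch.randn(64, 64) for _ in range(4)]
scores = disentangled_scores(H, torch.randn(32, 64), *Ws, max_rel=16)  # (16, 16)
```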
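The updated usage snippet, completed into a self-contained sketch. The `torch.no_grad()` guard and the mask-weighted mean pooling at the end are additions for illustration (a common way to reduce per-token states to a single peptide embedding); the card itself stops at `last_hidden_state`.

```python
# The card's updated usage example, filled out so it runs end to end.
# The pooling step at the bottom is an assumption, not part of the card.
import torch
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained("Flansma/helm-bert", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("Flansma/helm-bert", trust_remote_code=True)

# Cyclosporine A in HELM notation; the $PEPTIDE1,PEPTIDE1,1:R1-11:R2$ section
# encodes the macrocyclization bond between residues 1 and 11.
helm = ("PEPTIDE1{[Abu].[Sar].[meL].V.[meL].A.[dA].[meL].[meL].[meV].[Me_Bmt(E)]}"
        "$PEPTIDE1,PEPTIDE1,1:R1-11:R2$$$")
inputs = tokenizer(helm, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

embeddings = outputs.last_hidden_state  # (1, seq_len, hidden_size)

# Assumed pooling: mask-weighted mean -> one fixed-size vector per peptide
mask = inputs["attention_mask"].unsqueeze(-1).float()
peptide_vec = (embeddings * mask).sum(dim=1) / mask.sum(dim=1)
```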
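Since the card sets `pipeline_tag: fill-mask`, masked-monomer prediction is presumably the pretraining objective. A speculative sketch follows; whether the remote code exposes a masked-LM head via `AutoModelForMaskedLM`, and what the tokenizer's mask token is, are assumptions not confirmed by the card.

```python
# Speculative fill-mask sketch: mask one monomer token and rank replacements.
# AutoModelForMaskedLM support and tokenizer.mask_token_id are assumptions here.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Flansma/helm-bert", trust_remote_code=True)
mlm = AutoModelForMaskedLM.from_pretrained("Flansma/helm-bert", trust_remote_code=True)

helm = ("PEPTIDE1{[Abu].[Sar].[meL].V.[meL].A.[dA].[meL].[meL].[meV].[Me_Bmt(E)]}"
        "$PEPTIDE1,PEPTIDE1,1:R1-11:R2$$$")
inputs = tokenizer(helm, return_tensors="pt")

# Mask an arbitrary mid-sequence token and predict it from context.
input_ids = inputs["input_ids"].clone()
pos = input_ids.shape[1] // 2
input_ids[0, pos] = tokenizer.mask_token_id

with torch.no_grad():
    logits = mlm(input_ids=input_ids, attention_mask=inputs["attention_mask"]).logits

top5 = logits[0, pos].topk(5).indices.tolist()
print(tokenizer.convert_ids_to_tokens(top5))
```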