---
license: mit
language: protein
tags:
- protein language model
datasets:
- Uniref50
---

# DistilProtBert model

Distilled version of the [ProtBert](https://huggingface.co/Rostlab/prot_bert) model.
In addition to the cross-entropy and cosine teacher-student losses, DistilProtBert was pretrained on a masked language modeling (MLM) objective. It works only with capital-letter amino acids.
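The two teacher-student terms can be sketched as follows. This is a minimal plain-Python illustration following the DistilBERT recipe, not the authors' exact implementation:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_losses(student_logits, teacher_logits, student_hidden, teacher_hidden):
    # Soft-target cross entropy over the vocabulary: -sum(p_teacher * log p_student).
    p_t = softmax(teacher_logits)
    p_s = softmax(student_logits)
    ce = -sum(t * math.log(s) for t, s in zip(p_t, p_s))
    # Cosine loss aligning student and teacher hidden states: 1 - cos(h_s, h_t).
    dot = sum(a * b for a, b in zip(student_hidden, teacher_hidden))
    norm_s = math.sqrt(sum(a * a for a in student_hidden))
    norm_t = math.sqrt(sum(b * b for b in teacher_hidden))
    cos_loss = 1.0 - dot / (norm_s * norm_t)
    return ce, cos_loss
```

When the student matches the teacher exactly, the cosine term is zero and the cross-entropy term reduces to the teacher's own entropy; the MLM objective is added on top of these two terms.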

# Model description

DistilProtBert was pretrained on millions of protein sequences.

The main differences between DistilProtBert and the original ProtBert are:

1. The size of the model
2. The size of the pretraining dataset
3. The time and hardware used for pretraining

## Intended uses & limitations

The model can be used for protein feature extraction or fine-tuned on downstream tasks.

### How to use

The model can be used in the same way as ProtBert.
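Concretely, ProtBert-style inputs are uppercase amino-acid sequences with residues separated by spaces and rare amino acids (U, Z, O, B) mapped to X. A minimal preprocessing sketch (this helper is illustrative, not part of the released code):

```python
import re

def preprocess(sequence: str) -> str:
    """Format a raw amino-acid sequence for a ProtBert-style tokenizer."""
    sequence = sequence.upper()                  # the model only accepts capital letters
    sequence = re.sub(r"[UZOB]", "X", sequence)  # map rare amino acids to X
    return " ".join(sequence)                    # one token per residue
```

The resulting string can then be passed to the tokenizer exactly as with ProtBert.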

## Training data

DistilProtBert was pretrained on [Uniref50](https://www.uniprot.org/downloads), a dataset consisting of ~43 million protein sequences (only sequences between 20 and 512 amino acids in length were used).
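The length filter described above can be sketched as (a hypothetical helper, not from the released pipeline):

```python
def in_length_range(sequence: str, lo: int = 20, hi: int = 512) -> bool:
    # Keep only sequences of 20 to 512 residues, as used for the Uniref50 subset.
    return lo <= len(sequence) <= hi
```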

# Pretraining procedure

Preprocessing was done using ProtBert's tokenizer.
The masking procedure for each sequence followed the original BERT procedure (as described for [ProtBert](https://huggingface.co/Rostlab/prot_bert)).

The model was pretrained on a single DGX cluster for 3 epochs in total. The local batch size was 16, the optimizer was AdamW with a learning rate of 5e-5, and mixed-precision training was used.
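Summarized as a configuration fragment (values taken from this card; the key names are illustrative, not from the training scripts):

```python
pretraining_config = {
    "epochs": 3,
    "per_device_batch_size": 16,
    "optimizer": "AdamW",
    "learning_rate": 5e-5,
    "mixed_precision": True,
}
```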

## Evaluation results

When fine-tuned on downstream tasks, this model achieves the following results:

| Task/Dataset | Secondary structure (3-state accuracy, %) | Membrane (%) |
|:-----:|:-----:|:-----:|
| CASP12 | 72 | |
| TS115 | 81 | |
| CB513 | 79 | |
| DeepLoc | | 86 |

Distinguish between:

### BibTeX entry and citation info