yarongef/DistilProtBert

protein language model

yarongef committed on Apr 13, 2022

Commit 62bf279 · 1 Parent(s): aba2585

Update README.md

Files changed (1)

README.md +3 -1

@@ -17,7 +17,9 @@ In addition to cross entropy and cosine teacher-student losses, DistilProtBert w
 DistilProtBert was pretrained on millions of proteins sequences.
 Few important differences between DistilProtBert model and the original ProtBert version are:
-1. Size of the model: 230M parameters (ProtBert has 420M parameters)
+1. Size of the model:
+  - 230M parameters (ProtBert has 420M parameters)
+  - 15 hidden layers (ProtBert has 30 hidden layers)
 2. Size of the pretraining dataset: ~43M proteins (ProtBert was pretrained on 216M proteins)
 3. Hardware used for pretraining: five v100 32GB Nvidia GPUs (ProtBert was pretrained on 512 16GB TPUs)
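
To put the figures in this change in proportion, here is a minimal sketch that computes the DistilProtBert-to-ProtBert ratio for each quantity quoted in the README. The dictionary keys and the `size_ratios` helper are hypothetical names chosen for illustration; the numbers are the approximate ones stated in the diff, not values read from the model files.

```python
# Approximate figures as quoted in the README diff (not read from checkpoints).
# Key names are illustrative assumptions, not fields of any library.
distilprotbert = {
    "parameters_millions": 230,
    "hidden_layers": 15,
    "pretraining_proteins_millions": 43,
}
protbert = {
    "parameters_millions": 420,
    "hidden_layers": 30,
    "pretraining_proteins_millions": 216,
}


def size_ratios(student, teacher):
    """Return the student/teacher ratio for every key present in both dicts."""
    return {key: student[key] / teacher[key] for key in student if key in teacher}


ratios = size_ratios(distilprotbert, protbert)
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

On these figures the layer count is exactly halved, the parameter count falls to a little over half, and the pretraining set is roughly a fifth of the size of ProtBert's.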