yarongef committed on
Commit 62bf279 · 1 Parent(s): aba2585

Update README.md

Files changed (1): README.md +3 -1
README.md CHANGED
@@ -17,7 +17,9 @@ In addition to cross entropy and cosine teacher-student losses, DistilProtBert w
  DistilProtBert was pretrained on millions of proteins sequences.
 
  Few important differences between DistilProtBert model and the original ProtBert version are:
- 1. Size of the model: 230M parameters (ProtBert has 420M parameters)
+ 1. Size of the model:
+    - 230M parameters (ProtBert has 420M parameters)
+    - 15 hidden layers (ProtBert has 30 hidden layers)
  2. Size of the pretraining dataset: ~43M proteins (ProtBert was pretrained on 216M proteins)
  3. Hardware used for pretraining: five v100 32GB Nvidia GPUs (ProtBert was pretrained on 512 16GB TPUs)
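The added line about depth explains most of the size gap: halving the encoder from 30 to 15 hidden layers roughly halves the transformer stack's parameter count. A minimal back-of-the-envelope sketch, assuming standard BERT-large-style dimensions (hidden size 1024, intermediate size 4096, which is what ProtBert-family models use) and counting only the encoder layers (the quoted 420M/230M totals also include embeddings and task heads, so these numbers are approximations, not the exact figures from the README):

```python
def bert_encoder_params(num_layers: int, hidden: int = 1024, intermediate: int = 4096) -> int:
    """Approximate parameter count of a BERT encoder stack (weights + biases),
    excluding embeddings and any task-specific heads."""
    attn = 4 * (hidden * hidden + hidden)              # Q, K, V and output projections
    ffn = (hidden * intermediate + intermediate        # first dense layer
           + intermediate * hidden + hidden)           # second dense layer
    ln = 2 * (2 * hidden)                              # two LayerNorms (gamma + beta each)
    return num_layers * (attn + ffn + ln)

full = bert_encoder_params(30)    # ProtBert-like depth
distil = bert_encoder_params(15)  # DistilProtBert-like depth
print(f"{full / 1e6:.0f}M vs {distil / 1e6:.0f}M")  # → 378M vs 189M
```

With all other dimensions held fixed, the layer count scales the encoder's parameters linearly, which is why the 15-layer student lands near half the teacher's size.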