mmukh committed on
Commit 4355ff7 · Parent: c3fecd5

Update README.md

## Model Description
SOBertLarge is a 762M-parameter BERT model trained on 27 billion tokens of StackOverflow answer and comment text using the Megatron Toolkit.
SOBert is pre-trained on 19 GB of data presented as 15 million samples, where each sample contains an entire post and all of its corresponding comments. We also include all code in each answer, so our model is bimodal in nature. We use a SentencePiece tokenizer trained with byte-pair encoding, which has the benefit over WordPiece of never labeling tokens as “unknown”.
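The "never unknown" property comes from falling back to raw bytes when no vocabulary piece matches. As a rough illustration only (a toy greedy tokenizer with byte fallback, not SentencePiece's actual algorithm; the function name and sample vocabulary are invented for this sketch):

```python
def tokenize(text: str, vocab: set) -> list:
    """Greedy longest-match tokenization with byte fallback.

    Any character not covered by the vocabulary is emitted as its
    UTF-8 bytes (e.g. "<0xE2>") instead of a single <unk> token,
    so every input string is always fully representable.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest vocabulary piece starting at position i.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                tokens.append(piece)
                i = j
                break
        else:
            # Byte fallback: no piece matched, emit raw UTF-8 bytes.
            tokens.extend(f"<0x{b:02X}>" for b in text[i].encode("utf-8"))
            i += 1
    return tokens


# Code-like input is covered by the (hypothetical) vocabulary;
# a character outside it degrades to bytes rather than <unk>.
print(tokenize("def main():", {"def", " main", "(", ")", ":"}))
print(tokenize("€", {"def"}))
```

A WordPiece tokenizer without this fallback would map the out-of-vocabulary character to a single `[UNK]` token, losing the original text; the byte fallback keeps tokenization lossless.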