tbs17/MathBERT

tbs17 committed on Jun 9, 2021

Commit a9fc912

1 Parent(s): 70cde69

Update README.md

Files changed (1)

README.md CHANGED

@@ -125,7 +125,7 @@ The texts are lowercased and tokenized using WordPiece and a vocabulary size of
 ```
 [CLS] Sentence A [SEP] Sentence B [SEP]
 ```
-With probability 0.5, sentence A and sentence B correspond to two consecutive sentences in the original corpus and in the other cases, it's another random sentence in the corpus. Note that what is considered a sentence here is a consecutive span of text usually longer than a single sentence. The only constrain is that the result with the two "sentences" has a combined length of less than 512 tokens.
+With probability 0.5, sentence A and sentence B correspond to two consecutive sentence spans from the original corpus. Note that what is considered a sentence here is a consecutive span of text usually longer than a single sentence, but less than 512 tokens.
 The details of the masking procedure for each sentence are the following:
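
For readers of this hunk, the pair construction it describes can be sketched roughly as follows. This is a minimal illustration, not the MathBERT pretraining code: the function name `make_sentence_pair`, the `max_tokens` parameter, and the truncation strategy are all hypothetical. It also assumes, as the removed line stated, that the non-consecutive half of the pairs uses a random span from the corpus, and it treats 512 as an inclusive budget that counts the three special tokens.

```python
import random


def make_sentence_pair(spans, index, rng, max_tokens=512):
    """Build one ([CLS] A [SEP] B [SEP], is_next) example.

    `spans` is a list of token lists; each one is a "sentence" in the
    README's sense, i.e. a consecutive span of text.
    """
    if len(spans) < 3:
        raise ValueError("need at least 3 spans to sample a random negative")

    span_a = list(spans[index])
    has_next = index + 1 < len(spans)

    if has_next and rng.random() < 0.5:
        # Positive case: B is the span that follows A in the corpus.
        span_b, is_next = list(spans[index + 1]), True
    else:
        # Negative case (assumption): B is a random span that is neither A
        # nor A's true successor.
        candidates = [i for i in range(len(spans)) if i not in (index, index + 1)]
        span_b, is_next = list(spans[rng.choice(candidates)]), False

    # Trim the longer span until the pair plus [CLS], [SEP], [SEP] fits.
    while len(span_a) + len(span_b) + 3 > max_tokens:
        longer = span_a if len(span_a) >= len(span_b) else span_b
        longer.pop()

    tokens = ["[CLS]"] + span_a + ["[SEP]"] + span_b + ["[SEP]"]
    return tokens, is_next


# Usage with toy spans; real inputs would be WordPiece tokens.
toy_spans = [["a"] * 300, ["b"] * 300, ["c"] * 300, ["d"] * 10]
pair_tokens, pair_is_next = make_sentence_pair(toy_spans, 0, random.Random(0))
```

Passing an explicit `random.Random` instance keeps the sampling reproducible without touching global random state.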