Update README.md
README.md
CHANGED
@@ -125,7 +125,7 @@ The texts are lowercased and tokenized using WordPiece and a vocabulary size of
 ```
 [CLS] Sentence A [SEP] Sentence B [SEP]
 ```
-With probability 0.5, sentence A and sentence B correspond to two consecutive
+With probability 0.5, sentence A and sentence B correspond to two consecutive sentence spans from the original corpus. Note that what is considered a sentence here is a consecutive span of text usually longer than a single sentence, but less than 512 tokens.
 
 The details of the masking procedure for each sentence are the following:
 
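
Note on the changed line: the pairing rule it describes can be sketched roughly as below. This is a minimal illustration, not the code used to pretrain the model. The names `make_nsp_pair` and `format_pair`, the `spans` list, and the handling of the other 50% of cases (drawing a non-consecutive span from the same corpus) are assumptions; the diff only states the consecutive case and the `[CLS] ... [SEP] ... [SEP]` layout.

```python
import random


def make_nsp_pair(spans, index, rng):
    """Return (sentence_a, sentence_b, is_next) for the span at `index`.

    `spans` is a hypothetical list of text spans in corpus order.
    """
    if len(spans) < 3:
        raise ValueError("need at least 3 spans to draw a non-consecutive one")
    if not 0 <= index < len(spans) - 1:
        raise ValueError("index must point at a span that has a successor")

    sentence_a = spans[index]
    if rng.random() < 0.5:
        # Consecutive case described in the README: B directly follows A.
        return sentence_a, spans[index + 1], True

    # Assumed negative case: any span other than A and its successor.
    candidates = [i for i in range(len(spans)) if i not in (index, index + 1)]
    return sentence_a, spans[rng.choice(candidates)], False


def format_pair(sentence_a, sentence_b):
    """Lay out a pair in the input format shown in the README."""
    return f"[CLS] {sentence_a} [SEP] {sentence_b} [SEP]"
```

Passing a seeded `random.Random` instance as `rng` keeps the sampling reproducible. The sketch does not enforce the length limit mentioned in the changed line; that would need a tokenizer.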