Commit 9a1c754 · 1 Parent(s): 549b3a0
Add README.md
README.md
@@ -8,8 +8,8 @@ datasets:
 - bookcorpus
 - wikipedia
 ---
-# MultiBERTs Seed
-Seed
+# MultiBERTs Seed 0 (uncased)
+Seed 0 pretrained BERT model on English language using a masked language modeling (MLM) objective. It was introduced in
 [this paper](https://arxiv.org/pdf/2106.16163.pdf) and first released in
 [this repository](https://github.com/google-research/language/tree/master/language/multiberts). This model is uncased: it does not make a difference
 between english and English.
@@ -42,7 +42,7 @@ generation you should look at model like GPT2.
 Here is how to use this model to get the features of a given text in PyTorch:
 ```python
 from transformers import BertTokenizer, BertModel
-tokenizer = BertTokenizer.from_pretrained('multiberts-seed
+tokenizer = BertTokenizer.from_pretrained('multiberts-seed-'0'')
 model = BertModel.from_pretrained("bert-base-uncased")
 text = "Replace me by any text you'd like."
 encoded_input = tokenizer(text, return_tensors='pt')
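The README text in this diff describes the model as uncased, meaning "english" and "English" are treated the same. The sketch below illustrates that idea with the standard library only. It assumes the usual uncased BERT preprocessing of lowercasing plus accent stripping; `uncased_normalize` is a hypothetical helper written for illustration, not part of the MultiBERTs code or the `transformers` API.

```python
import unicodedata


def uncased_normalize(text: str) -> str:
    """Approximate uncased preprocessing: lowercase, then strip accents.

    Hypothetical helper for illustration only; the real tokenizer also
    splits punctuation and applies WordPiece, which is not shown here.
    """
    lowered = text.lower()
    # NFD splits accented characters into a base character plus combining marks.
    decomposed = unicodedata.normalize("NFD", lowered)
    # Drop the combining marks (Unicode category "Mn").
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


if __name__ == "__main__":
    # Both spellings collapse to the same string before tokenization.
    print(uncased_normalize("English") == uncased_normalize("english"))
    print(uncased_normalize("Café"))
```

Because case is discarded before tokenization, an uncased model cannot use capitalization as a signal, which may matter for tasks such as named entity recognition.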