Update README.md
README.md
CHANGED
@@ -19,4 +19,5 @@ This model variant places the parser network ahead of all attention blocks.
 
 The model is pretrained on the BabyLM 10M dataset using a custom pretrained RobertaTokenizer (https://huggingface.co/omarmomen/babylm_tokenizer_32k).
 
+
 https://arxiv.org/abs/2310.20589