Update README.md
README.md CHANGED

@@ -14,7 +14,7 @@ datasets:
## Model description

This model is a BERT-based Myanmar pre-trained language model.

-MyanBERTa
+MyanBERTa was pre-trained for 528K steps on a word-segmented Myanmar dataset consisting of 5,992,299 sentences (136M words).

As the tokenizer, a byte-level BPE tokenizer of 30,522 subword units, learned after word segmentation, is applied.
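The tokenizer described above is a byte-level BPE learned over word-segmented text. As an illustration only (not the actual MyanBERTa training code, which would use a full tokenizer library and a 30,522-unit vocabulary), here is a minimal sketch of how byte-level BPE learns merge rules; the toy corpus, function name, and merge count are hypothetical:

```python
from collections import Counter

def learn_bpe_merges(words, num_merges):
    """Learn byte-pair merges from a corpus of (already word-segmented) words.

    Byte-level BPE starts from single UTF-8 bytes, so any script
    (including Myanmar) is covered without unknown characters.
    """
    # Represent each word as a tuple of single-byte tokens, with its frequency.
    vocab = Counter()
    for w in words:
        vocab[tuple(bytes([b]) for b in w.encode("utf-8"))] += 1

    merges = []
    for _ in range(num_merges):
        # Count every adjacent token pair, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Greedily pick the most frequent pair and merge it everywhere.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        merged = best[0] + best[1]
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(merged)
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_vocab[tuple(out)] += freq
        vocab = new_vocab
    return merges

# Toy corpus: frequent byte pairs ("l","o") then ("lo","w") get merged first.
merges = learn_bpe_merges(["low", "low", "lower", "lowest"], num_merges=3)
```

In a real pipeline these merge rules become the subword vocabulary; applying word segmentation first, as the model card notes, keeps merges from crossing word boundaries.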