How to use UCSYNLP/MyanBERTa with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("fill-mask", model="UCSYNLP/MyanBERTa")

# Load model directly
from transformers import AutoTokenizer, AutoModelForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("UCSYNLP/MyanBERTa")
model = AutoModelForMaskedLM.from_pretrained("UCSYNLP/MyanBERTa")
```
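As a rough illustration of how the fill-mask pipeline might be queried, the sketch below masks one word of an already word-segmented sentence and asks the model for candidates. The helper names `mask_word` and `top_predictions` are hypothetical and not part of the model card. Joining words with single spaces and the `"<mask>"` default are assumptions; the mask token is read from the loaded tokenizer when the model is actually called.

```python
from typing import List, Tuple


def mask_word(words: List[str], index: int, mask_token: str = "<mask>") -> str:
    """Replace the word at `index` with the mask token and join with spaces.

    Assumption: the input is already word-segmented and words are
    separated by single spaces; "<mask>" is only a placeholder default.
    """
    if not 0 <= index < len(words):
        raise IndexError("index out of range")
    masked = list(words)  # copy so the caller's list is not modified
    masked[index] = mask_token
    return " ".join(masked)


def top_predictions(
    words: List[str],
    index: int,
    model_id: str = "UCSYNLP/MyanBERTa",
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Return (token, score) candidates for the masked position.

    Hypothetical helper: downloads the model on first use.
    """
    # Imported lazily so mask_word can be used without transformers installed.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model=model_id)
    # Use the tokenizer's own mask token rather than guessing it.
    text = mask_word(words, index, pipe.tokenizer.mask_token)
    return [(r["token_str"], r["score"]) for r in pipe(text, top_k=top_k)]
```

Calling `top_predictions(words, index)` with your own list of segmented Myanmar words should return up to `top_k` candidate tokens with their scores.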
Update README.md
README.md CHANGED

````diff
@@ -14,7 +14,7 @@ datasets:
 ## Model description
 
 This model is a BERT based Myanmar pre-trained language model.
-MyanBERTa
+MyanBERTa was pre-trained for 528K steps on a word segmented Myanmar dataset consisting of 5,992,299 sentences (136M words).
 As the tokenizer, byte-leve BPE tokenizer of 30,522 subword units which is learned after word segmentation is applied.
 
 ```
````