How to use UBC-NLP/MARBERTv2 with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="UBC-NLP/MARBERTv2")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/MARBERTv2")
    model = AutoModelForMaskedLM.from_pretrained("UBC-NLP/MARBERTv2")
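As a rough illustration of how the fill-mask pipeline above might be queried, the sketch below wraps the call in a small helper. The helper name `top_predictions`, the `main` function, and the example sentence are hypothetical and not part of the model card. The sketch assumes the pipeline returns a list of dicts with `token_str` and `score` keys, which is the usual shape for input containing a single mask token.

```python
def top_predictions(pipe, text, k=5):
    """Return (token, score) pairs for the single masked position in `text`.

    `pipe` is any callable that behaves like a fill-mask pipeline:
    it accepts a string plus `top_k` and returns a list of dicts.
    """
    results = pipe(text, top_k=k)
    return [(r["token_str"], round(r["score"], 4)) for r in results]


def main():
    # Imported here so the helper above works without transformers installed.
    from transformers import pipeline

    pipe = pipeline("fill-mask", model="UBC-NLP/MARBERTv2")
    # Use the tokenizer's own mask token rather than hard-coding it.
    mask = pipe.tokenizer.mask_token
    # Hypothetical example sentence: "The capital of Egypt is <mask>."
    text = f"عاصمة مصر هي {mask}."
    for token, score in top_predictions(pipe, text):
        print(token, score)
```

Calling `main()` downloads the model weights on first use, so it needs network access. The helper itself can be tried with any stand-in callable that has the same return shape.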
Commit 0641745 · 1 parent: 72c8287
elmadany committed: "commit from elmadany"
File changed: README.md
    @@ -7,6 +7,7 @@ We find that results with ARBERT and MARBERT on QA are not competitive, a clear
     To rectify this, we further pre-train the stronger model, MARBERT, on the same MSA data as ARBERT in addition to AraNews dataset but with a bigger sequence length of 512 tokens for 40 epochs. We call this
     further pre-trained model **MARBERTv2**, noting it has **29B tokens**. MARBERTv2 acquires best performance on all but one test set, where XLM-RLarge marginally outperforms us (only in F1).
     
    +For more information, please visit our own GitHub [repo](https://github.com/UBC-NLP/marbert).
     
     
     