# Benchmark performance
We tested the performance of BERTovski on XPOS, UPOS and NER benchmarks. For Bulgarian, we used the data from the [Universal Dependencies](https://universaldependencies.org/) project. For Macedonian, we used the data sets created in the [babushka-bench](https://github.com/clarinsi/babushka-bench/) project. We also tested on a Google-translated (Bulgarian) and human-translated (Macedonian) version of the COPA data set (for details, see our [GitHub repo](https://github.com/RikVN/COPA)). We compare performance to the strong multilingual models XLMR-base and XLMR-large. For details regarding the fine-tuning procedure, check out our [GitHub](https://github.com/macocu/LanguageModels).
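Fine-tuning for UPOS/XPOS/NER is standard token classification; one step worth illustrating is aligning word-level tags to subword tokens, since models like BERTovski and XLM-R split words into wordpieces. Below is a minimal sketch of that alignment; the function name and the `-100` ignore-label convention are illustrative assumptions on our part, not code from the linked repos:

```python
def align_labels(word_ids, word_label_ids, ignore_index=-100):
    """Map word-level label ids onto a subword token sequence.

    word_ids: for each subword token, the index of the word it came from,
              or None for special tokens ([CLS], [SEP], padding).
    Only the first subword of each word keeps that word's label; special
    tokens and continuation subwords get ignore_index so the loss skips them.
    """
    aligned = []
    previous = None
    for wid in word_ids:
        if wid is None:
            aligned.append(ignore_index)
        elif wid != previous:
            aligned.append(word_label_ids[wid])
        else:
            aligned.append(ignore_index)
        previous = wid
    return aligned

# A two-word sentence tokenized as [CLS] w0a w0b w1 [SEP],
# i.e. word_ids [None, 0, 0, 1, None], with UPOS ids NOUN=5, VERB=7:
print(align_labels([None, 0, 0, 1, None], [5, 7]))  # [-100, 5, -100, 7, -100]
```

The same alignment applies unchanged to NER tags; only the label set differs.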
Scores are averages of three runs, except for COPA, for which we use 10 runs. We use the same hyperparameter settings for all models for UPOS/XPOS/NER; for COPA, we optimized the learning rate on the dev set.
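The two evaluation conventions above (averaging repeated runs, and picking the COPA learning rate on the dev set) can be sketched as plain helpers; the grid and the accuracy numbers below are made-up placeholders, not our actual results:

```python
def average_runs(scores):
    """Mean score over repeated fine-tuning runs (3 for UPOS/XPOS/NER, 10 for COPA)."""
    return sum(scores) / len(scores)

def pick_learning_rate(dev_accuracy_by_lr):
    """Return the learning rate whose run scored best on the dev set."""
    return max(dev_accuracy_by_lr, key=dev_accuracy_by_lr.get)

# Hypothetical COPA dev accuracies for a small learning-rate grid:
grid = {1e-5: 0.61, 3e-5: 0.66, 5e-5: 0.63}
print(pick_learning_rate(grid))            # 3e-05
print(average_runs([0.64, 0.66, 0.65]))
```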
## Bulgarian