Update README.md
README.md CHANGED

@@ -66,7 +66,7 @@ It is split into two evaluation datasets EHRI-6 (714k tokens) and EHRI-9 (877k t
 Improvements from the XLM-RoBERTa-large checkpoint.
 The 490M test set is split from the dataset used to train this model and has a greater proportion of machine translations than the 42M test set.
 
-Perplexity per language in the EHRI data:
+Perplexity per language in the EHRI data, number of tokens given in parentheses:
 | Model             | cs (195k) | de (356k) | en (81k) | fr (3.5k) | hu (45k) | nl (2.5k) | pl (34k) | sk (6k) | yi (151k)  |
 | ----------------- | --------- | --------- | -------- | --------- | -------- | --------- | -------- | ------- | ---------- |
 | XLM-RoBERTa-large | 3.1553    | 3.4038    | 3.0588   | 2.0579    | 2.8928   | 2.9133    | 2.5284   | 2.6245  | **4.0217** |
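For context on the metric in the table above: perplexity is the exponential of the mean per-token negative log-likelihood. A minimal sketch of that relationship, using made-up NLL values rather than anything from the EHRI evaluation:

```python
import math

def perplexity(nlls):
    """Perplexity = exp(mean negative log-likelihood per token, natural log)."""
    return math.exp(sum(nlls) / len(nlls))

# Hypothetical per-token NLLs for illustration only (not real EHRI data).
sample_nlls = [1.2, 0.9, 1.5, 1.0]
print(perplexity(sample_nlls))  # exp(1.15)
```

Lower values mean the model assigns higher probability to the held-out tokens, which is why a per-language breakdown like the table's exposes where the checkpoint is weakest.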