--- license: apache-2.0 language: - en - ro pipeline_tag: translation --- # Opus Tatoeba | English -> Romanian * dataset: opus * model: transformer * source language(s): eng * target language(s): mol ron * model: transformer * pre-processing: normalization + SentencePiece (spm32k,spm32k) * a sentence initial language token is required in the form of `>>id<<` (id = valid target language ID) * valid language labels: * download: [opus-2021-02-23.zip](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-ron/opus-2021-02-23.zip) * test set translations: [opus-2021-02-23.test.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-ron/opus-2021-02-23.test.txt) * test set scores: [opus-2021-02-23.eval.txt](https://object.pouta.csc.fi/Tatoeba-MT-models/eng-ron/opus-2021-02-23.eval.txt) ## Benchmarks | testset | BLEU | chr-F | #sent | #words | BP | |---------|-------|-------|-------|--------|----| | newsdev2016-enro.eng-ron | 30.7 | 0.592 | 1999 | 51566 | 1.000 | | newstest2016-enro.eng-ron | 28.4 | 0.573 | 1999 | 49094 | 1.000 | | Tatoeba-test.eng-ron | 45.0 | 0.666 | 5000 | 36851 | 0.990 |