license: apache-2.0
language:
  - en
  - ro
pipeline_tag: translation

Opus Tatoeba | English -> Romanian

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): mol ron
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token of the form >>id<< is required, where id is a valid target language ID
  • valid language labels: >>mol<< >>ron<<
  • download: opus-2021-02-23.zip
  • test set translations: opus-2021-02-23.test.txt
  • test set scores: opus-2021-02-23.eval.txt
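
The language-token requirement above can be sketched as a small pre-processing helper. This is a minimal illustration and not part of the released model code: the function name `add_target_token` and the constant `VALID_TARGET_LABELS` are hypothetical, and the set of labels is assumed from the target languages listed above (mol, ron). The resulting strings would then be passed to whatever tokenizer and model loader you use.

```python
# Assumption: the valid labels mirror the "target language(s)" entry (mol, ron).
VALID_TARGET_LABELS = {"mol", "ron"}


def add_target_token(sentence: str, target_lang: str) -> str:
    """Prefix a source sentence with the >>id<< token that selects the target language."""
    if target_lang not in VALID_TARGET_LABELS:
        raise ValueError(f"unsupported target language: {target_lang!r}")
    # The token must be sentence-initial, so surrounding whitespace is stripped first.
    return f">>{target_lang}<< {sentence.strip()}"


if __name__ == "__main__":
    english = ["Good morning.", "  Where is the train station?"]
    batch = [add_target_token(s, "ron") for s in english]
    for line in batch:
        print(line)
```

Rejecting unknown labels early is a design choice made for this sketch: a sentence sent without a valid token is unlikely to produce a useful translation, so failing loudly is safer than passing it through.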

Benchmarks

| testset | BLEU | chr-F | #sent | #words | BP |
|---|---|---|---|---|---|
| newsdev2016-enro.eng-ron | 30.7 | 0.592 | 1999 | 51566 | 1.000 |
| newstest2016-enro.eng-ron | 28.4 | 0.573 | 1999 | 49094 | 1.000 |
| Tatoeba-test.eng-ron | 45.0 | 0.666 | 5000 | 36851 | 0.990 |
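
To help read the BP column, here is a short sketch of the standard BLEU brevity penalty, assuming that is what BP reports (the model card does not say so explicitly). The function names are illustrative. The penalty is 1 when the system output is at least as long as the reference and exp(1 - r/c) otherwise, where c is the output length and r the reference length, both in tokens.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for hypothesis length c and reference length r."""
    if hyp_len <= 0:
        return 0.0
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


def length_ratio_from_bp(bp: float) -> float:
    """Invert the penalty: return c / r implied by a given BP.

    For BP == 1.0 this returns 1.0, although any ratio >= 1 yields the same penalty.
    """
    if not 0.0 < bp <= 1.0:
        raise ValueError("BP must lie in (0, 1]")
    return 1.0 / (1.0 - math.log(bp))


if __name__ == "__main__":
    # BP of 0.990 (Tatoeba test row) implies output roughly 1% shorter than the reference.
    print(round(length_ratio_from_bp(0.990), 4))
```

Under that assumption, a BP of 1.000 on the two news test sets means the output was not shorter than the reference, while 0.990 on Tatoeba indicates slightly short translations.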