Opus Tatoeba | English -> Arabic

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb apc ara arq ary arz
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
  • valid language labels: >>ara<< >>ara_Latn<< >>arq_Latn<< >>arq<< >>arz<<
  • download: opus-2021-02-23.zip
  • test set translations: opus-2021-02-23.test.txt
  • test set scores: opus-2021-02-23.eval.txt
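
The language-token requirement above can be sketched as a small helper that prepends >>id<< to each source sentence before it is tokenized. This is a minimal illustration, not part of the released model: the function name add_language_token and the constant VALID_LABELS are hypothetical, and the label set is copied from the "valid language labels" line of this card.

```python
# Labels copied from the "valid language labels" line of this card.
# VALID_LABELS and add_language_token are illustrative names, not a library API.
VALID_LABELS = {"ara", "ara_Latn", "arq_Latn", "arq", "arz"}


def add_language_token(sentence: str, target_id: str) -> str:
    """Prefix an English source sentence with the >>id<< target-language token."""
    if target_id not in VALID_LABELS:
        raise ValueError(
            f"unknown target language ID {target_id!r}; "
            f"expected one of {sorted(VALID_LABELS)}"
        )
    # The token must come first in the sentence, separated by a single space.
    return f">>{target_id}<< {sentence.strip()}"


if __name__ == "__main__":
    print(add_language_token("How are you?", "ara"))
    print(add_language_token("Where is the station?", "arz"))
```

The prefixed string is what would be passed on to normalization and SentencePiece segmentation. Rejecting unknown IDs early avoids silently sending the model a token it was not trained on.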

Benchmarks

testset                 BLEU   chr-F   #sent   #words   BP
Tatoeba-test.eng-acm     3.6   0.202       3       17   1.000
Tatoeba-test.eng-afb    29.8   0.560      36      145   1.000
Tatoeba-test.eng-apc     6.4   0.249       5       18   0.943
Tatoeba-test.eng-ara    14.0   0.437   10000    58935   1.000
Tatoeba-test.eng-arq     0.5   0.155     412     2323   1.000
Tatoeba-test.eng-ary     3.1   0.246      18       53   1.000
Tatoeba-test.eng-arz     2.1   0.249     181      856   1.000
tico19-test.eng-ara     22.2   0.530    2100    51336   0.997
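
The BP column is presumably the BLEU brevity penalty, which is 1.000 when the system output is at least as long as the reference and drops below 1 when the output is shorter. The sketch below uses the standard BLEU definition; the function name brevity_penalty and the example lengths are illustrative and are not taken from the evaluation files.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty for corpus-level token counts.

    hyp_len: total number of tokens in the system output
    ref_len: total number of tokens in the reference
    """
    if hyp_len <= 0:
        return 0.0
    if hyp_len >= ref_len:
        return 1.0
    # Output shorter than the reference: penalize exponentially.
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    # Illustrative lengths only, not values from the benchmark above.
    print(brevity_penalty(100, 100))
    print(round(brevity_penalty(90, 100), 3))
```

A value such as 0.943 or 0.997 therefore indicates output slightly shorter than the reference, while 1.000 indicates output of equal or greater length.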