Opus Tatoeba | English -> Arabic

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): acm afb apc ara arq ary arz
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence-initial language token is required, in the form >>id<< (id = a valid target language ID)
  • valid language labels: >>ara<< >>ara_Latn<< >>arq_Latn<< >>arq<< >>arz<<
  • download: opus-2021-02-23.zip
  • test set translations: opus-2021-02-23.test.txt
  • test set scores: opus-2021-02-23.eval.txt
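
The language-token requirement above can be illustrated with a minimal sketch. The helper name `add_language_token` is hypothetical and not part of any library, the label set is copied from the "valid language labels" line, and the single space after the token is an assumption based on common OPUS-MT usage.

```python
# Labels copied from the "valid language labels" line of this card.
VALID_LABELS = {"ara", "ara_Latn", "arq_Latn", "arq", "arz"}


def add_language_token(sentence: str, label: str) -> str:
    """Prefix a source sentence with the >>id<< target-language token.

    Hypothetical helper for illustration; the space between the token
    and the sentence is an assumption, not something this card states.
    """
    if label not in VALID_LABELS:
        raise ValueError(f"unknown target language label: {label!r}")
    return f">>{label}<< {sentence.strip()}"


if __name__ == "__main__":
    # Build the input string that would be passed to the tokenizer.
    print(add_language_token("How are you?", "ara"))
```

The prefixed string is what gets tokenized and translated; a label outside the listed set raises an error instead of silently producing an unsupported token.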

Benchmarks

testset               BLEU  chr-F  #sent  #words     BP
Tatoeba-test.eng-acm   3.6  0.202      3      17  1.000
Tatoeba-test.eng-afb  29.8  0.560     36     145  1.000
Tatoeba-test.eng-apc   6.4  0.249      5      18  0.943
Tatoeba-test.eng-ara  14.0  0.437  10000   58935  1.000
Tatoeba-test.eng-arq   0.5  0.155    412    2323  1.000
Tatoeba-test.eng-ary   3.1  0.246     18      53  1.000
Tatoeba-test.eng-arz   2.1  0.249    181     856  1.000
tico19-test.eng-ara   22.2  0.530   2100   51336  0.997
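
The BP column is presumably the BLEU brevity penalty. As a rough sketch, assuming the standard BLEU definition (this card does not say which scoring tool or settings produced the numbers), it can be computed from the total hypothesis and reference lengths. The function name `brevity_penalty` is illustrative only.

```python
import math


def brevity_penalty(hyp_len: int, ref_len: int) -> float:
    """Standard BLEU brevity penalty (assumed definition, not from this card).

    Returns 1.0 when the system output is at least as long as the
    reference, and exp(1 - ref_len / hyp_len) when it is shorter.
    """
    if hyp_len <= 0:
        return 0.0  # empty output: treated as maximally penalized
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


if __name__ == "__main__":
    # A hypothesis 10% shorter than the reference is penalized.
    print(round(brevity_penalty(90, 100), 3))
```

Under this definition, a BP of 1.000 means the output was not shorter than the reference overall, while a value below 1 (as for eng-apc) indicates shorter output.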

Model size: 0.1B params (Safetensors, tensor type F16)