Opus Tatoeba | English -> Chinese

  • dataset: opus
  • model: transformer
  • source language(s): eng
  • target language(s): cjy cmn gan hak hsn lzh nan wuu yue
  • model: transformer
  • pre-processing: normalization + SentencePiece (spm32k,spm32k)
  • a sentence initial language token is required in the form of >>id<< (id = valid target language ID)
  • valid language labels: >>cmn_Hans<< >>cmn_Hant<< >>cmn<< >>yue_Hant<< >>yue_Hans<< >>nan<< >>wuu<<
  • download: opus-2021-02-23.zip
  • test set translations: opus-2021-02-23.test.txt
  • test set scores: opus-2021-02-23.eval.txt

Benchmarks

testset BLEU chr-F #sent #words BP
Tatoeba-test.eng-zho 31.6 0.267 9999 110463 0.911
Downloads last month
18
Safetensors
Model size
0.1B params
Tensor type
F16
ยท
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support