Arabic-English translation benchmark: MSA vs dialectal performance

#10
by O96a - opened

Ran a quick benchmark on OPUS-MT Arabic-English with 9 test cases across formal, technical, and dialectal inputs.
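For anyone wanting to reproduce this, the timing loop is straightforward. A minimal sketch (the `benchmark` helper and the stub below are mine, not the exact harness used; in the real run `translate_fn` would wrap the `Helsinki-NLP/opus-mt-ar-en` checkpoint via the `transformers` translation pipeline):

```python
import time

def benchmark(translate_fn, test_cases):
    """Time each translation and collect (source, output, latency) triples.

    translate_fn is any callable str -> str. With transformers it would be
    something like:
        from transformers import pipeline
        mt = pipeline("translation", model="Helsinki-NLP/opus-mt-ar-en")
        translate_fn = lambda s: mt(s)[0]["translation_text"]
    """
    results = []
    for source in test_cases:
        start = time.perf_counter()
        output = translate_fn(source)
        latency = time.perf_counter() - start
        results.append({"source": source, "output": output, "latency_s": latency})
    return results
```

Swapping in a different model (e.g. for an NLLB-200 comparison) only means changing `translate_fn`.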

Key findings:

  • MSA (Modern Standard Arabic): Strong performance, 3–14s latency
  • Technical content: Handles ML/API terminology well, preserves code-switching
  • Dialectal Arabic: Significant truncation. Egyptian "إزيك؟ كله تمام؟" ("How are you? All good?") was reduced to "I was gonna ask you something", missing the greeting entirely
  • Sudanese "يا زول" (roughly "hey, man") came out as "Hey, Zol", transliterating the term instead of translating it
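The truncation cases can be flagged cheaply with a length-ratio check between source and output. A sketch (the 0.5 threshold is my rough guess, not a tuned value):

```python
def looks_truncated(source: str, output: str, min_ratio: float = 0.5) -> bool:
    """Flag translations suspiciously short relative to their source.

    English renderings of Arabic are usually comparable in character
    length, so an output far shorter than the source hints at dropped
    content. min_ratio=0.5 is an arbitrary assumption to tune on real data.
    """
    if not source:
        return False
    return len(output) / len(source) < min_ratio
```

Running this over the benchmark outputs would surface the dialectal failures without manual inspection.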

Latency range: 0.4s (simple inputs) to 13.5s (technical sentences)

For production Arabic NLP pipelines, OPUS-MT works well for MSA out of the box, but a dialect-to-MSA preprocessing step before translation would improve dialectal coverage.
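The simplest form of that preprocessing is a lexicon-based rewrite of known dialectal phrases into MSA before the text hits the model. A sketch, where the two mappings are illustrative examples I'm adding, not a vetted dialect lexicon:

```python
# Hypothetical dialect-to-MSA lexicon; entries here are illustrative only.
DIALECT_TO_MSA = {
    "إزيك": "كيف حالك",  # Egyptian "how are you" -> MSA
    "يا زول": "يا رجل",   # Sudanese "hey man" -> MSA equivalent
}

def normalize_dialect(text: str) -> str:
    """Replace known dialectal phrases with MSA equivalents.

    Longest phrases are applied first so multi-word entries win
    over single-word substrings.
    """
    for phrase in sorted(DIALECT_TO_MSA, key=len, reverse=True):
        text = text.replace(phrase, DIALECT_TO_MSA[phrase])
    return text
```

A real pipeline would want morphology-aware matching rather than plain substring replacement, but even a small lexicon like this would have rescued both failure cases above.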

Has anyone tested this against NLLB-200 for Arabic dialect coverage?
