musaw
Add validated Pashto resources across datasets models and benchmarks
fb472d7

🧪 Benchmarks

Define fixed test sets, metrics, and leaderboard generation scripts.
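As a rough illustration of what a leaderboard generation script might do, the sketch below ranks result records by a single metric and renders plain-text rows. The function name `build_leaderboard`, the record keys, and the model names are hypothetical, and the scores are placeholders rather than measured results.

```python
from typing import Dict, List


def build_leaderboard(results: List[Dict], metric: str,
                      lower_is_better: bool = True) -> List[str]:
    """Rank result records by one metric and return plain-text rows."""
    # Error rates (WER, CER) rank ascending; scores (BLEU, chrF) rank descending.
    ranked = sorted(results, key=lambda r: r[metric],
                    reverse=not lower_is_better)
    lines = [f"rank | model | {metric}"]
    for rank, row in enumerate(ranked, start=1):
        lines.append(f"{rank} | {row['model']} | {row[metric]:.3f}")
    return lines


# Placeholder records: hypothetical model names, made-up scores.
example_results = [
    {"model": "model-a", "wer": 0.40},
    {"model": "model-b", "wer": 0.25},
]
for line in build_leaderboard(example_results, "wer"):
    print(line)
```

Passing `lower_is_better=False` would suit metrics where a higher value is better, such as BLEU or chrF.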

📦 Result Storage

✅ Verified Benchmark Sources

🌸 FLEURS (Pashto speech benchmark)

📘 Belebele (Pashto reading benchmark)

🌍 FLORES-200 (Pashto translation benchmark)

🗣️ Common Voice Pashto v24

📏 Recommended Metrics

  • ASR: WER, CER
  • TTS: MCD or other objective proxies, plus human MOS-style ratings
  • NLP: task-specific accuracy/F1 on a fixed test set
  • MT: BLEU, chrF, COMET
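For the ASR metrics above, WER and CER are both an edit distance divided by the reference length, computed over words and over characters respectively. The following is a minimal sketch, not this repository's scoring code: the function names are hypothetical, text is assumed to be normalized already, and spaces are counted as characters in CER, which is an assumption that a normalization policy may override.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum substitutions, insertions, deletions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference token
                            curr[j - 1] + 1,      # insert a hypothesis token
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; spaces count as characters (an assumption)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


# Illustrative pair: the hypothesis drops one of four reference words.
print(wer("زه ښوونځي ته ځم", "زه ښوونځي ځم"))  # 0.25
```

Because both rates divide by the reference length, they can exceed 1.0 when the hypothesis contains many insertions.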

🧾 Reporting Template

  • Benchmark dataset + version
  • Model + checkpoint version
  • Normalization policy version
  • Metrics and error analysis summary
  • Reproducible command/config reference
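One possible way to keep reports consistent is to store each one as a structured record with a field per template item. The sketch below is an assumption about how that could look, not a format defined by this repository: the class `BenchmarkReport`, its field names, and every value in the example are hypothetical placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict


@dataclass
class BenchmarkReport:
    """One record per evaluated model, mirroring the reporting template."""
    benchmark: str                 # benchmark dataset name
    benchmark_version: str         # dataset version or release tag
    model: str                     # model name
    checkpoint: str                # checkpoint version or identifier
    normalization_policy: str      # normalization policy version
    metrics: Dict[str, float]      # metric name -> value
    error_analysis: str            # short summary of observed errors
    command: str                   # command or config used to reproduce

    def to_json(self) -> str:
        # ensure_ascii=False keeps Pashto text readable in the output file.
        return json.dumps(asdict(self), ensure_ascii=False,
                          indent=2, sort_keys=True)


# Placeholder values only; none of these are real results or real commands.
report = BenchmarkReport(
    benchmark="<benchmark-name>",
    benchmark_version="<version>",
    model="<model-name>",
    checkpoint="<checkpoint-id>",
    normalization_policy="<policy-version>",
    metrics={"wer": 0.0, "cer": 0.0},
    error_analysis="<summary>",
    command="<command or config path>",
)
print(report.to_json())
```

Sorting keys in the JSON output keeps diffs between report files small when they are committed alongside results.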