Pukhto/Pashto Open Language Project
Community-led open-source project to make Pashto a first-class language in AI speech and language tooling.
Project Links
- GitHub: https://github.com/Musawer1214/Pukhto_Pashto
- Hugging Face: https://huggingface.co/Musawer14/Pukhto_Pashto
Core Goal
- Build open datasets, benchmarks, and models for Pashto ASR, TTS, and NLP.
- Keep work reproducible, transparent, and contribution-friendly.
- Focus on public good and broad accessibility.
Featured External Dataset
- Common Voice Scripted Speech 24.0 - Pashto
- Source: https://datacollective.mozillafoundation.org/datasets/cmj8u3pnb00llnxxbfvxo3b14
- Project integration guide: docs/common_voice_pashto_24.md
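As a rough illustration of a first integration step, the sketch below reads clip metadata from a tab-separated file. It assumes the Pashto release follows the layout of earlier Common Voice releases, with `path` and `sentence` columns in its metadata files; the helper name `read_clips` and the sample rows are hypothetical, and docs/common_voice_pashto_24.md remains the reference for the actual layout.

```python
import csv
import io


def read_clips(tsv_file):
    """Return (audio path, transcript) pairs from a Common Voice style TSV.

    Assumption: the file has a header row with `path` and `sentence` columns.
    Rows with an empty transcript are skipped.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [
        (row["path"], row["sentence"])
        for row in reader
        if row.get("sentence")
    ]


if __name__ == "__main__":
    # In-memory stand-in for a metadata file; the rows are made up.
    sample = io.StringIO(
        "path\tsentence\n"
        "clip_1.mp3\thello world\n"
        "clip_2.mp3\t\n"
    )
    for path, sentence in read_clips(sample):
        print(path, sentence)
```

For a real download, the same function could be called on a file opened with `encoding="utf-8"` and `newline=""`, so that Pashto text is decoded correctly.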
Contribute Through Mozilla Common Voice
- Speak: https://commonvoice.mozilla.org/ps/speak
- Write: https://commonvoice.mozilla.org/ps/write
- Listen: https://commonvoice.mozilla.org/ps/listen
- Review: https://commonvoice.mozilla.org/ps/review
Community Resource Profiles
- Hugging Face (external Pashto resource profile): https://huggingface.co/ihanif
- Use this profile as a reference point for Pashto ASR/TTS datasets, models, and community experiments.
Start Here
- Purpose: PROJECT_PURPOSE.md
- Contributing: CONTRIBUTING.md
- Roadmap: ROADMAP.md
- Governance: GOVERNANCE.md
- Community coordination: community/COMMUNICATION.md
Initial Workstreams
- data/: Pashto data collection, cleaning, metadata
- asr/: speech-to-text baselines and experiments
- tts/: text-to-speech baselines and experiments
- benchmarks/: fixed test sets and evaluation scripts
- apps/desktop/: app integration references
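To make the benchmarks workstream more concrete, here is a minimal sketch of word error rate (WER), a common metric for ASR evaluation. It is not the project's evaluation script: the function name `word_error_rate` is hypothetical, tokens are split on whitespace only, and no Pashto-specific text normalization is applied, which a real benchmark would likely need.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Placeholder tokens stand in for Pashto words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.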