AI & ML interests
ZamAI, led by Yaqoob Tasal (tasal9), is an open-source Pashto NLP and cultural AI ecosystem. It combines cleaned bilingual datasets, scalable language models, interactive Spaces, and culturally grounded apps such as IslamPaal to support the Pashto language and heritage through community collaboration and multi-platform deployment.
ZamAI
Open AI infrastructure for Pashto and Afghan-language technology.
Mission
ZamAI builds practical AI resources for Pashto, Dari, and Afghan-language workflows: datasets, model checkpoints, evaluation assets, and usable demos. The goal is to make Afghan-language AI easier to inspect, load, improve, and deploy.
Canonical Collections
Use these collections as the clean front door. They separate primary resources from experiments and older work.
- ZamAI Pashto Text Datasets
- ZamAI Pashto Speech Datasets
- ZamAI Pashto Vision OCR Datasets
- ZamAI Pashto Models
Primary Datasets
| Area | Primary repo | Status |
|---|---|---|
| Text | tasal9/ZamAi-Pashto-Datasets-V2 | Ready for HF Dataset Viewer |
| Text training | tasal9/Pashto-Corpus-Training-Ready | Ready with train/validation/test splits |
| Speech | tasal9/zamai-pashto-voice2voice | Audio ready; transcripts need review |
| Vision | ZamAI-Pashto/zamai-pashto-vision | Viewer-ready metadata |
| OCR/Documents | ZamAI-Pashto/zamai-pashto-documents | Viewer-ready metadata |
| Textbooks/PDFs | tasal9/Pashto-Textbooks-PDFs-Corpus | HF preview manifest added |
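As a minimal sketch, the primary repos listed above could be loaded with the Hugging Face `datasets` library. The `PRIMARY_DATASETS` mapping and the `load_primary` helper are illustrative names, not part of ZamAI. The split names `train`, `validation`, and `test` are assumed from the status column, and some repos may need a configuration name or a different loader.

```python
# Repo IDs copied from the Primary Datasets table.
PRIMARY_DATASETS = {
    "Text": "tasal9/ZamAi-Pashto-Datasets-V2",
    "Text training": "tasal9/Pashto-Corpus-Training-Ready",
    "Speech": "tasal9/zamai-pashto-voice2voice",
    "Vision": "ZamAI-Pashto/zamai-pashto-vision",
    "OCR/Documents": "ZamAI-Pashto/zamai-pashto-documents",
    "Textbooks/PDFs": "tasal9/Pashto-Textbooks-PDFs-Corpus",
}


def load_primary(area, split=None):
    """Load the primary dataset for one area (hypothetical helper).

    Needs `pip install datasets` and network access. With split=None the
    library returns every available split; pass e.g. split="validation"
    to get a single one (assumed split name).
    """
    # Imported lazily so the mapping above is usable without the library.
    from datasets import load_dataset

    return load_dataset(PRIMARY_DATASETS[area], split=split)
```

For example, `load_primary("Text training", split="train")` would request only the training split, assuming the repo exposes a split with that name.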
Model Status
The model collection separates loadable checkpoints from metadata-only or incomplete checkpoint repos.
Loadable / usable checkpoints
- tasal9/pashto-base-bloom
- tasal9/ZamAI-LIama3-Pashto
- tasal9/Multilingual-ZamAI-Embeddings
- tasal9/ZamAI-Phi-3-Mini-Pashto
- tasal9/ZamAI-Sentiment-Pashto
- tasal9/ZamAI-mT5-Pashto
Experimental / incomplete
These repos are kept public for transparency, planning, and reproducibility notes, but their cards clearly state if weights or tokenizer files are missing:
- tasal9/ZamAI-Mistral-7B-Pashto
- tasal9/zamai-qwen2.5-7B-instruction
- tasal9/ZamAI-Whisper-v3-Pashto
- tasal9/ZamAI-Pashto-Translator-FacebookNLB-ps-en
- tasal9/ZamAI-QA-Pashto
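One way to tell a loadable repo from an incomplete one is to inspect its file listing. The sketch below is a heuristic, not ZamAI's own tooling: the file-name patterns are assumptions about common Hugging Face layouts, and `missing_artifacts` and `check_repo` are hypothetical helper names. It relies on `list_repo_files` from the `huggingface_hub` library.

```python
# Assumed tokenizer file names; real repos may use other names.
TOKENIZER_FILES = {
    "tokenizer.json",
    "tokenizer.model",
    "spiece.model",
    "vocab.json",
    "sentencepiece.bpe.model",
}


def _is_weight_file(name):
    """Heuristic: safetensors files or pytorch_model*.bin shards."""
    return name.endswith(".safetensors") or (
        name.startswith("pytorch_model") and name.endswith(".bin")
    )


def missing_artifacts(filenames):
    """Return which artifact kinds are absent from a repo file listing."""
    # Compare on base names so files inside subfolders are still recognized.
    basenames = [name.rsplit("/", 1)[-1] for name in filenames]
    missing = []
    if not any(name in TOKENIZER_FILES for name in basenames):
        missing.append("tokenizer")
    if not any(_is_weight_file(name) for name in basenames):
        missing.append("weights")
    return missing


def check_repo(repo_id):
    """Fetch the file listing from the Hub and report missing artifacts.

    Needs `pip install huggingface_hub` and network access.
    """
    from huggingface_hub import list_repo_files

    return missing_artifacts(list_repo_files(repo_id))
```

An empty result only means the expected file names are present; it does not confirm that the checkpoint loads or produces sensible output.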
Spaces
The public Spaces are demos and workbenches for Pashto AI workflows. The Spaces reviewed so far have valid cards, and build issues have been fixed where possible. Some Spaces may sleep or pause to control compute cost.
Contribution Priorities
The highest-impact work right now:
- Correct and validate Pashto speech transcripts.
- Upload missing model weight/tokenizer artifacts for incomplete repos.
- Add lightweight evaluations for primary models.
- Keep canonical collections up to date as repos mature.
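For the transcript priority, a cheap first pass can flag entries that obviously need a human look before anyone listens to the audio. The sketch below is an assumption-laden triage step, not a ZamAI tool: `transcript_issues` is a hypothetical helper, and the 0.8 script-ratio threshold is an arbitrary starting point. It checks only whether a transcript is empty or mostly written outside the Arabic-script Unicode blocks that Pashto uses; it cannot tell whether the text matches the recording.

```python
def is_arabic_script(ch):
    """True for characters in the Arabic and Arabic Supplement blocks."""
    cp = ord(ch)
    return 0x0600 <= cp <= 0x06FF or 0x0750 <= cp <= 0x077F


def transcript_issues(text, min_script_ratio=0.8):
    """Return a list of reasons a transcript should be reviewed by hand."""
    stripped = text.strip()
    if not stripped:
        return ["empty"]

    # Only letters are counted, so digits and punctuation do not skew the ratio.
    letters = [ch for ch in stripped if ch.isalpha()]
    if not letters:
        return ["no letters"]

    ratio = sum(is_arabic_script(ch) for ch in letters) / len(letters)
    if ratio < min_script_ratio:
        return ["low Arabic-script ratio"]
    return []
```

Transcripts that pass this filter still need review by a Pashto speaker; the check only narrows down where to start.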
Contact
- Website: https://zamai.dev
- Hugging Face: https://huggingface.co/ZamAI-Pashto
- Founder: Yaqoob Tasal
ZamAI: practical AI infrastructure for Afghan languages.