ZamAI


AI & ML interests

ZamAI, led by Yaqoob Tasal (tasal9), is an open-source Pashto NLP and cultural AI ecosystem. It combines cleaned bilingual datasets, scalable language models, interactive Spaces, and culturally grounded apps such as IslamPaal to support the Pashto language and heritage through community collaboration and multi-platform deployment. For support, click the link below.



Open AI infrastructure for Pashto and Afghan-language technology.

Datasets · Models · Spaces · Website


Mission

ZamAI builds practical AI resources for Pashto, Dari, and Afghan-language workflows: datasets, model checkpoints, evaluation assets, and usable demos. The goal is to make Afghan-language AI easier to inspect, load, improve, and deploy.


Canonical Collections

Use these collections as the clean front door. They separate primary resources from experiments and older work.


Primary Datasets

Area             Primary repo                           Status
Text             tasal9/ZamAi-Pashto-Datasets-V2        Ready for HF Dataset Viewer
Text training    tasal9/Pashto-Corpus-Training-Ready    Ready with train/validation/test splits
Speech           tasal9/zamai-pashto-voice2voice        Audio ready; transcripts need review
Vision           ZamAI-Pashto/zamai-pashto-vision       Viewer-ready metadata
OCR/Documents    ZamAI-Pashto/zamai-pashto-documents    Viewer-ready metadata
Textbooks/PDFs   tasal9/Pashto-Textbooks-PDFs-Corpus    HF preview manifest added
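
As a rough illustration of how the primary repos listed above might be loaded, here is a minimal sketch using the `datasets` library. The area keys and the helper names `resolve_repo` and `load_primary` are hypothetical labels chosen for this example. The sketch also assumes each repo loads with its default configuration and exposes the requested split, which the dataset cards should confirm.

```python
# Repo IDs are copied from the Primary Datasets table. The area keys are
# labels invented for this sketch, not official identifiers.
PRIMARY_DATASETS = {
    "text": "tasal9/ZamAi-Pashto-Datasets-V2",
    "text-training": "tasal9/Pashto-Corpus-Training-Ready",
    "speech": "tasal9/zamai-pashto-voice2voice",
    "vision": "ZamAI-Pashto/zamai-pashto-vision",
    "documents": "ZamAI-Pashto/zamai-pashto-documents",
    "textbooks": "tasal9/Pashto-Textbooks-PDFs-Corpus",
}


def resolve_repo(area: str) -> str:
    """Map an area label (case-insensitive) to its primary repo ID."""
    key = area.strip().lower()
    if key not in PRIMARY_DATASETS:
        raise KeyError(
            f"unknown area {area!r}; expected one of {sorted(PRIMARY_DATASETS)}"
        )
    return PRIMARY_DATASETS[key]


def load_primary(area: str, split: str = "train", streaming: bool = True):
    """Load one split of a primary dataset from the Hugging Face Hub.

    Assumption: the repo works with its default configuration and has the
    requested split. Streaming avoids downloading the whole dataset first.
    """
    # Imported lazily so the lookup helper works without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(resolve_repo(area), split=split, streaming=streaming)
```

With network access and `datasets` installed, `next(iter(load_primary("text-training", split="validation")))` would be one way to peek at a single record. The column names are not documented here, so inspect the record before relying on any field.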

Model Status

The model collection separates loadable checkpoints from metadata-only or incomplete checkpoint repos.
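
One way to picture this separation is a small check over a repo's file listing. The sketch below is illustrative only: the function name `classify_checkpoint`, the three status labels, and the file-name conventions are assumptions based on common Hugging Face checkpoint layouts, not rules stated by ZamAI.

```python
from typing import Iterable

# Assumed tokenizer file names, taken from common Hugging Face layouts.
TOKENIZER_FILES = {
    "tokenizer.json",
    "tokenizer.model",
    "vocab.json",
    "vocab.txt",
    "spiece.model",
}


def _is_weight_file(name: str) -> bool:
    """Heuristic: safetensors files, or PyTorch files named pytorch_model*.bin."""
    if name.endswith(".safetensors"):
        return True
    return name.startswith("pytorch_model") and name.endswith(".bin")


def classify_checkpoint(files: Iterable[str]) -> str:
    """Return 'loadable', 'incomplete', or 'metadata-only' for a file listing."""
    # Compare base names only, so files in subfolders are still recognised.
    names = {path.rsplit("/", 1)[-1] for path in files}
    has_config = "config.json" in names
    has_weights = any(_is_weight_file(n) for n in names)
    has_tokenizer = bool(names & TOKENIZER_FILES)

    if has_config and has_weights and has_tokenizer:
        return "loadable"
    if has_weights or has_tokenizer:
        return "incomplete"
    return "metadata-only"
```

The listing could come from `huggingface_hub.list_repo_files(repo_id)`, which returns the file paths in a repo. A heuristic like this only flags candidates. Whether a checkpoint actually loads still has to be confirmed by loading it.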

Loadable / usable checkpoints

Experimental / incomplete

These repos remain public for transparency, planning, and reproducibility notes. Their cards state whether weights or tokenizer files are missing:


Spaces

The public Spaces are demos and workbenches for Pashto AI workflows. The Spaces reviewed so far have valid cards, and build issues have been fixed where possible. Some Spaces may sleep or pause to control compute cost.


Contribution Priorities

The highest-impact work right now:

  1. Correct and validate Pashto speech transcripts.
  2. Upload missing model weight/tokenizer artifacts for incomplete repos.
  3. Add lightweight evaluations for primary models.
  4. Keep canonical collections up to date as repos mature.
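
For the first priority, a cheap automated pre-check can help reviewers decide which transcripts to look at first. The sketch below is not ZamAI's review process. The function names, the flag labels, and the 0.8 threshold are assumptions made for illustration. It flags transcripts that are empty, that contain mostly non-Arabic-script letters, or that contain none of eight letters characteristic of Pashto orthography. Short valid Pashto sentences can lack those letters, so the last flag is only a hint for human review.

```python
import unicodedata

# Eight letters characteristic of Pashto orthography, written as escapes:
# U+067C, U+0681, U+0685, U+0689, U+0693, U+0696, U+069A, U+06BC.
PASHTO_LETTERS = set("\u067c\u0681\u0685\u0689\u0693\u0696\u069a\u06bc")


def arabic_script_ratio(text: str) -> float:
    """Share of alphabetic characters that fall in the Arabic block U+0600-U+06FF."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    in_block = sum(1 for ch in letters if 0x0600 <= ord(ch) <= 0x06FF)
    return in_block / len(letters)


def review_flags(transcript: str, min_ratio: float = 0.8) -> list:
    """Return a list of reasons a transcript may need manual review."""
    text = unicodedata.normalize("NFC", transcript).strip()
    if not text:
        return ["empty"]

    flags = []
    if arabic_script_ratio(text) < min_ratio:
        flags.append("low-arabic-script-ratio")
    if not any(ch in PASHTO_LETTERS for ch in text):
        flags.append("no-pashto-specific-letters")
    return flags
```

An empty list means the transcript passed these surface checks, not that it matches the audio. Correctness against the recording still needs a fluent reviewer.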

Contact

ZamAI: practical AI infrastructure for Afghan languages.
