ngusadeep/Swahili-Corpus-Dataset
About 1.69M raw Swahili text samples from news, government, education, and legal domains, suited to LLM pretraining and unsupervised NLP research.
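
For a corpus of this size, streaming a few records is a cheap way to inspect it before committing to a full download. The following is a minimal sketch, not the dataset author's documented usage: it assumes the Hugging Face `datasets` package is installed and that the repository exposes a split named "train". The listing does not state split or column names, so the helpers `open_stream` and `preview` are hypothetical and make no assumption about the fields.

```python
from itertools import islice
from typing import Any, Dict, Iterable, List

# Repository id as given in the listing.
REPO_ID = "ngusadeep/Swahili-Corpus-Dataset"


def open_stream(split: str = "train") -> Iterable[Dict[str, Any]]:
    """Open the corpus lazily. The split name "train" is an assumption."""
    # Imported here so that preview() stays usable without `datasets` installed.
    from datasets import load_dataset

    # streaming=True yields records on demand instead of downloading everything.
    return load_dataset(REPO_ID, split=split, streaming=True)


def preview(records: Iterable[Dict[str, Any]], n: int = 3) -> List[Dict[str, Any]]:
    """Return the first n records, reading no more than n from the stream."""
    return list(islice(records, n))
```

Calling `preview(open_stream())` would fetch the first three records over the network; printing the keys of one record is a quick way to learn the actual column names before writing a preprocessing pipeline.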