• Updated • 487k • 72
• 3
Note This dataset is mostly crawled from the Sinhala newspaper Lankadeepa.
Pamzyy/translated_dataset
• Updated • 153 • 2
Note This dataset was translated by me.
ihalage/sinhala-finetune-qa-eli5
• Updated • 10k • 11
• 2
Note This is not a very good dataset; it seems to have been translated.
CohereLabs/aya_collection_language_split
• Updated • 514M • 10.9k
• 119
NLPC-UOM/Sinhala-News-Category-classification
• Updated • 3.33k • 463
• 1
NLPC-UOM/Sinhala-News-Source-classification
• Updated • 24.1k • 213
Hamza-Ziyard/CNN-Daily-Mail-Sinhala
• Updated • 10k • 36
• 3
Note This is a news dataset, but not a very good one.
9wimu9/sinhala_dataset_59m
• Updated • 59.5M • 57
• 2
Note This is a human-curated raw text dataset.
9wimu9/sinhala_sentences_raw
• Updated • 1.12k • 29
• 1
Note This dataset is mostly translated, but it is not bad.
9wimu9/sinhala_dataset_sanitized
• Updated • 1.11k • 23
9wimu9/ada_derana_sinhala
• Updated • 170k • 6
• 1
Suchinthana/Sinhala-QA-Translate
• Updated • 1.02k • 40
• 2
Suchinthana/databricks-dolly-15k-sinhala
• Updated • 15k • 8
• 2
Thimira/sinhala-llm-dataset-llama-prompt-format
• Updated • 262k • 12
• 1