seniruk/whisper-small-si-cpu

Automatic Speech Recognition


seniruk committed on Oct 24, 2025

Commit 5aafc5a · verified · 1 Parent(s): 384f5be

Update README.md

Files changed (1)

README.md +3 -5


@@ -15,6 +15,8 @@ metrics:
 model-index:
 - name: Sinscribe
   results: []
+datasets:
+- seniruk/sinscribe-sinhala-stt
 ---
 # Hi, I’m Seniru Epasinghe 👋
@@ -43,11 +45,7 @@ It achieves the following results on the evaluation set:
 Can be used for Sinhala speech to text conversions. Make sure to input noise low audio to the model, to get the best outcome.
 ## Training and evaluation data
-Trained on custom dataset made by preprocessing, cleaning and combining below datasets -> Final Model ready dataset with 161296 rows
-- [Multi speaket TTS dataset - Sinhala](https://www.kaggle.com/datasets/keshan/multi-speaket-tts-dataset-sinhala)
-- [Large Sinhala ASR training dataset](https://www.kaggle.com/datasets/keshan/large-sinhala-asr-training-dataset)
-- [sinhala-tts-dataset](https://github.com/pnfo/sinhala-tts-dataset)
+Trained on the custom dataset - [seniruk/sinscribe-sinhala-stt](https://huggingface.co/datasets/seniruk/sinscribe-sinhala-stt)
 Trained on above final dataset with 2 epochs on a device with below spec for 41:00:59 hours
 - 16GB RAM