namphungdn134/whisper-base-vi

Automatic Speech Recognition

Generated from Trainer

namphungdn134 committed on Apr 16, 2025

Commit 83937ae · 1 Parent(s): 33d759a

Update README.md

Files changed (1)

README.md +3 -13

@@ -10,8 +10,6 @@ tags:
 - audio2text
 - S2T
 - STT
-datasets:
-- doof-ferb/vlsp2020_vinai_100h
 metrics:
 - wer
 model-index:
@@ -32,15 +30,7 @@ This is a fine-tuned version of [openai/whisper-base](https://huggingface.co/ope
 ## 📊 Fine-tuning Results
-- **Loss**: 0.4049
-- **Word Error Rate (WER)**: 20.3964
-| Training Loss | Epoch  | Step | Validation Loss | Wer     |
-|:-------------:|:------:|:----:|:---------------:|:-------:|
-| 0.5199        | 0.5967 | 1000 | 0.5043          | 25.5525 |
-| 0.3967        | 1.1933 | 2000 | 0.4336          | 21.2506 |
-| 0.3459        | 1.7900 | 3000 | 0.4086          | 20.7572 |
-| 0.3208        | 2.0883 | 3500 | 0.4049          | 20.3964 |
+- **Word Error Rate (WER)**: 16.9148
 > Evaluation was performed on a held-out test set with diverse regional accents and speaking styles.
@@ -52,8 +42,8 @@ This model works with the WhisperProcessor to pre-process audio inputs into log-
 ## 📁 Dataset
-- Total Duration: 100 hours of high-quality Vietnamese speech data
-- Sources: Public Vietnamese datasets including the [vlsp2020_vinai_100h](https://huggingface.co/doof-ferb/vlsp2020_vinai_100h) dataset
+- Total Duration: More 100 hours of high-quality Vietnamese speech data
+- Sources: Public Vietnamese datasets
 - Format: 16kHz WAV files with corresponding text transcripts
 - Preprocessing: Audio was normalized and segmented. Transcripts were cleaned and tokenized.
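
Note on the metric: the WER values in this diff (20.3964 removed, 16.9148 added) look like percentages, which is how Trainer-generated model cards commonly report the metric. That reading is an assumption, and the helper name `word_error_rate` below is hypothetical, not taken from the repository. A minimal sketch of how such a figure can be computed from word-level edit distance:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# One dropped word out of four reference words.
print(word_error_rate("xin chào các bạn", "xin chào bạn"))
```

This sketch splits on whitespace only and does no text normalization, so casing and punctuation differences count as errors. The numbers in the model card may have been produced with different normalization.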