library_name: transformers
license: apache-2.0
base_model: openai/whisper-tiny
datasets:
  - PhanithLIM/ams-speech-dataset
  - openslr/openslr
  - google/fleurs
  - PhanithLIM/kh-wmc
  - PhanithLIM/wmc-international-news
  - PhanithLIM/rfi-news-dataset
  - PhanithLIM/aakanee-kh
  - rinabuoy/khm-asr-open
  - seanghay/khmer_grkpp_speech
  - seanghay/khmer_mpwt_speech
  - seanghay/km-speech-corpus
model-index:
  - name: Khmer Whisper Small PhanithLIM
    results:
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: Google Fleurs
          type: google/fleurs
          config: km_kh
          split: test
        metrics:
          - name: CER
            type: cer
            value: 22.511
      - task:
          name: Automatic Speech Recognition
          type: automatic-speech-recognition
        dataset:
          name: WMC
          type: PhanithLIM/asr-wmc-evaluate
          split: test
        metrics:
          - name: CER
            type: cer
            value: 12.581
tags:
  - generated_from_trainer
metrics:
  - wer

whisper-tiny-aug-7-may-lightning-v1

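This model is a fine-tuned version of openai/whisper-tiny for Khmer automatic speech recognition. The training run did not record a dataset name; the datasets associated with the model are listed in the metadata. It achieves the following results on the evaluation set: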

Loss: 0.1300
Wer: 86.2590
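The card reports WER here and CER in its metadata, and the two differ widely. One plausible reason is that Khmer is usually written without spaces between words, so a whitespace-based WER treats long spans as single tokens and one wrong character can invalidate all of them. The sketch below is a simplified, self-contained version of both metrics (not the evaluation script used for this model) and uses Latin placeholder strings rather than real transcripts:

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent, splitting on whitespace."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Unsegmented text: one wrong character costs the whole "word".
unsegmented_wer = wer("abcdefgh", "abcdefgx")  # 100.0
unsegmented_cer = cer("abcdefgh", "abcdefgx")  # 12.5

# Insertions can push WER above 100%, as in the first training epoch.
inflated_wer = wer("a b", "a x y z")           # 150.0
```

Under this reading, CER is likely the more informative number for this model, but that interpretation is an assumption and not something the card states.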

Model description

More information needed

Intended uses & limitations

More information needed
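Until this section is filled in, the following is a minimal inference sketch using the Transformers `pipeline` API. The repository id is a hypothetical placeholder (the card does not state the full namespace), and forcing the language to Khmer through `generate_kwargs` is an assumption about how the model is meant to be used:

```python
# Hypothetical repository id; replace the namespace with the real one.
MODEL_ID = "your-namespace/whisper-tiny-aug-7-may-lightning-v1"


def transcribe(audio_path, model_id=MODEL_ID):
    """Transcribe one audio file and return the predicted text."""
    # Imported lazily so the module loads even without transformers installed.
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(
        audio_path,
        # Assumption: decode as Khmer transcription rather than auto-detecting.
        generate_kwargs={"language": "khmer", "task": "transcribe"},
    )
    return result["text"]
```

A call such as `transcribe("sample.wav")` downloads the checkpoint on first use, where `sample.wav` stands in for any local audio file.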

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 64
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: constant
lr_scheduler_warmup_steps: 1000
num_epochs: 10
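As a rough guide to reproducing this configuration, the values above can be mapped onto `Seq2SeqTrainingArguments`. This is a sketch under assumptions: the card only says the model was generated from the Trainer, so the choice of `Seq2SeqTrainingArguments`, the output directory, and the single-device setup are not confirmed by the source.

```python
# Hyperparameters from the list above, keyed by their TrainingArguments names.
HPARAMS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 32,
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "warmup_steps": 1000,
    "num_train_epochs": 10,
}


def effective_batch_size(hparams, num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return (hparams["per_device_train_batch_size"]
            * hparams["gradient_accumulation_steps"]
            * num_devices)


def build_training_args(output_dir="./whisper-tiny-khmer"):
    """Build training arguments; output_dir is a hypothetical path."""
    # Imported lazily so the arithmetic above works without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HPARAMS)


total_train_batch_size = effective_batch_size(HPARAMS)  # 32 * 2 * 1 = 64
```

The computed value matches the reported total_train_batch_size of 64, which is consistent with a single device. Also note that the plain `constant` schedule in Transformers is understood not to apply warmup (`constant_with_warmup` is the variant that does), so the 1000 warmup steps may have had no effect; this is worth checking against the installed version.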

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
1.0747	1.0	712	0.4463	102.0236
0.3496	2.0	1424	0.2607	98.4686
0.2411	3.0	2136	0.2071	92.8878
0.1966	4.0	2848	0.1819	94.1085
0.1699	5.0	3560	0.1653	92.2555
0.1514	6.0	4272	0.1533	88.5561
0.1377	7.0	4984	0.1452	88.0289
0.1265	8.0	5696	0.1391	86.8913
0.1170	9.0	6408	0.1331	87.4382
0.1089	10.0	7120	0.1300	86.2590
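The table can be read programmatically to pick a checkpoint. The sketch below copies the rows as data and checks three things: which epoch has the lowest WER, where WER got worse from one epoch to the next, and whether validation loss fell every epoch. The tuple layout is an illustrative choice, not a format produced by the Trainer.

```python
# (epoch, step, training_loss, validation_loss, wer), copied from the table.
RESULTS = [
    (1, 712, 1.0747, 0.4463, 102.0236),
    (2, 1424, 0.3496, 0.2607, 98.4686),
    (3, 2136, 0.2411, 0.2071, 92.8878),
    (4, 2848, 0.1966, 0.1819, 94.1085),
    (5, 3560, 0.1699, 0.1653, 92.2555),
    (6, 4272, 0.1514, 0.1533, 88.5561),
    (7, 4984, 0.1377, 0.1452, 88.0289),
    (8, 5696, 0.1265, 0.1391, 86.8913),
    (9, 6408, 0.1170, 0.1331, 87.4382),
    (10, 7120, 0.1089, 0.1300, 86.2590),
]

# Epoch with the lowest WER.
best_epoch = min(RESULTS, key=lambda row: row[4])[0]

# Epochs where WER rose compared with the previous epoch.
wer_regressions = [
    curr[0] for prev, curr in zip(RESULTS, RESULTS[1:]) if curr[4] > prev[4]
]

# Whether validation loss decreased at every epoch.
loss_always_improved = all(
    curr[3] < prev[3] for prev, curr in zip(RESULTS, RESULTS[1:])
)

# Optimizer steps per epoch, taken from the first row.
steps_per_epoch = RESULTS[0][1]
```

Here the final epoch is the best by WER, while WER rose at epochs 4 and 9 even though validation loss kept falling, so selecting checkpoints on loss alone would not always agree with selecting on WER.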

Framework versions

Transformers 4.51.3
PyTorch 2.7.0+cu128
Datasets 3.5.1
Tokenizers 0.21.1