metadata
library_name: transformers
language:
  - ar
license: apache-2.0
base_model: openai/whisper-base
tags:
  - generated_from_trainer
metrics:
  - wer
model-index:
  - name: Whisper base AR - BA
    results: []

Whisper base AR - BA

This model is a fine-tuned version of openai/whisper-base on the quran-ayat-speech-to-text dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1094
  • Wer: 0.3085
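For readers unfamiliar with the metric, word error rate (WER) is the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal pure-Python illustration, not the evaluation code used for this model. The function name `word_error_rate` is hypothetical, and the card does not state what text normalization was applied, whether the score was pooled over the whole evaluation set, or whether the reported value is a fraction or a percentage.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One of four reference words is dropped, so the WER is 1 / 4.
print(word_error_rate("one two three four", "one two four"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.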

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 15
  • mixed_precision_training: Native AMP
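Two of the values above follow from the others, and a short sketch makes the arithmetic explicit. It assumes a single training device (consistent with 8 × 4 = 32) and takes the total step count of 8925 from the last row of the training results table. The helper names `effective_batch_size` and `linear_lr` are hypothetical, and the schedule is the usual linear warmup followed by linear decay to zero, which is what a `linear` scheduler type with warmup steps normally means.

```python
LEARNING_RATE = 1e-4   # learning_rate
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps
TOTAL_STEPS = 8925     # assumption: final step listed in the training results


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples contributing to each optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, peak_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS,
              total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay back to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(8, 4))   # train_batch_size * gradient_accumulation_steps
for step in (0, 250, 500, 4712, 8925):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at step 500, a little before the end of the first epoch, and reaches zero at the final step.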

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|---------------|---------|------|-----------------|--------|
| 3.6308        | 0.9987  | 595  | 0.1030          | 0.2829 |
| 2.653         | 1.9987  | 1190 | 0.1033          | 0.2764 |
| 2.0618        | 2.9987  | 1785 | 0.1053          | 0.2763 |
| 1.6073        | 3.9987  | 2380 | 0.1087          | 0.3029 |
| 1.5376        | 4.9987  | 2975 | 0.1034          | 0.3090 |
| 1.2236        | 5.9987  | 3570 | 0.1001          | 0.2902 |
| 1.0811        | 6.9987  | 4165 | 0.1010          | 0.2768 |
| 1.0269        | 7.9987  | 4760 | 0.1003          | 0.3130 |
| 0.9971        | 8.9987  | 5355 | 0.0991          | 0.2864 |
| 0.9032        | 9.9987  | 5950 | 0.0996          | 0.3194 |
| 0.7539        | 10.9987 | 6545 | 0.1008          | 0.2734 |
| 0.7127        | 11.9987 | 7140 | 0.0985          | 0.2832 |
| 0.7146        | 12.9987 | 7735 | 0.0993          | 0.3050 |
| 0.6478        | 13.9987 | 8330 | 0.0994          | 0.2921 |
| 0.624         | 14.9987 | 8925 | 0.0997          | 0.2841 |
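In the results above, validation loss stays close to 0.10 throughout while Wer moves between roughly 0.27 and 0.32, so the final checkpoint is not necessarily the best one. The sketch below copies the rows into Python and picks the checkpoint with the lowest value of a chosen metric. The names `Checkpoint` and `best_by` are hypothetical, and the card does not say which checkpoint the published weights correspond to.

```python
from typing import NamedTuple


class Checkpoint(NamedTuple):
    training_loss: float
    epoch: float
    step: int
    validation_loss: float
    wer: float


# Rows copied verbatim from the training results.
RESULTS = [
    Checkpoint(3.6308, 0.9987, 595, 0.1030, 0.2829),
    Checkpoint(2.653, 1.9987, 1190, 0.1033, 0.2764),
    Checkpoint(2.0618, 2.9987, 1785, 0.1053, 0.2763),
    Checkpoint(1.6073, 3.9987, 2380, 0.1087, 0.3029),
    Checkpoint(1.5376, 4.9987, 2975, 0.1034, 0.3090),
    Checkpoint(1.2236, 5.9987, 3570, 0.1001, 0.2902),
    Checkpoint(1.0811, 6.9987, 4165, 0.1010, 0.2768),
    Checkpoint(1.0269, 7.9987, 4760, 0.1003, 0.3130),
    Checkpoint(0.9971, 8.9987, 5355, 0.0991, 0.2864),
    Checkpoint(0.9032, 9.9987, 5950, 0.0996, 0.3194),
    Checkpoint(0.7539, 10.9987, 6545, 0.1008, 0.2734),
    Checkpoint(0.7127, 11.9987, 7140, 0.0985, 0.2832),
    Checkpoint(0.7146, 12.9987, 7735, 0.0993, 0.3050),
    Checkpoint(0.6478, 13.9987, 8330, 0.0994, 0.2921),
    Checkpoint(0.624, 14.9987, 8925, 0.0997, 0.2841),
]


def best_by(rows: list[Checkpoint], metric: str) -> Checkpoint:
    """Return the row with the smallest value of the named metric."""
    return min(rows, key=lambda row: getattr(row, metric))


best_wer = best_by(RESULTS, "wer")
best_loss = best_by(RESULTS, "validation_loss")
print(f"lowest Wer: {best_wer.wer} at step {best_wer.step}")
print(f"lowest validation loss: {best_loss.validation_loss} at step {best_loss.step}")
```

The two criteria select different checkpoints here (step 6545 for Wer, step 7140 for validation loss), which is worth keeping in mind when choosing a metric for checkpoint selection.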

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.1