whisper-small-kamba-model

This model is a fine-tuned version of Musembi/whisper-small-kamba-model on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.1528
Wer: 76.92
Cer: 25.02
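
The card does not say how Wer and Cer were computed. The sketch below assumes the usual definitions: edit distance between reference and hypothesis, divided by the reference length and expressed as a percentage, over words for Wer and over characters for Cer. The function names and the English example strings are illustrative only and are not taken from the evaluation set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference token
                curr[j - 1] + 1,     # insert a hypothesis token
                prev[j - 1] + cost,  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumed definition, see lead-in)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, counting spaces as characters."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


# Hypothetical transcripts, for illustration only.
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"
print(f"Wer: {wer(reference, hypothesis):.2f}")
print(f"Cer: {cer(reference, hypothesis):.2f}")
```

Because Wer counts a whole word as wrong for a single wrong character, it can be much higher than Cer, as it is in the results above.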

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-06
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 50
training_steps: 400
mixed_precision_training: Native AMP
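
As a rough illustration of how these values relate, the sketch below recomputes total_train_batch_size and the learning rate implied by a linear schedule with warmup. It assumes a single training device (not stated in the card, but consistent with 8 x 2 = 16) and the common rule of linear warmup from zero followed by linear decay to zero at the last step. The function names are hypothetical and this is not the training code.

```python
LEARNING_RATE = 5e-06
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 50
TRAINING_STEPS = 400
NUM_DEVICES = 1  # assumption: the card does not state the device count


def total_train_batch_size():
    """Examples consumed per optimizer update."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def linear_lr(step):
    """Learning rate at a given optimizer step under the assumed schedule."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to the peak learning rate.
        return LEARNING_RATE * step / WARMUP_STEPS
    # Decay: fall linearly from the peak to 0 at TRAINING_STEPS.
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)


print("total_train_batch_size:", total_train_batch_size())
for step in (0, 25, 50, 225, 400):
    print(f"step {step:3d}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the peak rate of 5e-06 is reached only at step 50, so every later checkpoint in the results table is trained at a lower rate than the peak.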

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
1.8431	0.2439	50	1.2481	75.99	25.19
1.9287	0.4878	100	1.2040	80.08	26.06
1.8232	0.7317	150	1.1776	80.12	25.40
1.7154	0.9756	200	1.1629	79.87	25.98
1.5737	1.2195	250	1.1575	80.37	26.03
1.5427	1.4634	300	1.1532	81.00	26.61
1.5164	1.7073	350	1.1488	81.07	25.82
1.5292	1.9512	400	1.1476	81.76	26.18
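
The table can be read programmatically to see which checkpoint is best under each metric and to estimate the size of the training set. The sketch below copies the rows above into a list. The steps-per-epoch and example-count figures are estimates derived from the Epoch column and total_train_batch_size of 16, not values stated in the card, and the variable names are illustrative.

```python
# (step, epoch, training_loss, validation_loss, wer, cer), copied from the table.
ROWS = [
    (50, 0.2439, 1.8431, 1.2481, 75.99, 25.19),
    (100, 0.4878, 1.9287, 1.2040, 80.08, 26.06),
    (150, 0.7317, 1.8232, 1.1776, 80.12, 25.40),
    (200, 0.9756, 1.7154, 1.1629, 79.87, 25.98),
    (250, 1.2195, 1.5737, 1.1575, 80.37, 26.03),
    (300, 1.4634, 1.5427, 1.1532, 81.00, 26.61),
    (350, 1.7073, 1.5164, 1.1488, 81.07, 25.82),
    (400, 1.9512, 1.5292, 1.1476, 81.76, 26.18),
]
TOTAL_TRAIN_BATCH_SIZE = 16

# Checkpoint with the lowest value of each evaluation metric.
best_by_loss = min(ROWS, key=lambda row: row[3])
best_by_wer = min(ROWS, key=lambda row: row[4])
best_by_cer = min(ROWS, key=lambda row: row[5])

# Estimate optimizer steps per epoch from the first row (step / epoch).
steps_per_epoch = round(ROWS[0][0] / ROWS[0][1])
# Rough training-set size; a partial final batch would make the true count slightly lower.
approx_train_examples = steps_per_epoch * TOTAL_TRAIN_BATCH_SIZE

print("lowest validation loss at step", best_by_loss[0])
print("lowest Wer at step", best_by_wer[0])
print("lowest Cer at step", best_by_cer[0])
print("estimated steps per epoch:", steps_per_epoch)
print("estimated training examples:", approx_train_examples)
```

Validation loss falls steadily through step 400, while Wer and Cer are lowest at step 50, so the choice of checkpoint depends on which metric matters for the intended use. None of the logged rows matches the summary figures at the top of the card exactly, and the card does not explain the difference.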

Framework versions

Transformers 5.0.0
PyTorch 2.10.0+cu128
Datasets 2.21.0
Tokenizers 0.22.2

Safetensors

Model size: 0.2B params
Tensor type: F32
