w2v2-lmk_org_para

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the audiofolder dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1860
  • Wer: 0.6934
  • Cer: 0.2422
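WER and CER are both Levenshtein edit-distance rates, computed over words and characters respectively. A minimal pure-Python sketch of how these metrics are defined (libraries such as `jiwer` or `evaluate` are what is typically used in practice; this is just an illustration):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (dynamic programming)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (r != h)))    # substitution
        prev = cur
    return prev[-1]

def wer(ref, hyp):
    """Word error rate: word-level edits divided by reference word count."""
    r, h = ref.split(), hyp.split()
    return edit_distance(r, h) / len(r)

def cer(ref, hyp):
    """Character error rate: character-level edits divided by reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)
```

So a WER of 0.6934 means roughly 69% of reference words required an edit; the much lower CER (0.2422) indicates many errors are near-misses within words.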

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 300
  • num_epochs: 100
  • mixed_precision_training: Native AMP
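With gradient accumulation, the effective batch size is train_batch_size × gradient_accumulation_steps = 8 × 2 = 16, and the linear scheduler ramps the learning rate up over 300 steps before decaying it to zero. A small sketch of that schedule (the 1600 total steps are taken from the training log; the exact Trainer implementation may differ in edge cases):

```python
LEARNING_RATE = 1e-4   # learning_rate from the hyperparameters above
WARMUP_STEPS = 300     # lr_scheduler_warmup_steps
TOTAL_STEPS = 1600     # final optimizer step in the training log

def linear_schedule_lr(step):
    """Linear warmup to the peak LR, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    return LEARNING_RATE * max(0, TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)

# Effective batch size seen by each optimizer step
effective_batch = 8 * 2  # train_batch_size * gradient_accumulation_steps
```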

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    | Cer    |
|:-------------:|:-------:|:----:|:---------------:|:------:|:------:|
| 9.5202        | 6.2581  | 100  | 4.3511          | 1.0    | 1.0    |
| 3.2449        | 12.5161 | 200  | 2.9976          | 1.0    | 1.0    |
| 3.0171        | 18.7742 | 300  | 2.8768          | 1.0    | 1.0    |
| 2.9274        | 25.0    | 400  | 2.7959          | 1.0    | 1.0    |
| 2.6596        | 31.2581 | 500  | 2.3278          | 1.0    | 0.9025 |
| 1.9635        | 37.5161 | 600  | 1.5400          | 0.9617 | 0.4646 |
| 1.4694        | 43.7742 | 700  | 1.3102          | 0.8258 | 0.3343 |
| 1.1617        | 50.0    | 800  | 1.1932          | 0.8188 | 0.3092 |
| 0.9487        | 56.2581 | 900  | 1.2371          | 0.7700 | 0.2833 |
| 0.7926        | 62.5161 | 1000 | 1.2255          | 0.7213 | 0.2589 |
| 0.6905        | 68.7742 | 1100 | 1.1964          | 0.6864 | 0.2567 |
| 0.6536        | 75.0    | 1200 | 1.1720          | 0.6864 | 0.2468 |
| 0.5637        | 81.2581 | 1300 | 1.1978          | 0.6934 | 0.2483 |
| 0.5283        | 87.5161 | 1400 | 1.1845          | 0.6899 | 0.2384 |
| 0.535         | 93.7742 | 1500 | 1.1884          | 0.6864 | 0.2422 |
| 0.4905        | 100.0   | 1600 | 1.1860          | 0.6934 | 0.2422 |
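Note that validation WER bottoms out around step 1100–1200 while CER keeps improving slightly, so the "best" checkpoint depends on which metric you optimize. A small sketch of selecting a checkpoint from rows of the log above (`best_checkpoint` is a hypothetical helper, not part of the training code):

```python
# (step, val_loss, wer, cer) rows copied from the training log
rows = [
    (1100, 1.1964, 0.6864, 0.2567),
    (1200, 1.1720, 0.6864, 0.2468),
    (1400, 1.1845, 0.6899, 0.2384),
    (1600, 1.1860, 0.6934, 0.2422),
]

def best_checkpoint(rows, metric_index):
    """Return the step whose row has the lowest value of the chosen metric.

    metric_index: 2 selects WER, 3 selects CER (ties go to the earlier step).
    """
    return min(rows, key=lambda r: r[metric_index])[0]
```

Selecting by WER would favor step 1100 or 1200; selecting by CER would favor step 1400.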

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.22.1
Model size: 0.3B params (Safetensors, F32 tensors)
Model tree: aconeil/w2v2-lmk_org_para, fine-tuned from facebook/wav2vec2-large-xlsr-53