w2v2-queyu

This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53 on the audiofolder dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7555
  • Wer: 0.6719
  • Cer: 0.2539
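Wer and Cer are the word and character error rates: the Levenshtein edit distance between the model's transcription and the reference, normalized by reference length, at word and character granularity respectively. A minimal hand-rolled sketch of how these metrics are computed (the actual evaluation most likely used a library such as `jiwer` or `evaluate`; that detail is an assumption):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between ref[:i-1] and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (free if match)
            ))
        prev = curr
    return prev[-1]

def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance over word tokens / reference word count."""
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / len(ref)

def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance over characters / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)

print(wer("the cat sat", "the bat sat"))  # 1 substitution over 3 words ≈ 0.333
```

So the final Wer of 0.6719 means roughly two word-level edits are needed for every three reference words, while the much lower Cer of 0.2539 indicates that many of those word errors differ from the reference by only a few characters.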

Model description

More information needed

Intended uses & limitations

More information needed
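As a CTC-based wav2vec2 checkpoint, the model can be loaded for transcription with the standard `transformers` automatic-speech-recognition pipeline. A minimal sketch, assuming the repo id `aconeil/w2v2-queyu` and a mono audio file (the file name is a placeholder; XLSR-53 models expect 16 kHz input, which the pipeline resamples to automatically):

```python
# Sketch: greedy CTC transcription via the standard transformers ASR pipeline.
# The repo id and input file name below are assumptions, not part of this card.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="aconeil/w2v2-queyu",  # assumed repo id
)

result = asr("sample.wav")  # placeholder path to a mono audio file
print(result["text"])
```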

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2000
  • num_epochs: 100
  • mixed_precision_training: Native AMP
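The list above maps onto a `transformers` `TrainingArguments` configuration roughly like the following sketch (the field names are standard `TrainingArguments` parameters; the output directory is an assumption, and dataset/model setup is omitted):

```python
# Sketch reconstructing the listed hyperparameters as TrainingArguments.
# output_dir is an assumption; everything else mirrors the values above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="w2v2-queyu",        # assumed
    learning_rate=1e-3,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=2000,
    num_train_epochs=100,
    fp16=True,                      # "Native AMP" mixed-precision training
)
```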

Training results

Training Loss   Epoch     Step   Validation Loss   Wer      Cer
7.646           4.7619     100   4.4300            1.0      1.0
3.5531          9.5238     200   3.4521            1.0      1.0
3.4851          14.2857    300   3.4251            1.0      1.0
3.4211          19.0476    400   3.3634            1.0      1.0
3.1829          23.8095    500   2.8579            1.0      0.8618
1.8826          28.5714    600   1.5343            0.8987   0.3956
1.1129          33.3333    700   1.3339            0.7527   0.2825
0.7753          38.0952    800   1.2277            0.6840   0.2611
0.5711          42.8571    900   1.4287            0.6936   0.2614
0.4426          47.6190   1000   1.4906            0.7069   0.2652
0.3882          52.3810   1100   1.5981            0.6948   0.2549
0.3607          57.1429   1200   1.7040            0.7274   0.2646
0.3222          61.9048   1300   1.6772            0.6731   0.2678
0.2868          66.6667   1400   1.6665            0.7069   0.2729
0.2622          71.4286   1500   1.6840            0.7045   0.2801
0.2759          76.1905   1600   1.8806            0.7346   0.2720
0.261           80.9524   1700   1.5309            0.6719   0.2553
0.2526          85.7143   1800   1.8266            0.7189   0.2809
0.2903          90.4762   1900   1.8739            0.7744   0.2914
0.2465          95.2381   2000   1.9110            0.7394   0.2902
0.1517          100.0     2100   1.7555            0.6719   0.2539

Framework versions

  • Transformers 4.57.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.2.0
  • Tokenizers 0.22.1