wav2vec2-large-phoneme-en

This model is a fine-tuned version of facebook/wav2vec2-large-960h-lv60-self on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0732
  • PER (phoneme error rate): 0.9989
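Since the card does not yet include a usage snippet, here is a minimal inference sketch assuming the standard Wav2Vec2 CTC API from 🤗 Transformers. The repo id is taken from this card; audio loading is left to the caller, and 16 kHz mono input is assumed (the sampling rate of the wav2vec2-large-960h-lv60-self base model).

```python
# Minimal inference sketch (assumes this checkpoint exposes a CTC head with
# a phoneme vocabulary; audio must be a 1-D float waveform at 16 kHz).
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

def transcribe_phonemes(audio, sampling_rate=16000,
                        repo_id="Peacockery/wav2vec2-large-phoneme-en"):
    """Return the predicted phoneme string for a raw waveform."""
    processor = AutoProcessor.from_pretrained(repo_id)
    model = Wav2Vec2ForCTC.from_pretrained(repo_id)
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    ids = torch.argmax(logits, dim=-1)       # greedy CTC decoding
    return processor.batch_decode(ids)[0]
```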

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 32
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2636
  • num_epochs: 3
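The hyperparameters above map onto 🤗 Transformers `TrainingArguments` roughly as follows. This is a sketch, not the actual training script: `output_dir` and the evaluation cadence are assumptions, everything else is copied from the list above.

```python
from transformers import TrainingArguments

# Sketch reconstructing the listed hyperparameters; output_dir is assumed.
training_args = TrainingArguments(
    output_dir="wav2vec2-large-phoneme-en",   # assumption
    learning_rate=3e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=32,   # effective train batch size: 1 * 32 = 32
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_steps=2636,
    num_train_epochs=3,
)
```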

Training results

| Training Loss | Epoch  | Step  | Validation Loss | PER    |
|:-------------:|:------:|:-----:|:---------------:|:------:|
| 131.4593      | 0.0569 | 500   | 3.6145          | 1.0    |
| 79.6933       | 0.1138 | 1000  | 1.9180          | 1.0    |
| 9.1382        | 0.1707 | 1500  | 0.2344          | 0.9993 |
| 5.5855        | 0.2276 | 2000  | 0.1632          | 0.9993 |
| 4.5870        | 0.2845 | 2500  | 0.1306          | 0.9989 |
| 3.8941        | 0.3414 | 3000  | 0.1153          | 0.9989 |
| 3.5786        | 0.3983 | 3500  | 0.1059          | 0.9989 |
| 3.5314        | 0.4552 | 4000  | 0.0985          | 0.9989 |
| 3.3550        | 0.5121 | 4500  | 0.0958          | 0.9993 |
| 3.3606        | 0.5690 | 5000  | 0.0917          | 0.9989 |
| 3.2275        | 0.6259 | 5500  | 0.0868          | 0.9989 |
| 3.1021        | 0.6828 | 6000  | 0.0898          | 0.9989 |
| 3.1049        | 0.7397 | 6500  | 0.0884          | 0.9989 |
| 3.0251        | 0.7966 | 7000  | 0.0848          | 0.9989 |
| 3.1182        | 0.8535 | 7500  | 0.0828          | 0.9989 |
| 2.7849        | 0.9104 | 8000  | 0.0834          | 0.9985 |
| 2.9732        | 0.9673 | 8500  | 0.0801          | 0.9989 |
| 2.6995        | 1.0241 | 9000  | 0.0793          | 0.9985 |
| 2.7163        | 1.0810 | 9500  | 0.0785          | 0.9989 |
| 2.7352        | 1.1379 | 10000 | 0.0775          | 0.9989 |
| 2.7593        | 1.1948 | 10500 | 0.0785          | 0.9989 |
| 2.6469        | 1.2517 | 11000 | 0.0797          | 0.9989 |
| 2.6016        | 1.3086 | 11500 | 0.0779          | 0.9989 |
| 2.6684        | 1.3655 | 12000 | 0.0784          | 0.9989 |
| 2.4984        | 1.4224 | 12500 | 0.0767          | 0.9989 |
| 2.4949        | 1.4793 | 13000 | 0.0826          | 0.9985 |
| 2.6033        | 1.5362 | 13500 | 0.0763          | 0.9989 |
| 2.6078        | 1.5931 | 14000 | 0.0780          | 0.9989 |
| 2.6058        | 1.6500 | 14500 | 0.0759          | 0.9993 |
| 2.5250        | 1.7070 | 15000 | 0.0750          | 0.9985 |
| 2.4019        | 1.7639 | 15500 | 0.0749          | 0.9989 |
| 2.4181        | 1.8208 | 16000 | 0.0752          | 0.9989 |
| 2.4320        | 1.8777 | 16500 | 0.0777          | 0.9989 |
| 2.5398        | 1.9346 | 17000 | 0.0763          | 0.9989 |
| 2.6341        | 1.9915 | 17500 | 0.0731          | 0.9989 |
| 2.3394        | 2.0483 | 18000 | 0.0750          | 0.9989 |
| 2.2166        | 2.1052 | 18500 | 0.0780          | 0.9989 |
| 2.2836        | 2.1621 | 19000 | 0.0749          | 0.9989 |
| 2.3184        | 2.2190 | 19500 | 0.0745          | 0.9989 |
| 2.2957        | 2.2759 | 20000 | 0.0752          | 0.9989 |
| 2.3212        | 2.3328 | 20500 | 0.0776          | 0.9989 |
| 2.2440        | 2.3897 | 21000 | 0.0727          | 0.9985 |
| 2.2231        | 2.4466 | 21500 | 0.0736          | 0.9989 |
| 2.3125        | 2.5035 | 22000 | 0.0726          | 0.9989 |
| 2.3410        | 2.5604 | 22500 | 0.0715          | 0.9989 |
| 2.3153        | 2.6173 | 23000 | 0.0753          | 0.9993 |
| 2.2400        | 2.6742 | 23500 | 0.0751          | 0.9989 |
| 2.2412        | 2.7311 | 24000 | 0.0740          | 0.9989 |
| 2.2796        | 2.7880 | 24500 | 0.0731          | 0.9989 |
| 2.2427        | 2.8449 | 25000 | 0.0729          | 0.9989 |
| 2.2011        | 2.9018 | 25500 | 0.0735          | 0.9989 |
| 2.2331        | 2.9587 | 26000 | 0.0732          | 0.9989 |
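The PER column reports the phoneme error rate, conventionally the Levenshtein edit distance between the predicted and reference phoneme sequences divided by the reference length. A self-contained sketch of that metric (the phoneme strings in the example are made up for illustration and are not from this model's evaluation set):

```python
def phoneme_error_rate(reference, hypothesis):
    """PER: edit distance between space-separated phoneme sequences,
    normalized by the number of reference phonemes."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1] / len(ref)

# Made-up ARPAbet-style example: 1 substitution over 4 reference phonemes.
print(phoneme_error_rate("HH AH L OW", "HH AH L UW"))  # → 0.25
```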

Framework versions

  • Transformers 5.3.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.6.1
  • Tokenizers 0.22.2