wav2vec2-large-phoneme-en

This model is a fine-tuned version of facebook/wav2vec2-large-960h-lv60-self on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0732
  • PER (phoneme error rate): 0.9989
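Since the card does not yet include a usage snippet, here is a minimal inference sketch assuming the standard Wav2Vec2 CTC API from 🤗 Transformers. The repo id is taken from this card; audio loading is left to the caller, and 16 kHz mono input is assumed (the sampling rate of the wav2vec2-large-960h-lv60-self base model).

```python
# Minimal inference sketch (assumes this checkpoint exposes a CTC head with
# a phoneme vocabulary; audio must be a 1-D float waveform at 16 kHz).
import torch
from transformers import AutoProcessor, Wav2Vec2ForCTC

def transcribe_phonemes(audio, sampling_rate=16000,
                        repo_id="Peacockery/wav2vec2-large-phoneme-en"):
    """Return the predicted phoneme string for a raw waveform."""
    processor = AutoProcessor.from_pretrained(repo_id)
    model = Wav2Vec2ForCTC.from_pretrained(repo_id)
    inputs = processor(audio, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(inputs.input_values).logits
    ids = torch.argmax(logits, dim=-1)       # greedy CTC decoding
    return processor.batch_decode(ids)[0]
```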

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 1
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 32
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 2636
  • num_epochs: 3
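The hyperparameters above map onto 🤗 Transformers `TrainingArguments` roughly as follows. This is a sketch, not the actual training script: `output_dir` and the evaluation cadence are assumptions, everything else is copied from the list above.

```python
from transformers import TrainingArguments

# Sketch reconstructing the listed hyperparameters; output_dir is assumed.
training_args = TrainingArguments(
    output_dir="wav2vec2-large-phoneme-en",   # assumption
    learning_rate=3e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=32,   # effective train batch size: 1 * 32 = 32
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_steps=2636,
    num_train_epochs=3,
)
```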

Training results

| Training Loss | Epoch  | Step  | Validation Loss | PER    |
|:-------------:|:------:|:-----:|:---------------:|:------:|
| 131.4593      | 0.0569 | 500   | 3.6145          | 1.0    |
| 79.6933       | 0.1138 | 1000  | 1.9180          | 1.0    |
| 9.1382        | 0.1707 | 1500  | 0.2344          | 0.9993 |
| 5.5855        | 0.2276 | 2000  | 0.1632          | 0.9993 |
| 4.5870        | 0.2845 | 2500  | 0.1306          | 0.9989 |
| 3.8941        | 0.3414 | 3000  | 0.1153          | 0.9989 |
| 3.5786        | 0.3983 | 3500  | 0.1059          | 0.9989 |
| 3.5314        | 0.4552 | 4000  | 0.0985          | 0.9989 |
| 3.3550        | 0.5121 | 4500  | 0.0958          | 0.9993 |
| 3.3606        | 0.5690 | 5000  | 0.0917          | 0.9989 |
| 3.2275        | 0.6259 | 5500  | 0.0868          | 0.9989 |
| 3.1021        | 0.6828 | 6000  | 0.0898          | 0.9989 |
| 3.1049        | 0.7397 | 6500  | 0.0884          | 0.9989 |
| 3.0251        | 0.7966 | 7000  | 0.0848          | 0.9989 |
| 3.1182        | 0.8535 | 7500  | 0.0828          | 0.9989 |
| 2.7849        | 0.9104 | 8000  | 0.0834          | 0.9985 |
| 2.9732        | 0.9673 | 8500  | 0.0801          | 0.9989 |
| 2.6995        | 1.0241 | 9000  | 0.0793          | 0.9985 |
| 2.7163        | 1.0810 | 9500  | 0.0785          | 0.9989 |
| 2.7352        | 1.1379 | 10000 | 0.0775          | 0.9989 |
| 2.7593        | 1.1948 | 10500 | 0.0785          | 0.9989 |
| 2.6469        | 1.2517 | 11000 | 0.0797          | 0.9989 |
| 2.6016        | 1.3086 | 11500 | 0.0779          | 0.9989 |
| 2.6684        | 1.3655 | 12000 | 0.0784          | 0.9989 |
| 2.4984        | 1.4224 | 12500 | 0.0767          | 0.9989 |
| 2.4949        | 1.4793 | 13000 | 0.0826          | 0.9985 |
| 2.6033        | 1.5362 | 13500 | 0.0763          | 0.9989 |
| 2.6078        | 1.5931 | 14000 | 0.0780          | 0.9989 |
| 2.6058        | 1.6500 | 14500 | 0.0759          | 0.9993 |
| 2.5250        | 1.7070 | 15000 | 0.0750          | 0.9985 |
| 2.4019        | 1.7639 | 15500 | 0.0749          | 0.9989 |
| 2.4181        | 1.8208 | 16000 | 0.0752          | 0.9989 |
| 2.4320        | 1.8777 | 16500 | 0.0777          | 0.9989 |
| 2.5398        | 1.9346 | 17000 | 0.0763          | 0.9989 |
| 2.6341        | 1.9915 | 17500 | 0.0731          | 0.9989 |
| 2.3394        | 2.0483 | 18000 | 0.0750          | 0.9989 |
| 2.2166        | 2.1052 | 18500 | 0.0780          | 0.9989 |
| 2.2836        | 2.1621 | 19000 | 0.0749          | 0.9989 |
| 2.3184        | 2.2190 | 19500 | 0.0745          | 0.9989 |
| 2.2957        | 2.2759 | 20000 | 0.0752          | 0.9989 |
| 2.3212        | 2.3328 | 20500 | 0.0776          | 0.9989 |
| 2.2440        | 2.3897 | 21000 | 0.0727          | 0.9985 |
| 2.2231        | 2.4466 | 21500 | 0.0736          | 0.9989 |
| 2.3125        | 2.5035 | 22000 | 0.0726          | 0.9989 |
| 2.3410        | 2.5604 | 22500 | 0.0715          | 0.9989 |
| 2.3153        | 2.6173 | 23000 | 0.0753          | 0.9993 |
| 2.2400        | 2.6742 | 23500 | 0.0751          | 0.9989 |
| 2.2412        | 2.7311 | 24000 | 0.0740          | 0.9989 |
| 2.2796        | 2.7880 | 24500 | 0.0731          | 0.9989 |
| 2.2427        | 2.8449 | 25000 | 0.0729          | 0.9989 |
| 2.2011        | 2.9018 | 25500 | 0.0735          | 0.9989 |
| 2.2331        | 2.9587 | 26000 | 0.0732          | 0.9989 |
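The PER column reports the phoneme error rate, conventionally the Levenshtein edit distance between the predicted and reference phoneme sequences divided by the reference length. A self-contained sketch of that metric (the phoneme strings in the example are made up for illustration and are not from this model's evaluation set):

```python
def phoneme_error_rate(reference, hypothesis):
    """PER: edit distance between space-separated phoneme sequences,
    normalized by the number of reference phonemes."""
    ref, hyp = reference.split(), hypothesis.split()
    # Standard dynamic-programming Levenshtein distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution
        prev = curr
    return prev[-1] / len(ref)

# Made-up ARPAbet-style example: 1 substitution over 4 reference phonemes.
print(phoneme_error_rate("HH AH L OW", "HH AH L UW"))  # → 0.25
```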

Framework versions

  • Transformers 5.3.0
  • Pytorch 2.8.0+cu128
  • Datasets 4.6.1
  • Tokenizers 0.22.2