ikema-asr-indomain

This model is a fine-tuned version of facebook/wav2vec2-xls-r-300m. The training dataset is not specified in this card. It achieves the following results on the evaluation set:

Loss: 1.6104
CER (character error rate): 0.3521
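
For readers unfamiliar with the metric, the following is a minimal sketch of how a character error rate is commonly computed: the total character-level edit distance between hypotheses and references, divided by the total number of reference characters. The card does not say which implementation produced the number above, so the function names `edit_distance` and `cer` are illustrative assumptions, not the evaluation code used for this model.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting i reference characters to reach an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(references: list[str], hypotheses: list[str]) -> float:
    """Corpus-level CER: total edits divided by total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return total_edits / total_chars


print(cer(["kitten"], ["sitting"]))  # 3 edits over 6 reference characters
```

Under this definition, a CER of 0.3521 would correspond to roughly 35 character edits per 100 reference characters.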

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0003
train_batch_size: 16
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 100
mixed_precision_training: Native AMP
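
To make the relationship between these values concrete, here is a minimal sketch of the effective batch size and of a linear schedule with warmup. It rests on two assumptions that the card does not state: training ran on a single device, and the run had 7200 optimizer steps in total (the results table reaches epochs 25.0, 50.0 and 75.0 at steps 1800, 3600 and 5400, which suggests 72 steps per epoch). The function names are illustrative; the code mirrors the usual warmup-then-linear-decay rule rather than reproducing the trainer's internals.

```python
LEARNING_RATE = 3e-4   # learning_rate
WARMUP_STEPS = 100     # lr_scheduler_warmup_steps
TOTAL_STEPS = 7200     # assumption: 72 steps per epoch x 100 epochs, inferred from the table

PER_DEVICE_BATCH = 16  # train_batch_size
GRAD_ACCUM = 2         # gradient_accumulation_steps
NUM_DEVICES = 1        # assumption: single-device training


def effective_batch_size(per_device: int = PER_DEVICE_BATCH,
                         grad_accum: int = GRAD_ACCUM,
                         num_devices: int = NUM_DEVICES) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * grad_accum * num_devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size())  # matches total_train_batch_size: 32
for step in (0, 50, 100, 3650, 7200):
    print(step, linear_lr(step))
```

With these assumptions the rate peaks at 0.0003 at step 100 and decays linearly to zero by the final step.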

Training results

Training Loss	Epoch	Step	Validation Loss	Cer
11.1682	1.3916	100	3.8553	0.9903
3.9311	2.7832	200	3.8241	0.9903
3.8623	4.1678	300	3.7760	0.9903
3.7693	5.5594	400	3.6686	0.9903
3.671	6.9510	500	3.5900	0.9893
3.5618	8.3357	600	3.5169	0.9713
3.4994	9.7273	700	3.3552	0.9699
3.3323	11.1119	800	3.1385	0.9540
3.163	12.5035	900	2.9224	0.9186
2.7901	13.8951	1000	2.1802	0.7828
2.3425	15.2797	1100	1.8406	0.6529
2.0608	16.6713	1200	1.6505	0.6329
1.8813	18.0559	1300	1.4769	0.5715
1.6705	19.4476	1400	1.4793	0.5581
1.558	20.8392	1500	1.3079	0.4970
1.4213	22.2238	1600	1.3552	0.4947
1.3122	23.6154	1700	1.2368	0.4355
1.2303	25.0	1800	1.2108	0.4347
1.1152	26.3916	1900	1.2177	0.4307
1.0441	27.7832	2000	1.3236	0.4291
0.9626	29.1678	2100	1.2738	0.4157
0.8987	30.5594	2200	1.2683	0.4190
0.8367	31.9510	2300	1.2570	0.4144
0.7617	33.3357	2400	1.2331	0.3876
0.7069	34.7273	2500	1.3284	0.4037
0.6874	36.1119	2600	1.2948	0.3818
0.6615	37.5035	2700	1.2998	0.3977
0.6086	38.8951	2800	1.3369	0.3758
0.5804	40.2797	2900	1.2815	0.3838
0.548	41.6713	3000	1.3390	0.3766
0.5239	43.0559	3100	1.2572	0.3673
0.4983	44.4476	3200	1.2955	0.3671
0.4793	45.8392	3300	1.3563	0.3729
0.438	47.2238	3400	1.4153	0.3915
0.4274	48.6154	3500	1.3198	0.3663
0.4064	50.0	3600	1.4351	0.3814
0.3812	51.3916	3700	1.3514	0.3620
0.3753	52.7832	3800	1.3715	0.3492
0.3549	54.1678	3900	1.4133	0.3649
0.3262	55.5594	4000	1.4260	0.3574
0.3296	56.9510	4100	1.5134	0.3552
0.3136	58.3357	4200	1.4696	0.3587
0.3009	59.7273	4300	1.4326	0.3554
0.2764	61.1119	4400	1.4486	0.3572
0.2738	62.5035	4500	1.4463	0.3593
0.2574	63.8951	4600	1.4303	0.3583
0.2397	65.2797	4700	1.4538	0.3446
0.2474	66.6713	4800	1.4416	0.3496
0.2212	68.0559	4900	1.4766	0.3448
0.2173	69.4476	5000	1.4785	0.3496
0.2138	70.8392	5100	1.4859	0.3582
0.2037	72.2238	5200	1.5022	0.3500
0.194	73.6154	5300	1.4964	0.3490
0.1758	75.0	5400	1.5645	0.3552
0.1693	76.3916	5500	1.5215	0.3492
0.1682	77.7832	5600	1.5572	0.3436
0.1616	79.1678	5700	1.4971	0.3461
0.1625	80.5594	5800	1.5327	0.3516
0.1432	81.9510	5900	1.5595	0.3506
0.1348	83.3357	6000	1.5562	0.3483
0.137	84.7273	6100	1.5902	0.3485
0.1263	86.1119	6200	1.5853	0.3521
0.1271	87.5035	6300	1.5977	0.3488
0.123	88.8951	6400	1.6024	0.3498
0.117	90.2797	6500	1.6093	0.3535
0.1077	91.6713	6600	1.5807	0.3519
0.1072	93.0559	6700	1.5801	0.3477
0.1063	94.4476	6800	1.5894	0.3502
0.103	95.8392	6900	1.6027	0.3498
0.1032	97.2238	7000	1.6034	0.3485
0.0971	98.6154	7100	1.6104	0.3481
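
In the table above, validation loss reaches its minimum (1.2108) at step 1800 and rises afterwards, while CER keeps improving to its lowest value (0.3436) at step 5600. A small script can surface this kind of divergence when choosing a checkpoint. The sketch below parses a tab-separated excerpt of the table (six rows copied verbatim) and picks the best row per metric; `parse_log` and `best_by` are hypothetical helper names, and the excerpt stands in for the full log.

```python
# Excerpt of the results table, in the same tab-separated layout.
LOG_EXCERPT = """\
Training Loss\tEpoch\tStep\tValidation Loss\tCer
11.1682\t1.3916\t100\t3.8553\t0.9903
2.7901\t13.8951\t1000\t2.1802\t0.7828
1.2303\t25.0\t1800\t1.2108\t0.4347
0.3753\t52.7832\t3800\t1.3715\t0.3492
0.1682\t77.7832\t5600\t1.5572\t0.3436
0.0971\t98.6154\t7100\t1.6104\t0.3481
"""


def parse_log(text: str) -> list[dict[str, float]]:
    """Turn the tab-separated log into a list of {column name: value} rows."""
    lines = text.strip().splitlines()
    header = lines[0].split("\t")
    return [dict(zip(header, map(float, line.split("\t")))) for line in lines[1:]]


def best_by(rows: list[dict[str, float]], column: str) -> dict[str, float]:
    """Return the row with the smallest value in the given column."""
    return min(rows, key=lambda row: row[column])


rows = parse_log(LOG_EXCERPT)
print("lowest CER at step", int(best_by(rows, "Cer")["Step"]))
print("lowest validation loss at step", int(best_by(rows, "Validation Loss")["Step"]))
```

Selecting by CER rather than validation loss would point to a much later checkpoint here, which may matter if intermediate checkpoints were saved.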

Framework versions

Transformers 4.51.2
Pytorch 2.6.0+cu124
Datasets 3.5.0
Tokenizers 0.21.1

Safetensors

Model size: 0.3B params
Tensor type: F32

Model tree for ctaguchi/ikema-asr-indomain

Base model: facebook/wav2vec2-xls-r-300m
Finetuned: this model
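
Given the repository ID and the wav2vec2 base model listed above, loading the checkpoint for inference might look like the sketch below. It assumes the repository ships processor files (feature extractor and tokenizer), that the checkpoint has a CTC head as is typical for wav2vec2 fine-tunes, and that the input is mono audio at 16 kHz, the sampling rate XLS-R was pretrained on. None of this is confirmed by the card, and `transcribe` is an illustrative name.

```python
MODEL_ID = "ctaguchi/ikema-asr-indomain"  # repository ID from the model tree
SAMPLE_RATE = 16_000                      # assumption: 16 kHz mono input


def transcribe(waveform):
    """Transcribe a 1-D float waveform (list or NumPy array) sampled at SAMPLE_RATE Hz."""
    # Imported lazily so this file can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCTC, AutoProcessor

    # Assumption: the repository contains processor and tokenizer files.
    processor = AutoProcessor.from_pretrained(MODEL_ID)
    model = AutoModelForCTC.from_pretrained(MODEL_ID)
    model.eval()

    inputs = processor(waveform, sampling_rate=SAMPLE_RATE, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (batch, frames, vocab)

    # Greedy CTC decoding: take the most likely token per frame;
    # batch_decode collapses repeated tokens and drops blanks.
    predicted_ids = torch.argmax(logits, dim=-1)
    return processor.batch_decode(predicted_ids)[0]
```

Audio can be loaded with any library that returns a float array, for example `librosa.load(path, sr=16000)`, before being passed to `transcribe`.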