ssc-aln-model

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics are typically computed follows the list):

  • Loss: 1.8647
  • CER (character error rate): 0.5679
  • WER (word error rate): 0.9768
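
CER and WER are edit-distance-based error rates over characters and words, respectively. Below is a minimal sketch of computing them with the Hugging Face evaluate library; this library choice is an assumption, since the card does not state which implementation produced the numbers above.

```python
import evaluate

# Hedged sketch: the card does not say how CER/WER were computed;
# the "evaluate" library metrics shown here are one common choice.
cer = evaluate.load("cer")
wer = evaluate.load("wer")

predictions = ["helo world"]   # toy model outputs
references = ["hello world"]   # toy ground truth

print(cer.compute(predictions=predictions, references=references))
print(wer.compute(predictions=predictions, references=references))
```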

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a minimal TrainingArguments sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8 (train_batch_size × gradient_accumulation_steps = 4 × 2)
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 5
  • mixed_precision_training: Native AMP
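
The same configuration can be expressed with the Transformers Trainer API. This is a minimal sketch, not the authors' actual training script: output_dir is a placeholder, fp16=True stands in for "Native AMP", and the evaluation cadence of 100 steps is inferred from the results table below.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ssc-aln-model",        # placeholder output path (assumption)
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,     # effective train batch size: 4 * 2 = 8
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=5,
    fp16=True,                         # assumption standing in for "Native AMP"
    eval_strategy="steps",             # inferred from the 100-step eval cadence below
    eval_steps=100,
)
```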

Training results

| Training Loss | Epoch  | Step | Validation Loss | CER    | WER    |
|---------------|--------|------|-----------------|--------|--------|
| 2.6542        | 0.1027 | 100  | 2.7873          | 0.9127 | 1.0    |
| 2.7312        | 0.2054 | 200  | 2.6739          | 0.8102 | 0.9999 |
| 2.7166        | 0.3082 | 300  | 2.3973          | 0.8100 | 0.9998 |
| 2.6442        | 0.4109 | 400  | 2.5701          | 0.7773 | 0.9863 |
| 2.6282        | 0.5136 | 500  | 2.4967          | 0.7596 | 1.0    |
| 2.6793        | 0.6163 | 600  | 2.4465          | 0.8274 | 0.9988 |
| 2.5982        | 0.7191 | 700  | 2.4531          | 0.7070 | 0.9893 |
| 2.5929        | 0.8218 | 800  | 2.3973          | 0.7647 | 0.9991 |
| 2.6211        | 0.9245 | 900  | 2.3430          | 0.7394 | 0.9911 |
| 2.5614        | 1.0267 | 1000 | 2.2116          | 0.6708 | 0.9887 |
| 2.5421        | 1.1294 | 1100 | 2.1762          | 0.7062 | 0.9970 |
| 2.5272        | 1.2322 | 1200 | 2.1483          | 0.6747 | 0.9907 |
| 2.4457        | 1.3349 | 1300 | 2.1416          | 0.6783 | 0.9754 |
| 2.4582        | 1.4376 | 1400 | 2.1515          | 0.6323 | 0.9812 |
| 2.5182        | 1.5403 | 1500 | 2.1518          | 0.6933 | 0.9828 |
| 2.545         | 1.6430 | 1600 | 2.1046          | 0.6844 | 0.9948 |
| 2.4768        | 1.7458 | 1700 | 2.0930          | 0.6794 | 0.9971 |
| 2.437         | 1.8485 | 1800 | 2.0755          | 0.6974 | 0.9977 |
| 2.4652        | 1.9512 | 1900 | 2.0531          | 0.6387 | 0.9852 |
| 2.4666        | 2.0534 | 2000 | 2.0942          | 0.6326 | 0.9725 |
| 2.4098        | 2.1561 | 2100 | 2.1318          | 0.7399 | 0.9999 |
| 2.295         | 2.2589 | 2200 | 2.0930          | 0.6261 | 0.9975 |
| 2.3255        | 2.3616 | 2300 | 2.0553          | 0.6080 | 0.9830 |
| 2.3362        | 2.4643 | 2400 | 2.0664          | 0.6241 | 0.9820 |
| 2.324         | 2.5670 | 2500 | 2.0415          | 0.6090 | 0.9839 |
| 2.3254        | 2.6697 | 2600 | 2.0766          | 0.5845 | 0.9765 |
| 2.3232        | 2.7725 | 2700 | 2.0245          | 0.6318 | 0.9836 |
| 2.2821        | 2.8752 | 2800 | 1.9850          | 0.6249 | 0.9870 |
| 2.2661        | 2.9779 | 2900 | 1.9709          | 0.6247 | 0.9770 |
| 2.2066        | 3.0801 | 3000 | 2.0029          | 0.5864 | 0.9691 |
| 2.1706        | 3.1828 | 3100 | 1.9698          | 0.5725 | 0.9681 |
| 2.1382        | 3.2856 | 3200 | 1.9499          | 0.5990 | 0.9759 |
| 2.2142        | 3.3883 | 3300 | 1.9464          | 0.6189 | 0.9825 |
| 2.2512        | 3.4910 | 3400 | 1.9367          | 0.6020 | 0.9843 |
| 2.1671        | 3.5937 | 3500 | 1.9393          | 0.5939 | 0.9799 |
| 2.2047        | 3.6965 | 3600 | 1.9381          | 0.5728 | 0.9700 |
| 2.1303        | 3.7992 | 3700 | 1.9116          | 0.5683 | 0.9722 |
| 2.1517        | 3.9019 | 3800 | 1.9412          | 0.5383 | 0.9495 |
| 2.2205        | 4.0041 | 3900 | 1.8760          | 0.5827 | 0.9780 |
| 2.07          | 4.1068 | 4000 | 1.9216          | 0.5793 | 0.9768 |
| 2.049         | 4.2096 | 4100 | 1.9057          | 0.5595 | 0.9694 |
| 2.057         | 4.3123 | 4200 | 1.9335          | 0.5549 | 0.9664 |
| 2.0582        | 4.4150 | 4300 | 1.9117          | 0.5552 | 0.9675 |
| 2.0678        | 4.5177 | 4400 | 1.8778          | 0.5699 | 0.9767 |
| 2.0643        | 4.6204 | 4500 | 1.8775          | 0.5704 | 0.9778 |
| 1.9829        | 4.7232 | 4600 | 1.8712          | 0.5704 | 0.9780 |
| 2.0293        | 4.8259 | 4700 | 1.8655          | 0.5577 | 0.9695 |
| 2.0133        | 4.9286 | 4800 | 1.8647          | 0.5679 | 0.9768 |

Framework versions

  • Transformers 4.57.2
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0
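
To check that a local environment matches these versions, a small verification snippet can help. This is a sketch; the expected version strings are taken from the list above, and the CUDA build suffix on PyTorch will differ on non-CUDA-12.8 installs.

```python
import datasets
import tokenizers
import torch
import transformers

# Hedged sketch: compare installed versions against the ones this card reports.
expected = {
    "transformers": "4.57.2",
    "torch": "2.9.1+cu128",
    "datasets": "3.6.0",
    "tokenizers": "0.22.0",
}
found = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, version in expected.items():
    status = "OK" if found[name] == version else f"mismatch ({found[name]})"
    print(f"{name}: expected {version} -> {status}")
```
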
Model size

  • Parameters: 0.3B
  • Tensor type: F32
  • Format: Safetensors