ssc-aln-model

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics are typically computed follows the list):

  • Loss: 1.8647
  • CER (character error rate): 0.5679
  • WER (word error rate): 0.9768
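
CER and WER are edit-distance-based error rates over characters and words, respectively. Below is a minimal sketch of computing them with the Hugging Face evaluate library; this library choice is an assumption, since the card does not state which implementation produced the numbers above.

```python
import evaluate

# Hedged sketch: the card does not say how CER/WER were computed;
# the "evaluate" library metrics shown here are one common choice.
cer = evaluate.load("cer")
wer = evaluate.load("wer")

predictions = ["helo world"]   # toy model outputs
references = ["hello world"]   # toy ground truth

print(cer.compute(predictions=predictions, references=references))
print(wer.compute(predictions=predictions, references=references))
```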

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a minimal TrainingArguments sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8 (train_batch_size × gradient_accumulation_steps = 4 × 2)
  • optimizer: AdamW (adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 5
  • mixed_precision_training: Native AMP
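
The same configuration can be expressed with the Transformers Trainer API. This is a minimal sketch, not the authors' actual training script: output_dir is a placeholder, fp16=True stands in for "Native AMP", and the evaluation cadence of 100 steps is inferred from the results table below.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ssc-aln-model",        # placeholder output path (assumption)
    learning_rate=3e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,     # effective train batch size: 4 * 2 = 8
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=5,
    fp16=True,                         # assumption standing in for "Native AMP"
    eval_strategy="steps",             # inferred from the 100-step eval cadence below
    eval_steps=100,
)
```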

Training results

| Training Loss | Epoch  | Step | Validation Loss | CER    | WER    |
|---------------|--------|------|-----------------|--------|--------|
| 2.6542        | 0.1027 | 100  | 2.7873          | 0.9127 | 1.0    |
| 2.7312        | 0.2054 | 200  | 2.6739          | 0.8102 | 0.9999 |
| 2.7166        | 0.3082 | 300  | 2.3973          | 0.8100 | 0.9998 |
| 2.6442        | 0.4109 | 400  | 2.5701          | 0.7773 | 0.9863 |
| 2.6282        | 0.5136 | 500  | 2.4967          | 0.7596 | 1.0    |
| 2.6793        | 0.6163 | 600  | 2.4465          | 0.8274 | 0.9988 |
| 2.5982        | 0.7191 | 700  | 2.4531          | 0.7070 | 0.9893 |
| 2.5929        | 0.8218 | 800  | 2.3973          | 0.7647 | 0.9991 |
| 2.6211        | 0.9245 | 900  | 2.3430          | 0.7394 | 0.9911 |
| 2.5614        | 1.0267 | 1000 | 2.2116          | 0.6708 | 0.9887 |
| 2.5421        | 1.1294 | 1100 | 2.1762          | 0.7062 | 0.9970 |
| 2.5272        | 1.2322 | 1200 | 2.1483          | 0.6747 | 0.9907 |
| 2.4457        | 1.3349 | 1300 | 2.1416          | 0.6783 | 0.9754 |
| 2.4582        | 1.4376 | 1400 | 2.1515          | 0.6323 | 0.9812 |
| 2.5182        | 1.5403 | 1500 | 2.1518          | 0.6933 | 0.9828 |
| 2.545         | 1.6430 | 1600 | 2.1046          | 0.6844 | 0.9948 |
| 2.4768        | 1.7458 | 1700 | 2.0930          | 0.6794 | 0.9971 |
| 2.437         | 1.8485 | 1800 | 2.0755          | 0.6974 | 0.9977 |
| 2.4652        | 1.9512 | 1900 | 2.0531          | 0.6387 | 0.9852 |
| 2.4666        | 2.0534 | 2000 | 2.0942          | 0.6326 | 0.9725 |
| 2.4098        | 2.1561 | 2100 | 2.1318          | 0.7399 | 0.9999 |
| 2.295         | 2.2589 | 2200 | 2.0930          | 0.6261 | 0.9975 |
| 2.3255        | 2.3616 | 2300 | 2.0553          | 0.6080 | 0.9830 |
| 2.3362        | 2.4643 | 2400 | 2.0664          | 0.6241 | 0.9820 |
| 2.324         | 2.5670 | 2500 | 2.0415          | 0.6090 | 0.9839 |
| 2.3254        | 2.6697 | 2600 | 2.0766          | 0.5845 | 0.9765 |
| 2.3232        | 2.7725 | 2700 | 2.0245          | 0.6318 | 0.9836 |
| 2.2821        | 2.8752 | 2800 | 1.9850          | 0.6249 | 0.9870 |
| 2.2661        | 2.9779 | 2900 | 1.9709          | 0.6247 | 0.9770 |
| 2.2066        | 3.0801 | 3000 | 2.0029          | 0.5864 | 0.9691 |
| 2.1706        | 3.1828 | 3100 | 1.9698          | 0.5725 | 0.9681 |
| 2.1382        | 3.2856 | 3200 | 1.9499          | 0.5990 | 0.9759 |
| 2.2142        | 3.3883 | 3300 | 1.9464          | 0.6189 | 0.9825 |
| 2.2512        | 3.4910 | 3400 | 1.9367          | 0.6020 | 0.9843 |
| 2.1671        | 3.5937 | 3500 | 1.9393          | 0.5939 | 0.9799 |
| 2.2047        | 3.6965 | 3600 | 1.9381          | 0.5728 | 0.9700 |
| 2.1303        | 3.7992 | 3700 | 1.9116          | 0.5683 | 0.9722 |
| 2.1517        | 3.9019 | 3800 | 1.9412          | 0.5383 | 0.9495 |
| 2.2205        | 4.0041 | 3900 | 1.8760          | 0.5827 | 0.9780 |
| 2.07          | 4.1068 | 4000 | 1.9216          | 0.5793 | 0.9768 |
| 2.049         | 4.2096 | 4100 | 1.9057          | 0.5595 | 0.9694 |
| 2.057         | 4.3123 | 4200 | 1.9335          | 0.5549 | 0.9664 |
| 2.0582        | 4.4150 | 4300 | 1.9117          | 0.5552 | 0.9675 |
| 2.0678        | 4.5177 | 4400 | 1.8778          | 0.5699 | 0.9767 |
| 2.0643        | 4.6204 | 4500 | 1.8775          | 0.5704 | 0.9778 |
| 1.9829        | 4.7232 | 4600 | 1.8712          | 0.5704 | 0.9780 |
| 2.0293        | 4.8259 | 4700 | 1.8655          | 0.5577 | 0.9695 |
| 2.0133        | 4.9286 | 4800 | 1.8647          | 0.5679 | 0.9768 |

Framework versions

  • Transformers 4.57.2
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.22.0
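
To check that a local environment matches these versions, a small verification snippet can help. This is a sketch; the expected version strings are taken from the list above, and the CUDA build suffix on PyTorch will differ on non-CUDA-12.8 installs.

```python
import datasets
import tokenizers
import torch
import transformers

# Hedged sketch: compare installed versions against the ones this card reports.
expected = {
    "transformers": "4.57.2",
    "torch": "2.9.1+cu128",
    "datasets": "3.6.0",
    "tokenizers": "0.22.0",
}
found = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, version in expected.items():
    status = "OK" if found[name] == version else f"mismatch ({found[name]})"
    print(f"{name}: expected {version} -> {status}")
```
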
Model size

  • Parameters: 0.3B
  • Tensor type: F32
  • Format: Safetensors