iteboshi

This model was trained from scratch on an unknown dataset. At the final checkpoint (step 20000), it achieves the following results on the evaluation set:

  • Loss: 0.8947
  • WER: 82.3479
  • CER: 22.6268
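
The card does not say how the word and character error rates were computed. A minimal sketch, assuming the conventional definition (edit distance between hypothesis and reference, divided by reference length, reported as a percentage), is given below. The function names `edit_distance`, `wer`, and `cer` are illustrative and are not taken from the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitution, insertion, deletion all cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumed convention: whitespace tokenization)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (assumed convention: every character counts, including spaces)."""
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)
```

Under these assumptions a single wrong character in a three-word reference costs a full word in the word-level score but only one character in the character-level score, which is one way a word error rate can sit far above the character error rate, as it does here.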

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 20000
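
The values above can be cross-checked with a little arithmetic. The sketch below assumes the usual relation (total batch size = per-device batch size × gradient accumulation steps × number of processes) and a linear warmup followed by linear decay to zero. The function names are illustrative and do not come from the training script.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
WARMUP_STEPS = 500
TRAINING_STEPS = 20000
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
TOTAL_TRAIN_BATCH_SIZE = 32


def implied_num_devices():
    """Number of processes implied by the batch-size figures (assumed relation, see lead-in)."""
    return TOTAL_TRAIN_BATCH_SIZE // (TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS)


def linear_lr(step):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, TRAINING_STEPS - step)
    return LEARNING_RATE * remaining / (TRAINING_STEPS - WARMUP_STEPS)
```

With these figures `implied_num_devices()` evaluates to 1, so the batch sizes are consistent with a single process even though the distributed type is listed as multi-GPU. The card itself does not state the device count.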

Training results

Training Loss   Epoch     Step    Validation Loss   WER       CER
1.0854          1.1013    1000    1.2534            97.5672   52.7088
0.5859          2.2026    2000    0.8996            90.9477   48.1097
0.3373          3.3040    3000    0.7766            87.7699   29.9950
0.2445          4.4053    4000    0.7662            86.6761   28.1264
0.1548          5.5066    5000    0.7709            86.6007   27.8748
0.1102          6.6079    6000    0.7889            86.3178   26.2934
0.0682          7.7093    7000    0.7991            84.4507   27.3578
0.0647          8.8106    8000    0.8132            84.6488   25.6262
0.0343          9.9119    9000    0.8282            84.8279   24.6948
0.0181          11.0132   10000   0.8396            83.8001   24.3618
0.0117          12.1145   11000   0.8592            84.1584   24.0030
0.0111          13.2159   12000   0.8610            83.8378   24.3537
0.0088          14.3172   13000   0.8743            84.0924   24.6323
0.0112          15.4185   14000   0.8769            84.1867   24.9344
0.0109          16.5198   15000   0.8774            84.6770   24.6214
0.0032          17.6211   16000   0.8810            82.6591   23.3174
0.0017          18.7225   17000   0.8870            82.9986   22.8532
0.0019          19.8238   18000   0.8900            82.5083   22.6634
0.0008          20.9251   19000   0.8924            82.4800   22.5878
0.0006          22.0264   20000   0.8947            82.3479   22.6268
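
The three evaluation columns do not bottom out at the same checkpoint, which matters when choosing one to keep. The sketch below copies the table rows and looks up the step with the lowest value in a given column. It also estimates the number of optimizer steps per epoch from the last row. The helper names are illustrative.

```python
# (training_loss, epoch, step, validation_loss, wer, cer), copied from the table above.
ROWS = [
    (1.0854, 1.1013, 1000, 1.2534, 97.5672, 52.7088),
    (0.5859, 2.2026, 2000, 0.8996, 90.9477, 48.1097),
    (0.3373, 3.3040, 3000, 0.7766, 87.7699, 29.9950),
    (0.2445, 4.4053, 4000, 0.7662, 86.6761, 28.1264),
    (0.1548, 5.5066, 5000, 0.7709, 86.6007, 27.8748),
    (0.1102, 6.6079, 6000, 0.7889, 86.3178, 26.2934),
    (0.0682, 7.7093, 7000, 0.7991, 84.4507, 27.3578),
    (0.0647, 8.8106, 8000, 0.8132, 84.6488, 25.6262),
    (0.0343, 9.9119, 9000, 0.8282, 84.8279, 24.6948),
    (0.0181, 11.0132, 10000, 0.8396, 83.8001, 24.3618),
    (0.0117, 12.1145, 11000, 0.8592, 84.1584, 24.0030),
    (0.0111, 13.2159, 12000, 0.8610, 83.8378, 24.3537),
    (0.0088, 14.3172, 13000, 0.8743, 84.0924, 24.6323),
    (0.0112, 15.4185, 14000, 0.8769, 84.1867, 24.9344),
    (0.0109, 16.5198, 15000, 0.8774, 84.6770, 24.6214),
    (0.0032, 17.6211, 16000, 0.8810, 82.6591, 23.3174),
    (0.0017, 18.7225, 17000, 0.8870, 82.9986, 22.8532),
    (0.0019, 19.8238, 18000, 0.8900, 82.5083, 22.6634),
    (0.0008, 20.9251, 19000, 0.8924, 82.4800, 22.5878),
    (0.0006, 22.0264, 20000, 0.8947, 82.3479, 22.6268),
]

STEP, VAL_LOSS, WER, CER = 2, 3, 4, 5  # column indices into each row


def best_step(column):
    """Step of the checkpoint with the lowest value in the given column."""
    return min(ROWS, key=lambda row: row[column])[STEP]


def steps_per_epoch():
    """Optimizer steps per epoch, estimated from the final row (step / epoch)."""
    _, epoch, step, *_ = ROWS[-1]
    return round(step / epoch)
```

Validation loss is lowest at step 4000 and rises afterwards while training loss keeps falling toward zero, yet the word error rate is lowest at step 20000 and the character error rate at step 19000. If each optimizer step consumes the listed total batch of 32 examples, the estimated 908 steps per epoch would correspond to roughly 29,000 training examples. That figure is an inference from the table, not something the card reports.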

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1

Model size: 0.5B params (Safetensors), tensor type: BF16