iteboshi-tiny

This model was trained from scratch on an unknown dataset. At the final training step (20000), it achieves the following results on the evaluation set:

  • Loss: 0.9326
  • WER (word error rate): 115.2570
  • CER (character error rate): 50.1238
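
Both error rates are edit-distance ratios, so a hypothesis that inserts many extra words can score above 100, as the WER reported here does. The following is a minimal sketch of the standard definitions. It assumes the card's values are percentages and that words are split on whitespace; the card states neither, and the function names are illustrative rather than taken from the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumes whitespace tokenization)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (spaces count as characters)."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Three inserted words against a two-word reference: errors outnumber reference words.
reference = "good morning"
hypothesis = "good good morning morning everyone"
print(wer(reference, hypothesis))  # 150.0
```

Because the denominator is the reference length only, repeated or hallucinated output is penalized without an upper bound.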

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 20000
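
The schedule and batch-size entries above can be sketched in plain Python. This is an illustrative re-implementation of a linear warmup followed by linear decay, not the training code itself. It assumes the decay runs to zero at the last step. The device count is not listed in the card; a value of 1 is inferred here only because 4 × 8 already equals the listed total of 32.

```python
BASE_LR = 2e-05
WARMUP_STEPS = 500
TRAINING_STEPS = 20000


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TRAINING_STEPS):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


def total_batch_size(per_step_batch, accumulation_steps, num_devices):
    """Examples consumed per optimizer update."""
    return per_step_batch * accumulation_steps * num_devices


for step in (0, 250, 500, 10250, 20000):
    print(step, linear_lr(step))

# num_devices=1 is an assumption inferred from the listed total of 32.
print(total_batch_size(4, 8, num_devices=1))  # 32
```

Under these assumptions the learning rate peaks at 2e-05 at step 500 and is halved again by step 10250.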

Training results

Training Loss  Epoch    Step   Validation Loss  WER       CER
0.5915         1.1013   1000   0.7334           158.2838  65.0516
0.4754         2.2026   2000   0.6800           176.7751  66.8021
0.3484         3.3040   3000   0.6674           250.1933  86.5160
0.3012         4.4053   4000   0.6733           390.6648  143.7552
0.2416         5.5066   5000   0.6857           259.8491  89.0706
0.194          6.6079   6000   0.7101           197.0769  75.6325
0.1436         7.7093   7000   0.7327           235.4833  103.3691
0.135          8.8106   8000   0.7635           223.1306  96.6303
0.0854         9.9119   9000   0.7848           235.6624  96.6693
0.062          11.0132  10000  0.8102           199.8114  83.9929
0.0299         12.1145  11000  0.8364           177.0486  102.8057
0.0254         13.2159  12000  0.8552           176.0868  85.5468
0.0196         14.3172  13000  0.8671           126.2801  60.4427
0.0136         15.4185  14000  0.8813           177.9727  73.2561
0.0102         16.5198  15000  0.8930           142.6968  57.3544
0.0079         17.6211  16000  0.9064           132.6167  59.8736
0.0074         18.7225  17000  0.9160           125.6011  55.9026
0.0053         19.8238  18000  0.9245           116.0113  50.1628
0.0052         20.9251  19000  0.9299           115.0872  47.7766
0.0043         22.0264  20000  0.9326           115.2570  50.1238
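
The logged evaluations can be scanned for the best checkpoint under each metric. The sketch below copies the rows from the results table and picks the minimum of each column. The variable names are illustrative, and whether checkpoints were actually saved at these steps is not stated in the card.

```python
# (training_loss, epoch, step, validation_loss, wer, cer), copied from the results table
ROWS = [
    (0.5915, 1.1013, 1000, 0.7334, 158.2838, 65.0516),
    (0.4754, 2.2026, 2000, 0.6800, 176.7751, 66.8021),
    (0.3484, 3.3040, 3000, 0.6674, 250.1933, 86.5160),
    (0.3012, 4.4053, 4000, 0.6733, 390.6648, 143.7552),
    (0.2416, 5.5066, 5000, 0.6857, 259.8491, 89.0706),
    (0.194, 6.6079, 6000, 0.7101, 197.0769, 75.6325),
    (0.1436, 7.7093, 7000, 0.7327, 235.4833, 103.3691),
    (0.135, 8.8106, 8000, 0.7635, 223.1306, 96.6303),
    (0.0854, 9.9119, 9000, 0.7848, 235.6624, 96.6693),
    (0.062, 11.0132, 10000, 0.8102, 199.8114, 83.9929),
    (0.0299, 12.1145, 11000, 0.8364, 177.0486, 102.8057),
    (0.0254, 13.2159, 12000, 0.8552, 176.0868, 85.5468),
    (0.0196, 14.3172, 13000, 0.8671, 126.2801, 60.4427),
    (0.0136, 15.4185, 14000, 0.8813, 177.9727, 73.2561),
    (0.0102, 16.5198, 15000, 0.8930, 142.6968, 57.3544),
    (0.0079, 17.6211, 16000, 0.9064, 132.6167, 59.8736),
    (0.0074, 18.7225, 17000, 0.9160, 125.6011, 55.9026),
    (0.0053, 19.8238, 18000, 0.9245, 116.0113, 50.1628),
    (0.0052, 20.9251, 19000, 0.9299, 115.0872, 47.7766),
    (0.0043, 22.0264, 20000, 0.9326, 115.2570, 50.1238),
]

best_val_loss = min(ROWS, key=lambda row: row[3])
best_wer = min(ROWS, key=lambda row: row[4])
best_cer = min(ROWS, key=lambda row: row[5])

print("lowest validation loss:", best_val_loss[3], "at step", best_val_loss[2])
print("lowest WER:", best_wer[4], "at step", best_wer[2])
print("lowest CER:", best_cer[5], "at step", best_cer[2])
```

Validation loss is lowest at step 3000 and rises afterwards while training loss keeps falling, whereas WER and CER are lowest at step 19000, so the choice of checkpoint depends on which metric matters for the intended use.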

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.7.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
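
To reproduce the environment, the installed packages can be compared against the versions listed above. This is a minimal sketch using the standard library; the mapping from the names in this card to PyPI distribution names (for example "Pytorch" to "torch") is an assumption, and the helper names are hypothetical.

```python
from importlib import metadata

# Versions listed in this card, keyed by assumed PyPI distribution name.
EXPECTED = {
    "transformers": "4.48.3",
    "torch": "2.7.0+cu128",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(expected, lookup=installed_version):
    """Return {package: (expected, found)} for every package whose version differs."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


for package, (wanted, found) in version_mismatches(EXPECTED).items():
    print(f"{package}: card lists {wanted}, found {found or 'not installed'}")
```

The comparison is an exact string match, so a different CUDA build suffix on the PyTorch wheel is reported as a mismatch.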

Model size

  • Parameters: 57.7M
  • Tensor type: F32
  • Format: Safetensors