ssc-cgg-mms-model-mix-adapt-max2

This model was trained from scratch on an unknown dataset. At the final evaluation (step 8800, epoch 9.9440), it achieves the following results on the evaluation set:

  • Loss: 0.7049
  • Cer: 0.1372
  • Wer: 0.6029
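
For readers unfamiliar with these metrics, the following is a minimal sketch of how character error rate (CER) and word error rate (WER) are commonly defined: edit distance divided by reference length, over characters or words respectively. It is an illustration only. The card does not state which metric implementation was used for evaluation, and the function names here are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate for one non-empty reference (hypothetical helper)."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate for one non-empty reference (hypothetical helper)."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    # One wrong character makes one of three words wrong.
    print(wer("the cat sat", "the cat sit"))
    print(cer("the cat sat", "the cat sit"))
```

A single character error counts as a full word error, which is one reason a model can show a moderate CER (0.1372 here) alongside a much higher WER (0.6029). Corpus-level scores are usually computed by summing edits and reference lengths over all utterances rather than averaging per-utterance rates.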

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 1
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 10
  • mixed_precision_training: Native AMP
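
As a rough illustration of how these values relate, the sketch below derives the effective batch size and evaluates a linear schedule with warmup in plain Python. It is not the Trainer's own code. `TOTAL_STEPS` is an assumption estimated from the results table (about 885 optimizer steps per epoch over 10 epochs), and a single device is assumed because 1 × 2 already equals the reported total_train_batch_size of 2.

```python
LEARNING_RATE = 0.001
WARMUP_STEPS = 100
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 2
# ASSUMPTION: not stated in the card; estimated from step/epoch pairs in the table.
TOTAL_STEPS = 8850


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Samples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 50, 100, 4475, TOTAL_STEPS):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 0.001 at step 100 and decays to zero by the end of training, so the later rows of the results table were trained with progressively smaller updates.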

Training results

| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|---|---|---|---|---|---|
| 0.4633 | 0.2261 | 200 | 0.7482 | 0.1501 | 0.6867 |
| 0.488 | 0.4522 | 400 | 0.7169 | 0.1465 | 0.6558 |
| 0.4615 | 0.6783 | 600 | 0.7037 | 0.1422 | 0.6081 |
| 0.408 | 0.9045 | 800 | 0.7335 | 0.1471 | 0.6367 |
| 0.4259 | 1.1300 | 1000 | 0.7168 | 0.1477 | 0.6399 |
| 0.4754 | 1.3561 | 1200 | 0.7328 | 0.1418 | 0.6101 |
| 0.4259 | 1.5822 | 1400 | 0.7273 | 0.1429 | 0.6249 |
| 0.4098 | 1.8084 | 1600 | 0.7386 | 0.1454 | 0.6378 |
| 0.442 | 2.0339 | 1800 | 0.7131 | 0.1462 | 0.6346 |
| 0.4213 | 2.2600 | 2000 | 0.7259 | 0.1481 | 0.6367 |
| 0.4142 | 2.4862 | 2200 | 0.7202 | 0.1423 | 0.6228 |
| 0.4241 | 2.7123 | 2400 | 0.7311 | 0.1429 | 0.6282 |
| 0.3955 | 2.9384 | 2600 | 0.7247 | 0.1437 | 0.6244 |
| 0.4146 | 3.1639 | 2800 | 0.7132 | 0.1466 | 0.6429 |
| 0.4079 | 3.3901 | 3000 | 0.7245 | 0.1408 | 0.6122 |
| 0.41 | 3.6162 | 3200 | 0.7120 | 0.1434 | 0.6163 |
| 0.3987 | 3.8423 | 3400 | 0.7154 | 0.1426 | 0.6269 |
| 0.4015 | 4.0678 | 3600 | 0.7122 | 0.1354 | 0.5907 |
| 0.3837 | 4.2940 | 3800 | 0.7201 | 0.1384 | 0.5944 |
| 0.3412 | 4.5201 | 4000 | 0.7029 | 0.1397 | 0.6171 |
| 0.373 | 4.7462 | 4200 | 0.7318 | 0.1414 | 0.6292 |
| 0.3778 | 4.9723 | 4400 | 0.7155 | 0.1391 | 0.6166 |
| 0.3393 | 5.1979 | 4600 | 0.7126 | 0.1393 | 0.6213 |
| 0.3146 | 5.4240 | 4800 | 0.7337 | 0.1407 | 0.6201 |
| 0.3224 | 5.6501 | 5000 | 0.7114 | 0.1391 | 0.6126 |
| 0.3313 | 5.8762 | 5200 | 0.7224 | 0.1388 | 0.6062 |
| 0.3166 | 6.1018 | 5400 | 0.7191 | 0.1394 | 0.6191 |
| 0.3217 | 6.3279 | 5600 | 0.7007 | 0.1370 | 0.6041 |
| 0.3888 | 6.5540 | 5800 | 0.7155 | 0.1356 | 0.5895 |
| 0.3669 | 6.7801 | 6000 | 0.7153 | 0.1396 | 0.6172 |
| 0.3219 | 7.0057 | 6200 | 0.7100 | 0.1375 | 0.6066 |
| 0.3403 | 7.2318 | 6400 | 0.7135 | 0.1392 | 0.6076 |
| 0.3155 | 7.4579 | 6600 | 0.7133 | 0.1389 | 0.6112 |
| 0.3226 | 7.6840 | 6800 | 0.7107 | 0.1382 | 0.6107 |
| 0.3742 | 7.9101 | 7000 | 0.7112 | 0.1379 | 0.6038 |
| 0.3064 | 8.1357 | 7200 | 0.7102 | 0.1378 | 0.6060 |
| 0.3204 | 8.3618 | 7400 | 0.7083 | 0.1371 | 0.6088 |
| 0.2809 | 8.5879 | 7600 | 0.7119 | 0.1382 | 0.6116 |
| 0.2899 | 8.8140 | 7800 | 0.7077 | 0.1379 | 0.6079 |
| 0.3445 | 9.0396 | 8000 | 0.7082 | 0.1376 | 0.6066 |
| 0.3208 | 9.2657 | 8200 | 0.7039 | 0.1382 | 0.6053 |
| 0.3093 | 9.4918 | 8400 | 0.7087 | 0.1372 | 0.6048 |
| 0.3084 | 9.7179 | 8600 | 0.7098 | 0.1370 | 0.6046 |
| 0.3444 | 9.9440 | 8800 | 0.7049 | 0.1372 | 0.6029 |
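
The reported results come from the last evaluation, which is not necessarily the best one. The sketch below copies the (step, validation loss, CER, WER) values from the table above and finds the step that minimizes each metric. The helper name `best_by` is hypothetical, and whether intermediate checkpoints were saved is not stated in this card.

```python
FIELDS = ("step", "val_loss", "cer", "wer")

# (step, validation loss, CER, WER), copied from the training results table.
ROWS = [
    (200, 0.7482, 0.1501, 0.6867), (400, 0.7169, 0.1465, 0.6558),
    (600, 0.7037, 0.1422, 0.6081), (800, 0.7335, 0.1471, 0.6367),
    (1000, 0.7168, 0.1477, 0.6399), (1200, 0.7328, 0.1418, 0.6101),
    (1400, 0.7273, 0.1429, 0.6249), (1600, 0.7386, 0.1454, 0.6378),
    (1800, 0.7131, 0.1462, 0.6346), (2000, 0.7259, 0.1481, 0.6367),
    (2200, 0.7202, 0.1423, 0.6228), (2400, 0.7311, 0.1429, 0.6282),
    (2600, 0.7247, 0.1437, 0.6244), (2800, 0.7132, 0.1466, 0.6429),
    (3000, 0.7245, 0.1408, 0.6122), (3200, 0.7120, 0.1434, 0.6163),
    (3400, 0.7154, 0.1426, 0.6269), (3600, 0.7122, 0.1354, 0.5907),
    (3800, 0.7201, 0.1384, 0.5944), (4000, 0.7029, 0.1397, 0.6171),
    (4200, 0.7318, 0.1414, 0.6292), (4400, 0.7155, 0.1391, 0.6166),
    (4600, 0.7126, 0.1393, 0.6213), (4800, 0.7337, 0.1407, 0.6201),
    (5000, 0.7114, 0.1391, 0.6126), (5200, 0.7224, 0.1388, 0.6062),
    (5400, 0.7191, 0.1394, 0.6191), (5600, 0.7007, 0.1370, 0.6041),
    (5800, 0.7155, 0.1356, 0.5895), (6000, 0.7153, 0.1396, 0.6172),
    (6200, 0.7100, 0.1375, 0.6066), (6400, 0.7135, 0.1392, 0.6076),
    (6600, 0.7133, 0.1389, 0.6112), (6800, 0.7107, 0.1382, 0.6107),
    (7000, 0.7112, 0.1379, 0.6038), (7200, 0.7102, 0.1378, 0.6060),
    (7400, 0.7083, 0.1371, 0.6088), (7600, 0.7119, 0.1382, 0.6116),
    (7800, 0.7077, 0.1379, 0.6079), (8000, 0.7082, 0.1376, 0.6066),
    (8200, 0.7039, 0.1382, 0.6053), (8400, 0.7087, 0.1372, 0.6048),
    (8600, 0.7098, 0.1370, 0.6046), (8800, 0.7049, 0.1372, 0.6029),
]


def best_by(metric):
    """Return the row with the lowest value of `metric` (first one on ties)."""
    idx = FIELDS.index(metric)
    return min(ROWS, key=lambda row: row[idx])


if __name__ == "__main__":
    for metric in ("val_loss", "cer", "wer"):
        print(metric, dict(zip(FIELDS, best_by(metric))))
    print("final", dict(zip(FIELDS, ROWS[-1])))
```

On these numbers, validation loss is lowest at step 5600 (0.7007), CER at step 3600 (0.1354), and WER at step 5800 (0.5895), while the final step 8800 reports 0.7049, 0.1372, and 0.6029. The differences are small, and validation loss stays in a narrow band around 0.70 to 0.75 throughout while training loss keeps falling.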

Framework versions

  • Transformers 4.52.1
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4

Model size: 1.0B params (Safetensors, tensor type F32)