ssc-qxp-mms-model-mix-adapt-max2

This model is a fine-tuned version of facebook/mms-1b-all. The training dataset is not specified. It achieves the following results on the evaluation set:

  • Loss: 0.3115
  • CER (character error rate): 0.0961
  • WER (word error rate): 0.5349
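
For reference, CER and WER are conventionally defined as the Levenshtein edit distance between the hypothesis and the reference, divided by the reference length in characters or words respectively. The sketch below is a minimal pure-Python illustration of that definition. It is not the evaluation script used for this model, and the function names `edit_distance`, `wer`, and `cer` are chosen here for illustration. Text normalization (casing, punctuation) is not handled and would affect real scores.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance (substitutions, deletions, insertions) between two sequences."""
    # prev[j] is the distance between the ref prefix processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate; assumes a non-empty reference."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; assumes a non-empty reference."""
    return edit_distance(reference, hypothesis) / len(reference)


print(wer("a b", "a x y z"))  # 1.5: one substitution plus two insertions over two reference words
```

Because insertions count as errors, WER can exceed 1.0, which is consistent with the early-epoch values slightly above 1.0 in the training results.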

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 40
  • mixed_precision_training: Native AMP
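
The listed values relate to each other as sketched below. The sketch assumes a single training device, collects the hyperparameters as keyword arguments named after `transformers.TrainingArguments` fields, and reproduces a linear warmup and decay schedule in plain Python. `output_dir` and `TOTAL_STEPS` are placeholders: the card does not state an output path, and the total step count is an estimate inferred from the training log (about 201 optimizer steps per epoch over 40 epochs). The mapping of "Native AMP" to `fp16=True` is also an assumption.

```python
# Hyperparameters from the list above, keyed by transformers.TrainingArguments field names.
training_kwargs = {
    "output_dir": "./mms-finetune-output",  # hypothetical path, not from the card
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 6,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 40,
    "fp16": True,  # assumed mapping for "Native AMP"; needs a CUDA device
}

NUM_DEVICES = 1     # assumption: the card does not list a device count
TOTAL_STEPS = 8040  # assumption: ~201 optimizer steps per epoch x 40 epochs


def effective_batch_size(kwargs: dict, num_devices: int = NUM_DEVICES) -> int:
    """Number of samples contributing to each optimizer update."""
    return (kwargs["per_device_train_batch_size"]
            * kwargs["gradient_accumulation_steps"]
            * num_devices)


def linear_lr(step: int, peak_lr: float = 1e-3, warmup_steps: int = 100,
              total_steps: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to peak_lr, then linear decay to 0 at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(training_kwargs))  # 16, matching total_train_batch_size
print(linear_lr(50), linear_lr(100), linear_lr(TOTAL_STEPS))
```

If the `transformers` library is installed, the dictionary could be passed as `TrainingArguments(**training_kwargs)`; that call is left out here so the sketch runs without extra dependencies.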

Training results

Training Loss  Epoch    Step  Validation Loss  CER     WER
4.2051         0.9975   200   4.1710           0.8497  1.0
2.7502         1.9925   400   2.8237           0.8696  1.0009
2.5117         2.9875   600   2.5394           0.8131  1.0037
2.3454         3.9825   800   2.3081           0.7919  1.0018
1.9502         4.9776   1000  1.7710           0.6450  0.9972
1.3568         5.9726   1200  1.1673           0.3991  0.9614
0.9912         6.9676   1400  0.8145           0.2613  0.8621
0.8234         7.9626   1600  0.7160           0.2247  0.8346
0.6972         8.9576   1800  0.6836           0.1997  0.7767
0.6133         9.9526   2000  0.5995           0.1721  0.7491
0.5654         10.9476  2200  0.5727           0.1684  0.6967
0.5331         11.9426  2400  0.5843           0.1603  0.7022
0.4988         12.9377  2600  0.5104           0.1494  0.6746
0.4952         13.9327  2800  0.4855           0.1416  0.6489
0.4607         14.9277  3000  0.4461           0.1365  0.6553
0.43           15.9227  3200  0.4764           0.1399  0.6415
0.3792         16.9177  3400  0.5336           0.1255  0.5938
0.3817         17.9127  3600  0.4101           0.1323  0.6287
0.3681         18.9077  3800  0.3829           0.1188  0.6094
0.3436         19.9027  4000  0.4131           0.1190  0.5956
0.3244         20.8978  4200  0.3923           0.1233  0.6029
0.3224         21.8928  4400  0.3739           0.1194  0.5956
0.2962         22.8878  4600  0.3631           0.1139  0.5965
0.303          23.8828  4800  0.3541           0.1123  0.5818
0.2881         24.8778  5000  0.3626           0.1132  0.5680
0.2844         25.8728  5200  0.3890           0.1185  0.5892
0.269          26.8678  5400  0.3426           0.1072  0.5763
0.2573         27.8628  5600  0.3284           0.1068  0.5653
0.2377         28.8579  5800  0.3249           0.1023  0.5524
0.2397         29.8529  6000  0.3283           0.1028  0.5515
0.2306         30.8479  6200  0.3960           0.1082  0.5680
0.23           31.8429  6400  0.3104           0.0999  0.5395
0.2207         32.8379  6600  0.3318           0.1063  0.5662
0.2102         33.8329  6800  0.3183           0.1025  0.5653
0.1997         34.8279  7000  0.3179           0.0987  0.5285
0.2022         35.8229  7200  0.3345           0.0953  0.5303
0.193          36.8180  7400  0.3143           0.0977  0.5432
0.1839         37.8130  7600  0.3121           0.0978  0.5423
0.1788         38.8080  7800  0.3107           0.0973  0.5322
0.1807         39.8030  8000  0.3115           0.0961  0.5349
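
The final row is not the best one on every metric. As a small illustration, the sketch below takes an excerpt of the log (the rows from step 6400 onward, copied from the results above) and picks the step that minimizes a chosen metric. The helper name `best_step` and the `EvalRow` layout are illustrative choices, not part of the training code.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    cer: float
    wer: float


# Excerpt of the training results above (steps 6400 to 8000).
ROWS = [
    EvalRow(6400, 0.3104, 0.0999, 0.5395),
    EvalRow(6600, 0.3318, 0.1063, 0.5662),
    EvalRow(6800, 0.3183, 0.1025, 0.5653),
    EvalRow(7000, 0.3179, 0.0987, 0.5285),
    EvalRow(7200, 0.3345, 0.0953, 0.5303),
    EvalRow(7400, 0.3143, 0.0977, 0.5432),
    EvalRow(7600, 0.3121, 0.0978, 0.5423),
    EvalRow(7800, 0.3107, 0.0973, 0.5322),
    EvalRow(8000, 0.3115, 0.0961, 0.5349),
]


def best_step(rows: List[EvalRow], metric: str) -> EvalRow:
    """Return the row with the lowest value of the given metric (lower is better)."""
    return min(rows, key=lambda row: getattr(row, metric))


for metric in ("val_loss", "cer", "wer"):
    row = best_step(ROWS, metric)
    print(f"best {metric}: {getattr(row, metric)} at step {row.step}")
```

On this excerpt, validation loss is lowest at step 6400, CER at step 7200, and WER at step 7000, while the reported evaluation results correspond to the final step 8000.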

Framework versions

  • Transformers 4.52.1
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4

Model size: 1.0B params (Safetensors, tensor type F32)