ssc-bas-mms-model-mix-adapt-max-longcv2

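This model is a fine-tuned version of facebook/mms-1b-all. The training dataset is not specified (the auto-generated card records it as None). At the final logged evaluation (step 7400), it achieves the following results on the evaluation set: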

  • Loss: 0.2587
  • CER (character error rate): 0.1279
  • WER (word error rate): 0.4567
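
For readers unfamiliar with the two error rates above, the following is a minimal sketch of how CER and WER are conventionally defined: the Levenshtein edit distance between reference and hypothesis, divided by the reference length, counted in characters for CER and in whitespace-separated words for WER. This card does not say which implementation produced the reported numbers, so the function names here are hypothetical and details such as text normalization or corpus-level aggregation may differ.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Insertions, deletions, and substitutions each cost 1.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate for one utterance (illustrative, no normalization)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate for one utterance (spaces count as characters)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

Read this way, a WER of 0.4567 means that the number of word-level edits is a little under half the number of reference words, while the much lower CER of 0.1279 suggests that many of the word errors involve only a few characters each.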

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30
  • mixed_precision_training: Native AMP
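
The batch-size and scheduler settings above can be made concrete with a small sketch. It reproduces the arithmetic behind total_train_batch_size and the shape of a linear schedule with warmup, written in plain Python rather than with the training library. The total number of optimizer steps is an assumption: the card does not state it, and 7500 is inferred from the results table (step 1000 falls at epoch 4.0, so roughly 250 steps per epoch over 30 epochs). The function names are hypothetical.

```python
LEARNING_RATE = 0.001
WARMUP_STEPS = 100
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 2
TOTAL_STEPS = 7500  # ASSUMPTION: ~250 steps/epoch x 30 epochs, not stated in the card


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TOTAL_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 50, 100, 3800, 7400):
        print(step, linear_lr(step))
```

A per-device batch of 8 with 2 accumulation steps gives the listed total of 16 only with a single device, which is why num_devices defaults to 1 here. Under the assumed step count, the learning rate peaks at 0.001 at step 100 and is close to zero by the last logged evaluation at step 7400.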

Training results

Training Loss | Epoch   | Step | Validation Loss | CER    | WER
------------- | ------- | ---- | --------------- | ------ | ------
3.0661        | 0.8016  | 200  | 2.9063          | 0.9241 | 0.9982
2.6227        | 1.6012  | 400  | 2.5538          | 0.8764 | 0.9879
2.37          | 2.4008  | 600  | 2.3405          | 0.8121 | 0.9507
2.0787        | 3.2004  | 800  | 1.8165          | 0.7173 | 0.9465
1.4179        | 4.0     | 1000 | 1.1348          | 0.4501 | 0.8524
1.0596        | 4.8016  | 1200 | 0.7486          | 0.2924 | 0.7123
0.8973        | 5.6012  | 1400 | 0.7237          | 0.2807 | 0.6842
0.8           | 6.4008  | 1600 | 0.6260          | 0.2360 | 0.6437
0.7781        | 7.2004  | 1800 | 0.5515          | 0.2156 | 0.6349
0.7073        | 8.0     | 2000 | 0.4825          | 0.1989 | 0.5913
0.7165        | 8.8016  | 2200 | 0.4225          | 0.1764 | 0.5626
0.6925        | 9.6012  | 2400 | 0.3848          | 0.1649 | 0.5309
0.6639        | 10.4008 | 2600 | 0.4002          | 0.1678 | 0.5420
0.6281        | 11.2004 | 2800 | 0.3676          | 0.1580 | 0.5215
0.5809        | 12.0    | 3000 | 0.3508          | 0.1552 | 0.5094
0.5696        | 12.8016 | 3200 | 0.3709          | 0.1579 | 0.5218
0.5627        | 13.6012 | 3400 | 0.3586          | 0.1548 | 0.5121
0.52          | 14.4008 | 3600 | 0.3307          | 0.1499 | 0.5079
0.5261        | 15.2004 | 3800 | 0.3141          | 0.1408 | 0.4882
0.4907        | 16.0    | 4000 | 0.3317          | 0.1535 | 0.4903
0.4883        | 16.8016 | 4200 | 0.2957          | 0.1403 | 0.4815
0.4508        | 17.6012 | 4400 | 0.3082          | 0.1395 | 0.4888
0.4399        | 18.4008 | 4600 | 0.3386          | 0.1435 | 0.4858
0.4519        | 19.2004 | 4800 | 0.2933          | 0.1392 | 0.4843
0.4395        | 20.0    | 5000 | 0.3248          | 0.1452 | 0.4906
0.4344        | 20.8016 | 5200 | 0.2868          | 0.1363 | 0.4761
0.3868        | 21.6012 | 5400 | 0.2769          | 0.1319 | 0.4646
0.385         | 22.4008 | 5600 | 0.2918          | 0.1350 | 0.4710
0.377         | 23.2004 | 5800 | 0.2655          | 0.1311 | 0.4634
0.3728        | 24.0    | 6000 | 0.2750          | 0.1327 | 0.4664
0.3377        | 24.8016 | 6200 | 0.2808          | 0.1311 | 0.4634
0.3606        | 25.6012 | 6400 | 0.2683          | 0.1307 | 0.4589
0.3368        | 26.4008 | 6600 | 0.2666          | 0.1309 | 0.4595
0.3563        | 27.2004 | 6800 | 0.2600          | 0.1303 | 0.4649
0.3224        | 28.0    | 7000 | 0.2585          | 0.1277 | 0.4604
0.3132        | 28.8016 | 7200 | 0.2596          | 0.1270 | 0.4555
0.3335        | 29.6012 | 7400 | 0.2587          | 0.1279 | 0.4567
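
The headline results correspond to the last logged evaluation (step 7400), which is not the minimum of any column: validation loss is lowest at step 7000, and both error rates are lowest at step 7200. The sketch below shows one way to pick a checkpoint per metric. It uses only the last four rows of the table, copied by hand, and the helper name best_by is hypothetical. Whether intermediate checkpoints were saved or published is not stated in this card.

```python
# (step, validation_loss, cer, wer) copied from the last four evaluation rows.
ROWS = [
    (6800, 0.2600, 0.1303, 0.4649),
    (7000, 0.2585, 0.1277, 0.4604),
    (7200, 0.2596, 0.1270, 0.4555),
    (7400, 0.2587, 0.1279, 0.4567),
]

COLUMNS = {"loss": 1, "cer": 2, "wer": 3}


def best_by(rows, metric: str):
    """Return the row with the lowest value for `metric` (lower is better)."""
    index = COLUMNS[metric]
    return min(rows, key=lambda row: row[index])


if __name__ == "__main__":
    for metric in COLUMNS:
        step = best_by(ROWS, metric)[0]
        print(f"lowest {metric}: step {step}")
```

The differences among these late checkpoints are small (for example, a WER of 0.4555 versus 0.4567), so the choice matters little in practice.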

Framework versions

  • Transformers 4.52.1
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4

Model details

  • Repository: ctaguchi/ssc-bas-mms-model-mix-adapt-max-longcv2
  • Model size: 1.0B params
  • Tensor type: F32
  • Weights format: Safetensors