ssc-bas-mms-model-mix-adapt-max3

This model is a fine-tuned version of facebook/mms-1b-all. The training dataset was not recorded in the card metadata. At the final checkpoint (step 8400), it achieves the following results on the evaluation set:

  • Loss: 0.1415
  • Cer (character error rate): 0.0973
  • Wer (word error rate): 0.3790
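
For readers unfamiliar with the two error rates above, here is a minimal sketch of how they are commonly defined: the edit distance between reference and hypothesis, divided by the reference length, counted in words for Wer and in characters for Cer. The function names and the example transcripts are hypothetical, and the card does not state which metric implementation or text normalization was used, so corpus-level numbers from an evaluation library may differ from this per-utterance version.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    return edit_distance(reference, hypothesis) / len(reference)


# Hypothetical transcripts, for illustration only (not from the evaluation set).
reference = "the cat sat on the mat"
hypothesis = "the cat sit on mat"

print(f"Wer: {wer(reference, hypothesis):.4f}")
print(f"Cer: {cer(reference, hypothesis):.4f}")
```

Because Cer is counted over characters, a single wrong letter costs far less than under Wer, where it invalidates the whole word. This is consistent with the reported Cer being much lower than the reported Wer.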

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 8
  • eval_batch_size: 6
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30
  • mixed_precision_training: Native AMP
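
The relationship between some of these values can be sketched in a few lines. The example below shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and what a linear schedule with 100 warmup steps looks like. Two things are assumptions rather than facts from this card: training on a single device, and a total of 8520 optimizer steps, estimated from the roughly 284 steps per epoch implied by the training results table times 30 epochs. The function names are illustrative.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Samples contributing to each optimizer step (num_devices=1 is an assumption)."""
    return per_device_batch * grad_accum_steps * num_devices


def linear_lr(step, base_lr=5e-4, warmup_steps=100, total_steps=8520):
    """Linear warmup from 0 to base_lr, then linear decay to 0.

    total_steps=8520 is an estimate (about 284 steps/epoch x 30 epochs),
    not a value stated in the card.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print("effective batch size:", effective_batch_size(8, 2))
for step in (0, 50, 100, 4310, 8520):
    print(f"step {step:>5}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the learning rate peaks at 0.0005 at step 100 and has fallen to about half of that near the midpoint of training.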

Training results

Training Loss | Epoch | Step | Validation Loss | Cer | Wer
0.4901 | 0.7055 | 200 | 0.1873 | 0.1106 | 0.4235
0.3613 | 1.4092 | 400 | 0.1625 | 0.1066 | 0.4147
0.3119 | 2.1129 | 600 | 0.1547 | 0.1074 | 0.4204
0.3175 | 2.8183 | 800 | 0.1467 | 0.1034 | 0.4090
0.2935 | 3.5220 | 1000 | 0.1487 | 0.1027 | 0.4014
0.2797 | 4.2257 | 1200 | 0.1433 | 0.1014 | 0.3972
0.2612 | 4.9312 | 1400 | 0.1384 | 0.1008 | 0.3914
0.2594 | 5.6349 | 1600 | 0.1383 | 0.1004 | 0.3869
0.2461 | 6.3386 | 1800 | 0.1391 | 0.1019 | 0.3975
0.2402 | 7.0423 | 2000 | 0.1411 | 0.1023 | 0.4005
0.2389 | 7.7478 | 2200 | 0.1395 | 0.0998 | 0.3863
0.2315 | 8.4515 | 2400 | 0.1358 | 0.1008 | 0.3935
0.2142 | 9.1552 | 2600 | 0.1372 | 0.0990 | 0.3845
0.2099 | 9.8607 | 2800 | 0.1393 | 0.0986 | 0.3860
0.2150 | 10.5644 | 3000 | 0.1346 | 0.0994 | 0.3881
0.2065 | 11.2681 | 3200 | 0.1366 | 0.1002 | 0.3887
0.2107 | 11.9735 | 3400 | 0.1341 | 0.0990 | 0.3875
0.1844 | 12.6772 | 3600 | 0.1394 | 0.0984 | 0.3799
0.1860 | 13.3810 | 3800 | 0.1346 | 0.0980 | 0.3820
0.1754 | 14.0847 | 4000 | 0.1355 | 0.0982 | 0.3808
0.1758 | 14.7901 | 4200 | 0.1349 | 0.0975 | 0.3790
0.1785 | 15.4938 | 4400 | 0.1393 | 0.0980 | 0.3796
0.1764 | 16.1975 | 4600 | 0.1349 | 0.0984 | 0.3817
0.1715 | 16.9030 | 4800 | 0.1322 | 0.0973 | 0.3778
0.1625 | 17.6067 | 5000 | 0.1359 | 0.0991 | 0.3866
0.1625 | 18.3104 | 5200 | 0.1348 | 0.0989 | 0.3838
0.1683 | 19.0141 | 5400 | 0.1371 | 0.0971 | 0.3790
0.1507 | 19.7196 | 5600 | 0.1337 | 0.0965 | 0.3769
0.1465 | 20.4233 | 5800 | 0.1377 | 0.0973 | 0.3793
0.1353 | 21.1270 | 6000 | 0.1366 | 0.0977 | 0.3775
0.1471 | 21.8325 | 6200 | 0.1411 | 0.0975 | 0.3799
0.1449 | 22.5362 | 6400 | 0.1400 | 0.0976 | 0.3820
0.1258 | 23.2399 | 6600 | 0.1396 | 0.0977 | 0.3778
0.1364 | 23.9453 | 6800 | 0.1419 | 0.0977 | 0.3763
0.1435 | 24.6490 | 7000 | 0.1403 | 0.0970 | 0.3763
0.1279 | 25.3527 | 7200 | 0.1410 | 0.0975 | 0.3760
0.1378 | 26.0564 | 7400 | 0.1395 | 0.0966 | 0.3742
0.1253 | 26.7619 | 7600 | 0.1447 | 0.0979 | 0.3784
0.1322 | 27.4656 | 7800 | 0.1417 | 0.0970 | 0.3793
0.1208 | 28.1693 | 8000 | 0.1429 | 0.0980 | 0.3799
0.1205 | 28.8748 | 8200 | 0.1413 | 0.0975 | 0.3787
0.1232 | 29.5785 | 8400 | 0.1415 | 0.0973 | 0.3790
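
The final checkpoint (step 8400) is not the best row on any of the three evaluation metrics: the lowest validation loss is at step 4800 (0.1322), the lowest Cer at step 5600 (0.0965), and the lowest Wer at step 7400 (0.3742). A small sketch of how such a comparison could be scripted follows. The `EvalRow` type and `best_by` helper are hypothetical names, and only a subset of rows is copied from the table to keep the example short.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    step: int
    val_loss: float
    cer: float
    wer: float


# Subset of rows copied from the training results table (not the full log).
rows = [
    EvalRow(200, 0.1873, 0.1106, 0.4235),
    EvalRow(4800, 0.1322, 0.0973, 0.3778),
    EvalRow(5600, 0.1337, 0.0965, 0.3769),
    EvalRow(7400, 0.1395, 0.0966, 0.3742),
    EvalRow(8400, 0.1415, 0.0973, 0.3790),
]


def best_by(rows, metric):
    """Return the row with the lowest value for the given metric name."""
    return min(rows, key=lambda row: getattr(row, metric))


for metric in ("val_loss", "cer", "wer"):
    best = best_by(rows, metric)
    print(f"lowest {metric}: {getattr(best, metric):.4f} at step {best.step}")
```

The differences between these checkpoints are small (for example, 0.3742 versus 0.3790 Wer), so whether an earlier checkpoint is preferable depends on which metric matters for the intended use.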

Framework versions

  • Transformers 4.52.1
  • PyTorch 2.9.1+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.4

Model size: 1.0B params (Safetensors, tensor type F32)