# mms-1b-all-bemgen-combined-sd-1e-0
This model is a fine-tuned version of facebook/mms-1b-all on the BEMGEN - BEM dataset. It achieves the following results on the evaluation set:
- Loss: 1.6674
- Wer: 0.4200
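
As a quick reference, the checkpoint can be loaded for inference with the standard `transformers` CTC classes. The snippet below is a minimal sketch, not the exact evaluation setup used for this card; the audio file path is a placeholder, and 16 kHz mono input is assumed because that is what MMS-style Wav2Vec2 models expect.

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-all-bemgen-combined-sd-1e-0"

processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load a 16 kHz mono recording (path is a placeholder).
speech, _ = librosa.load("example.wav", sr=16_000)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token at each frame.
pred_ids = torch.argmax(logits, dim=-1)
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```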
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.000275
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 30.0
- mixed_precision_training: Native AMP
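
For readers who want to reproduce a comparable run, the hyperparameters above roughly correspond to a `transformers` `TrainingArguments` configuration like the sketch below. This is an assumed reconstruction, not the actual training script; the output directory and any argument not listed above are placeholders.

```python
from transformers import TrainingArguments

# Hypothetical mapping of the listed hyperparameters onto TrainingArguments.
training_args = TrainingArguments(
    output_dir="./mms-1b-all-bemgen-combined-sd-1e-0",  # placeholder path
    learning_rate=2.75e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,   # effective train batch size of 16
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,                       # native AMP mixed-precision training
)
```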
### Training results
| Training Loss | Epoch | Step | Validation Loss | Wer |
|---|---|---|---|---|
| 8.2742 | 0.5076 | 100 | 4.9931 | 0.9998 |
| 4.7522 | 1.0152 | 200 | 4.5563 | 1.0079 |
| 4.3117 | 1.5228 | 300 | 3.5193 | 0.9999 |
| 2.2552 | 2.0305 | 400 | 1.7639 | 0.4427 |
| 1.897 | 2.5381 | 500 | 1.7190 | 0.4245 |
| 1.8635 | 3.0457 | 600 | 1.6986 | 0.4155 |
| 1.8428 | 3.5533 | 700 | 1.6909 | 0.4437 |
| 1.8221 | 4.0609 | 800 | 1.6860 | 0.4046 |
| 1.8201 | 4.5685 | 900 | 1.6846 | 0.4118 |
| 1.8056 | 5.0761 | 1000 | 1.6819 | 0.4067 |
| 1.8023 | 5.5838 | 1100 | 1.6789 | 0.4234 |
| 1.7851 | 6.0914 | 1200 | 1.6751 | 0.4505 |
| 1.7847 | 6.5990 | 1300 | 1.6757 | 0.4221 |
| 1.7841 | 7.1066 | 1400 | 1.6742 | 0.4082 |
| 1.7895 | 7.6142 | 1500 | 1.6732 | 0.4238 |
| 1.7754 | 8.1218 | 1600 | 1.6718 | 0.4478 |
| 1.7766 | 8.6294 | 1700 | 1.6739 | 0.4180 |
| 1.7677 | 9.1371 | 1800 | 1.6700 | 0.4344 |
| 1.7688 | 9.6447 | 1900 | 1.6686 | 0.4184 |
| 1.761 | 10.1523 | 2000 | 1.6682 | 0.4188 |
| 1.7637 | 10.6599 | 2100 | 1.6680 | 0.4521 |
| 1.7631 | 11.1675 | 2200 | 1.6697 | 0.4530 |
| 1.7744 | 11.6751 | 2300 | 1.6630 | 0.4169 |
| 1.7518 | 12.1827 | 2400 | 1.6659 | 0.4193 |
| 1.7476 | 12.6904 | 2500 | 1.6618 | 0.4385 |
| 1.7591 | 13.1980 | 2600 | 1.6645 | 0.4384 |
| 1.7486 | 13.7056 | 2700 | 1.6612 | 0.4367 |
| 1.7424 | 14.2132 | 2800 | 1.6612 | 0.4379 |
| 1.7528 | 14.7208 | 2900 | 1.6627 | 0.4076 |
| 1.753 | 15.2284 | 3000 | 1.6610 | 0.4288 |
| 1.7472 | 15.7360 | 3100 | 1.6635 | 0.4311 |
| 1.7424 | 16.2437 | 3200 | 1.6627 | 0.4025 |
| 1.7416 | 16.7513 | 3300 | 1.6614 | 0.4071 |
| 1.7367 | 17.2589 | 3400 | 1.6594 | 0.4278 |
| 1.745 | 17.7665 | 3500 | 1.6609 | 0.4300 |
| 1.7374 | 18.2741 | 3600 | 1.6648 | 0.4279 |
| 1.7398 | 18.7817 | 3700 | 1.6606 | 0.4150 |
| 1.7279 | 19.2893 | 3800 | 1.6586 | 0.4459 |
| 1.74 | 19.7970 | 3900 | 1.6593 | 0.3867 |
| 1.7287 | 20.3046 | 4000 | 1.6605 | 0.4226 |
| 1.7347 | 20.8122 | 4100 | 1.6586 | 0.4363 |
| 1.727 | 21.3198 | 4200 | 1.6576 | 0.3938 |
| 1.7314 | 21.8274 | 4300 | 1.6562 | 0.4252 |
| 1.7209 | 22.3350 | 4400 | 1.6572 | 0.3905 |
| 1.7282 | 22.8426 | 4500 | 1.6576 | 0.4304 |
| 1.7323 | 23.3503 | 4600 | 1.6568 | 0.4132 |
| 1.7159 | 23.8579 | 4700 | 1.6555 | 0.4313 |
| 1.7188 | 24.3655 | 4800 | 1.6574 | 0.4013 |
| 1.7301 | 24.8731 | 4900 | 1.6568 | 0.4254 |
| 1.7197 | 25.3807 | 5000 | 1.6560 | 0.4436 |
| 1.7211 | 25.8883 | 5100 | 1.6560 | 0.4274 |
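
The Wer column reports word error rate on the evaluation set (lower is better). A minimal sketch of how such a score is typically computed with the `evaluate` library is shown below; the prediction and reference strings are illustrative placeholders, not data from the BEMGEN set.

```python
import evaluate

wer_metric = evaluate.load("wer")

# Illustrative placeholders only; real scoring uses decoded model outputs
# and the reference transcripts from the evaluation split.
predictions = ["this is a sample transcription"]
references = ["this is the sample transcription"]

wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")
```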
### Framework versions
- Transformers 4.52.4
- Pytorch 2.9.0+cu128
- Datasets 4.4.1
- Tokenizers 0.21.4