# mms-trilingual-dv-ar-en
This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on an unspecified dataset. It achieves the following results on the evaluation set:

- Loss: 0.1676
- Wer: 0.2509
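For reference, Wer is word error rate: the number of word-level edits (substitutions, insertions, deletions) needed to turn the hypothesis into the reference, divided by the reference length. A minimal sketch of the metric (not the exact evaluation code used for this card):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(
                dp[i - 1][j] + 1,        # deletion
                dp[i][j - 1] + 1,        # insertion
                dp[i - 1][j - 1] + cost, # substitution (or match)
            )
    return dp[-1][-1] / len(ref)
```

A Wer of 0.2509 therefore means roughly one word-level error per four reference words.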
## Model description
More information needed
## Intended uses & limitations
More information needed
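Since this is a CTC speech-recognition checkpoint derived from MMS, inference should work through the standard `transformers` ASR pipeline. A minimal sketch, assuming a hypothetical repository id (substitute the real one) and a 16 kHz-compatible audio file:

```python
from transformers import pipeline

# Hypothetical repo id -- replace with the actual location of this checkpoint.
asr = pipeline(
    "automatic-speech-recognition",
    model="your-username/mms-trilingual-dv-ar-en",
)

# The pipeline resamples the input to the 16 kHz rate MMS models expect.
result = asr("sample.wav")
print(result["text"])
```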
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 16
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10
- mixed_precision_training: Native AMP
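The `total_train_batch_size` above is the effective batch size: the per-device batch size multiplied by the gradient-accumulation steps. A quick check with the values listed:

```python
train_batch_size = 4             # per-device micro-batch size
gradient_accumulation_steps = 4  # micro-batches accumulated per optimizer step

# Gradients from 4 micro-batches of 4 samples are summed before each
# optimizer step, so each update sees 16 samples.
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)  # 16
```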
### Training results
| Training Loss | Epoch | Step | Validation Loss | Model Preparation Time | Wer |
|---|---|---|---|---|---|
| 1.2154 | 0.2581 | 250 | 0.9458 | 0.0137 | 0.5295 |
| 0.9794 | 0.5163 | 500 | 0.8440 | 0.0137 | 0.5125 |
| 1.0258 | 0.7744 | 750 | 0.8450 | 0.0137 | 0.5020 |
| 0.9701 | 1.0320 | 1000 | 0.8394 | 0.0137 | 0.5188 |
| 1.0218 | 1.2901 | 1250 | 0.7713 | 0.0137 | 0.5261 |
| 0.8837 | 1.5483 | 1500 | 0.6487 | 0.0137 | 0.4753 |
| 0.6842 | 1.8064 | 1750 | 0.4759 | 0.0137 | 0.4750 |
| 0.5637 | 2.0640 | 2000 | 0.4537 | 0.0137 | 0.4721 |
| 0.5311 | 2.3221 | 2250 | 0.4081 | 0.0137 | 0.4645 |
| 0.5178 | 2.5803 | 2500 | 0.3942 | 0.0137 | 0.4582 |
| 0.5217 | 2.8384 | 2750 | 0.3773 | 0.0137 | 0.4499 |
| 0.4585 | 3.0960 | 3000 | 0.3777 | 0.0137 | 0.4349 |
| 0.4436 | 3.3542 | 3250 | 0.3533 | 0.0137 | 0.4144 |
| 0.4485 | 3.6123 | 3500 | 0.3508 | 0.0137 | 0.4231 |
| 0.4181 | 3.8704 | 3750 | 0.3480 | 0.0137 | 0.4328 |
| 0.389 | 4.1280 | 4000 | 0.3239 | 0.0137 | 0.3931 |
| 0.4048 | 4.3862 | 4250 | 0.3356 | 0.0137 | 0.4217 |
| 0.3756 | 4.6443 | 4500 | 0.3084 | 0.0137 | 0.3796 |
| 0.3721 | 4.9024 | 4750 | 0.3000 | 0.0137 | 0.3788 |
| 0.334 | 5.1600 | 5000 | 0.2935 | 0.0137 | 0.3553 |
| 0.3029 | 5.4182 | 5250 | 0.2864 | 0.0137 | 0.3482 |
| 0.3185 | 5.6763 | 5500 | 0.2754 | 0.0137 | 0.3418 |
| 0.2919 | 5.9344 | 5750 | 0.2651 | 0.0137 | 0.3330 |
| 0.2781 | 6.1920 | 6000 | 0.1975 | 0.0137 | 0.2901 |
| 0.2662 | 6.4502 | 6250 | 0.1923 | 0.0137 | 0.2871 |
| 0.2698 | 6.7083 | 6500 | 0.1861 | 0.0137 | 0.2841 |
| 0.282 | 6.9664 | 6750 | 0.1867 | 0.0137 | 0.2805 |
| 0.2528 | 7.2241 | 7000 | 0.1809 | 0.0137 | 0.2762 |
| 0.2579 | 7.4822 | 7250 | 0.1779 | 0.0137 | 0.2668 |
| 0.22 | 7.7403 | 7500 | 0.1782 | 0.0137 | 0.2642 |
| 0.2177 | 7.9985 | 7750 | 0.1740 | 0.0137 | 0.2604 |
| 0.2096 | 8.2561 | 8000 | 0.1728 | 0.0137 | 0.2609 |
| 0.1942 | 8.5142 | 8250 | 0.1697 | 0.0137 | 0.2562 |
| 0.2121 | 8.7723 | 8500 | 0.1677 | 0.0137 | 0.2536 |
| 0.1835 | 9.0299 | 8750 | 0.1683 | 0.0137 | 0.2536 |
| 0.2002 | 9.2881 | 9000 | 0.1678 | 0.0137 | 0.2522 |
| 0.2144 | 9.5462 | 9250 | 0.1676 | 0.0137 | 0.2519 |
| 0.1918 | 9.8043 | 9500 | 0.1676 | 0.0137 | 0.2509 |
### Framework versions
- Transformers 4.57.6
- Pytorch 2.9.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.2