---
library_name: transformers
license: cc-by-nc-4.0
base_model: facebook/mms-1b-all
tags:
- generated_from_trainer
metrics:
- wer
- bleu
- rouge
model-index:
- name: MMS_10langs_simultane
  results: []
---

# MMS_10langs_simultane

This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all).
It achieves the following results on the evaluation set:
- Loss: 0.3003
- WER: 0.3716
- BLEU: 0.4691
- ROUGE-1: 0.6871
- ROUGE-2: 0.5597
- ROUGE-L: 0.6859
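
Since the base model is an MMS/Wav2Vec2 CTC checkpoint, a minimal inference sketch along the usual `transformers` lines might look as follows. The repository id and audio path are placeholders, and the assumption is that this fine-tune ships the standard processor files:

```python
# Minimal inference sketch, assuming this checkpoint follows the standard
# MMS / Wav2Vec2 CTC fine-tuning setup. The repo id and audio path are placeholders.
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "MMS_10langs_simultane"  # placeholder: replace with the actual Hub repo id

processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# MMS models expect 16 kHz mono audio.
speech, _ = librosa.load("example.wav", sr=16_000, mono=True)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```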

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (an illustrative `TrainingArguments` sketch follows the list):
- learning_rate: 0.001
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 10
- total_train_batch_size: 40
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 100
- mixed_precision_training: Native AMP
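
For reference, the hyperparameters above map roughly onto the `TrainingArguments` below. This is a reconstruction for illustration, not the original training script, and the per-epoch evaluation strategy is an assumption:

```python
# Approximate TrainingArguments matching the hyperparameters listed above.
# This is a reconstruction for illustration; the original script may differ.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="MMS_10langs_simultane",
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=10,  # effective train batch size of 40
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=100,
    fp16=True,                       # native AMP mixed precision
    eval_strategy="epoch",           # assumption: metrics were logged once per epoch
)
```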

### Training results

| Training Loss | Epoch | Step  | Validation Loss | WER    | BLEU   | ROUGE-1 | ROUGE-2 | ROUGE-L |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:------:|:------:|:------:|:------:|
| 2.2952        | 1.0   | 450   | 0.3890          | 0.4483 | 0.3755 | 0.6177 | 0.4654 | 0.6160 |
| 0.631         | 2.0   | 900   | 0.3822          | 0.4494 | 0.3774 | 0.6166 | 0.4639 | 0.6149 |
| 0.6019        | 3.0   | 1350  | 0.3704          | 0.4405 | 0.3805 | 0.6263 | 0.4729 | 0.6248 |
| 0.5863        | 4.0   | 1800  | 0.3545          | 0.4255 | 0.4021 | 0.6419 | 0.4948 | 0.6399 |
| 0.5718        | 5.0   | 2250  | 0.3480          | 0.4241 | 0.4044 | 0.6414 | 0.4948 | 0.6397 |
| 0.561         | 6.0   | 2700  | 0.3464          | 0.4195 | 0.4081 | 0.6465 | 0.5013 | 0.6448 |
| 0.5568        | 7.0   | 3150  | 0.3487          | 0.4214 | 0.4055 | 0.6433 | 0.4976 | 0.6414 |
| 0.547         | 8.0   | 3600  | 0.3442          | 0.4180 | 0.4101 | 0.6443 | 0.5009 | 0.6427 |
| 0.5462        | 9.0   | 4050  | 0.3395          | 0.4210 | 0.4101 | 0.6478 | 0.5042 | 0.6462 |
| 0.5354        | 10.0  | 4500  | 0.3377          | 0.4136 | 0.4141 | 0.6510 | 0.5083 | 0.6493 |
| 0.5358        | 11.0  | 4950  | 0.3405          | 0.4141 | 0.4107 | 0.6495 | 0.5059 | 0.6481 |
| 0.5295        | 12.0  | 5400  | 0.3399          | 0.4138 | 0.4167 | 0.6501 | 0.5070 | 0.6485 |
| 0.5261        | 13.0  | 5850  | 0.3351          | 0.4084 | 0.4222 | 0.6557 | 0.5144 | 0.6540 |
| 0.5238        | 14.0  | 6300  | 0.3345          | 0.4110 | 0.4199 | 0.6513 | 0.5084 | 0.6494 |
| 0.52          | 15.0  | 6750  | 0.3346          | 0.4104 | 0.4198 | 0.6511 | 0.5083 | 0.6498 |
| 0.5126        | 16.0  | 7200  | 0.3332          | 0.4127 | 0.4181 | 0.6515 | 0.5087 | 0.6498 |
| 0.5128        | 17.0  | 7650  | 0.3331          | 0.4043 | 0.4263 | 0.6597 | 0.5199 | 0.6578 |
| 0.5069        | 18.0  | 8100  | 0.3284          | 0.4024 | 0.4290 | 0.6613 | 0.5231 | 0.6597 |
| 0.5074        | 19.0  | 8550  | 0.3351          | 0.4090 | 0.4207 | 0.6565 | 0.5168 | 0.6547 |
| 0.4996        | 20.0  | 9000  | 0.3342          | 0.4035 | 0.4308 | 0.6565 | 0.5177 | 0.6544 |
| 0.4991        | 21.0  | 9450  | 0.3281          | 0.4030 | 0.4275 | 0.6678 | 0.5312 | 0.6664 |
| 0.4931        | 22.0  | 9900  | 0.3268          | 0.4075 | 0.4276 | 0.6526 | 0.5140 | 0.6508 |
| 0.4959        | 23.0  | 10350 | 0.3290          | 0.4043 | 0.4282 | 0.6594 | 0.5211 | 0.6580 |
| 0.4937        | 24.0  | 10800 | 0.3304          | 0.4115 | 0.4215 | 0.6516 | 0.5108 | 0.6495 |
| 0.4889        | 25.0  | 11250 | 0.3226          | 0.3999 | 0.4333 | 0.6605 | 0.5221 | 0.6587 |
| 0.4871        | 26.0  | 11700 | 0.3221          | 0.3974 | 0.4357 | 0.6618 | 0.5251 | 0.6604 |
| 0.4828        | 27.0  | 12150 | 0.3338          | 0.4048 | 0.4310 | 0.6510 | 0.5130 | 0.6499 |
| 0.4843        | 28.0  | 12600 | 0.3206          | 0.3974 | 0.4343 | 0.6661 | 0.5283 | 0.6642 |
| 0.4782        | 29.0  | 13050 | 0.3215          | 0.3994 | 0.4357 | 0.6603 | 0.5234 | 0.6587 |
| 0.4738        | 30.0  | 13500 | 0.3218          | 0.3964 | 0.4375 | 0.6637 | 0.5254 | 0.6618 |
| 0.4735        | 31.0  | 13950 | 0.3231          | 0.4000 | 0.4317 | 0.6632 | 0.5260 | 0.6610 |
| 0.4707        | 32.0  | 14400 | 0.3183          | 0.3917 | 0.4421 | 0.6705 | 0.5355 | 0.6688 |
| 0.4692        | 33.0  | 14850 | 0.3198          | 0.3985 | 0.4351 | 0.6663 | 0.5312 | 0.6651 |
| 0.4672        | 34.0  | 15300 | 0.3137          | 0.3932 | 0.4395 | 0.6717 | 0.5394 | 0.6699 |
| 0.4668        | 35.0  | 15750 | 0.3135          | 0.3947 | 0.4391 | 0.6676 | 0.5321 | 0.6657 |
| 0.4645        | 36.0  | 16200 | 0.3169          | 0.3958 | 0.4397 | 0.6672 | 0.5324 | 0.6652 |
| 0.4645        | 37.0  | 16650 | 0.3147          | 0.3923 | 0.4402 | 0.6694 | 0.5338 | 0.6678 |
| 0.4617        | 38.0  | 17100 | 0.3160          | 0.3924 | 0.4448 | 0.6668 | 0.5301 | 0.6647 |
| 0.4592        | 39.0  | 17550 | 0.3132          | 0.3883 | 0.4477 | 0.6736 | 0.5399 | 0.6718 |
| 0.456         | 40.0  | 18000 | 0.3108          | 0.3888 | 0.4474 | 0.6729 | 0.5391 | 0.6710 |
| 0.4562        | 41.0  | 18450 | 0.3138          | 0.3921 | 0.4435 | 0.6680 | 0.5340 | 0.6662 |
| 0.4507        | 42.0  | 18900 | 0.3137          | 0.3918 | 0.4426 | 0.6723 | 0.5385 | 0.6707 |
| 0.4521        | 43.0  | 19350 | 0.3147          | 0.3899 | 0.4479 | 0.6687 | 0.5335 | 0.6671 |
| 0.4492        | 44.0  | 19800 | 0.3121          | 0.3892 | 0.4473 | 0.6693 | 0.5353 | 0.6679 |
| 0.4481        | 45.0  | 20250 | 0.3109          | 0.3903 | 0.4474 | 0.6696 | 0.5353 | 0.6682 |
| 0.4458        | 46.0  | 20700 | 0.3146          | 0.3861 | 0.4505 | 0.6733 | 0.5397 | 0.6720 |
| 0.4469        | 47.0  | 21150 | 0.3107          | 0.3877 | 0.4495 | 0.6731 | 0.5407 | 0.6717 |
| 0.446         | 48.0  | 21600 | 0.3100          | 0.3877 | 0.4500 | 0.6742 | 0.5426 | 0.6728 |
| 0.4453        | 49.0  | 22050 | 0.3099          | 0.3885 | 0.4506 | 0.6732 | 0.5410 | 0.6715 |
| 0.4412        | 50.0  | 22500 | 0.3136          | 0.3860 | 0.4485 | 0.6779 | 0.5459 | 0.6763 |
| 0.4396        | 51.0  | 22950 | 0.3181          | 0.3879 | 0.4488 | 0.6701 | 0.5377 | 0.6688 |
| 0.4371        | 52.0  | 23400 | 0.3102          | 0.3860 | 0.4499 | 0.6772 | 0.5446 | 0.6757 |
| 0.4376        | 53.0  | 23850 | 0.3098          | 0.3884 | 0.4489 | 0.6727 | 0.5391 | 0.6704 |
| 0.4356        | 54.0  | 24300 | 0.3096          | 0.3837 | 0.4552 | 0.6731 | 0.5426 | 0.6716 |
| 0.4324        | 55.0  | 24750 | 0.3115          | 0.3832 | 0.4548 | 0.6801 | 0.5497 | 0.6787 |
| 0.4331        | 56.0  | 25200 | 0.3089          | 0.3869 | 0.4527 | 0.6756 | 0.5458 | 0.6740 |
| 0.4301        | 57.0  | 25650 | 0.3084          | 0.3848 | 0.4541 | 0.6778 | 0.5467 | 0.6763 |
| 0.4307        | 58.0  | 26100 | 0.3128          | 0.3823 | 0.4553 | 0.6759 | 0.5460 | 0.6741 |
| 0.43          | 59.0  | 26550 | 0.3070          | 0.3813 | 0.4559 | 0.6799 | 0.5502 | 0.6782 |
| 0.4244        | 60.0  | 27000 | 0.3076          | 0.3833 | 0.4539 | 0.6781 | 0.5458 | 0.6767 |
| 0.4236        | 61.0  | 27450 | 0.3109          | 0.3846 | 0.4554 | 0.6748 | 0.5449 | 0.6735 |
| 0.4257        | 62.0  | 27900 | 0.3085          | 0.3814 | 0.4544 | 0.6808 | 0.5496 | 0.6793 |
| 0.4226        | 63.0  | 28350 | 0.3068          | 0.3837 | 0.4537 | 0.6776 | 0.5454 | 0.6760 |
| 0.4239        | 64.0  | 28800 | 0.3052          | 0.3821 | 0.4561 | 0.6798 | 0.5491 | 0.6783 |
| 0.4206        | 65.0  | 29250 | 0.3095          | 0.3820 | 0.4548 | 0.6762 | 0.5457 | 0.6749 |
| 0.4212        | 66.0  | 29700 | 0.3055          | 0.3822 | 0.4541 | 0.6771 | 0.5457 | 0.6756 |
| 0.4191        | 67.0  | 30150 | 0.3063          | 0.3787 | 0.4605 | 0.6809 | 0.5520 | 0.6797 |
| 0.4137        | 68.0  | 30600 | 0.3056          | 0.3792 | 0.4577 | 0.6818 | 0.5536 | 0.6804 |
| 0.4156        | 69.0  | 31050 | 0.3023          | 0.3783 | 0.4602 | 0.6808 | 0.5507 | 0.6793 |
| 0.413         | 70.0  | 31500 | 0.3034          | 0.3785 | 0.4597 | 0.6821 | 0.5530 | 0.6803 |
| 0.4112        | 71.0  | 31950 | 0.3022          | 0.3805 | 0.4577 | 0.6804 | 0.5509 | 0.6790 |
| 0.4116        | 72.0  | 32400 | 0.3031          | 0.3793 | 0.4586 | 0.6794 | 0.5496 | 0.6782 |
| 0.4101        | 73.0  | 32850 | 0.3021          | 0.3766 | 0.4632 | 0.6819 | 0.5540 | 0.6804 |
| 0.4073        | 74.0  | 33300 | 0.3039          | 0.3788 | 0.4608 | 0.6816 | 0.5526 | 0.6805 |
| 0.4071        | 75.0  | 33750 | 0.3076          | 0.3776 | 0.4622 | 0.6823 | 0.5529 | 0.6809 |
| 0.4063        | 76.0  | 34200 | 0.3034          | 0.3776 | 0.4624 | 0.6794 | 0.5496 | 0.6783 |
| 0.407         | 77.0  | 34650 | 0.3058          | 0.3755 | 0.4637 | 0.6816 | 0.5524 | 0.6800 |
| 0.4039        | 78.0  | 35100 | 0.3048          | 0.3760 | 0.4620 | 0.6813 | 0.5510 | 0.6800 |
| 0.4052        | 79.0  | 35550 | 0.3063          | 0.3777 | 0.4620 | 0.6822 | 0.5526 | 0.6811 |
| 0.4066        | 80.0  | 36000 | 0.3029          | 0.3782 | 0.4612 | 0.6804 | 0.5489 | 0.6792 |
| 0.4036        | 81.0  | 36450 | 0.3041          | 0.3781 | 0.4603 | 0.6829 | 0.5520 | 0.6815 |
| 0.3987        | 82.0  | 36900 | 0.3048          | 0.3760 | 0.4625 | 0.6838 | 0.5549 | 0.6824 |
| 0.4007        | 83.0  | 37350 | 0.3008          | 0.3736 | 0.4659 | 0.6863 | 0.5573 | 0.6849 |
| 0.4016        | 84.0  | 37800 | 0.3011          | 0.3739 | 0.4653 | 0.6865 | 0.5586 | 0.6849 |
| 0.3981        | 85.0  | 38250 | 0.3007          | 0.3731 | 0.4666 | 0.6864 | 0.5588 | 0.6845 |
| 0.3986        | 86.0  | 38700 | 0.3005          | 0.3719 | 0.4670 | 0.6860 | 0.5583 | 0.6846 |
| 0.3955        | 87.0  | 39150 | 0.3002          | 0.3737 | 0.4656 | 0.6857 | 0.5576 | 0.6844 |
| 0.3942        | 88.0  | 39600 | 0.2999          | 0.3729 | 0.4672 | 0.6860 | 0.5596 | 0.6848 |
| 0.3951        | 89.0  | 40050 | 0.3021          | 0.3756 | 0.4646 | 0.6826 | 0.5536 | 0.6811 |
| 0.3963        | 90.0  | 40500 | 0.3000          | 0.3713 | 0.4683 | 0.6880 | 0.5624 | 0.6867 |
| 0.3941        | 91.0  | 40950 | 0.2998          | 0.3716 | 0.4677 | 0.6884 | 0.5622 | 0.6872 |
| 0.3913        | 92.0  | 41400 | 0.3018          | 0.3722 | 0.4687 | 0.6871 | 0.5600 | 0.6859 |
| 0.3938        | 93.0  | 41850 | 0.3009          | 0.3726 | 0.4687 | 0.6857 | 0.5583 | 0.6845 |
| 0.3912        | 94.0  | 42300 | 0.3003          | 0.3717 | 0.4679 | 0.6879 | 0.5617 | 0.6868 |
| 0.3921        | 95.0  | 42750 | 0.2997          | 0.3712 | 0.4693 | 0.6876 | 0.5621 | 0.6863 |
| 0.392         | 96.0  | 43200 | 0.3012          | 0.3710 | 0.4700 | 0.6884 | 0.5617 | 0.6870 |
| 0.3885        | 97.0  | 43650 | 0.3015          | 0.3713 | 0.4694 | 0.6875 | 0.5602 | 0.6861 |
| 0.3893        | 98.0  | 44100 | 0.3006          | 0.3718 | 0.4692 | 0.6868 | 0.5598 | 0.6856 |
| 0.3872        | 99.0  | 44550 | 0.3002          | 0.3712 | 0.4692 | 0.6874 | 0.5602 | 0.6859 |
| 0.3903        | 100.0 | 45000 | 0.3003          | 0.3716 | 0.4691 | 0.6871 | 0.5597 | 0.6859 |
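
The WER, BLEU, and ROUGE columns are standard text metrics. A minimal sketch of how they can be computed with the `evaluate` library is shown below; the exact metric settings used for this run are not documented in the card, so treat the snippet as illustrative:

```python
# Illustrative metric computation with the `evaluate` library; the exact setup
# behind the table above is not documented in this card.
import evaluate

wer_metric = evaluate.load("wer")
bleu_metric = evaluate.load("bleu")
rouge_metric = evaluate.load("rouge")

def compute_text_metrics(predictions, references):
    """Compute WER, BLEU and ROUGE-1/2/L for decoded transcriptions."""
    wer = wer_metric.compute(predictions=predictions, references=references)
    bleu = bleu_metric.compute(predictions=predictions, references=references)["bleu"]
    rouge = rouge_metric.compute(predictions=predictions, references=references)
    return {
        "wer": wer,
        "bleu": bleu,
        "rouge1": rouge["rouge1"],
        "rouge2": rouge["rouge2"],
        "rougeL": rouge["rougeL"],
    }

print(compute_text_metrics(
    ["the cat sat on the mat"],
    ["the cat sat on a mat"],
))
```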


### Framework versions

- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0