ctaguchi committed on
Commit fed9ae2 · verified · 1 Parent(s): 98f7733

Model save

Files changed (2)
  1. README.md +65 -239
  2. model.safetensors +1 -1
README.md CHANGED
@@ -18,9 +18,9 @@ should probably proofread and complete it, then remove this comment. -->
18
 
19
  This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on the None dataset.
20
  It achieves the following results on the evaluation set:
21
- - Loss: 0.3854
22
- - Cer: 0.1012
23
- - Wer: 0.5053
24
 
25
  ## Model description
26
 
@@ -40,14 +40,14 @@ More information needed
40
 
41
  The following hyperparameters were used during training:
42
  - learning_rate: 0.0003
43
- - train_batch_size: 4
44
- - eval_batch_size: 8
45
  - seed: 42
46
  - gradient_accumulation_steps: 2
47
- - total_train_batch_size: 8
48
  - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
49
  - lr_scheduler_type: linear
50
- - lr_scheduler_warmup_steps: 500
51
  - num_epochs: 10
52
  - mixed_precision_training: Native AMP
53
 
@@ -55,238 +55,64 @@ The following hyperparameters were used during training:
55
 
56
  | Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
57
  |:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
58
- | 5.5496 | 0.0430 | 100 | 6.3598 | 0.9427 | 0.9974 |
59
- | 3.6224 | 0.0860 | 200 | 3.2301 | 0.9825 | 0.9987 |
60
- | 3.2092 | 0.1289 | 300 | 3.2163 | 0.9571 | 1.0 |
61
- | 2.8256 | 0.1719 | 400 | 1.6850 | 0.4259 | 0.9945 |
62
- | 1.2331 | 0.2149 | 500 | 1.2833 | 0.3602 | 0.9671 |
63
- | 1.0271 | 0.2579 | 600 | 1.0766 | 0.3244 | 0.9433 |
64
- | 0.9187 | 0.3009 | 700 | 0.9890 | 0.3029 | 0.9317 |
65
- | 0.9004 | 0.3439 | 800 | 0.9323 | 0.3054 | 0.9297 |
66
- | 0.8618 | 0.3868 | 900 | 1.0486 | 0.3017 | 0.9235 |
67
- | 0.8168 | 0.4298 | 1000 | 0.9113 | 0.2771 | 0.9026 |
68
- | 0.7806 | 0.4728 | 1100 | 0.8130 | 0.2516 | 0.8721 |
69
- | 0.7657 | 0.5158 | 1200 | 0.7714 | 0.2493 | 0.8644 |
70
- | 0.7778 | 0.5588 | 1300 | 0.7827 | 0.2512 | 0.8839 |
71
- | 0.6993 | 0.6018 | 1400 | 0.8011 | 0.2599 | 0.8817 |
72
- | 0.7114 | 0.6447 | 1500 | 0.7241 | 0.2280 | 0.8265 |
73
- | 0.7154 | 0.6877 | 1600 | 0.7387 | 0.2354 | 0.8368 |
74
- | 0.6986 | 0.7307 | 1700 | 0.7000 | 0.2241 | 0.8326 |
75
- | 0.6672 | 0.7737 | 1800 | 0.6955 | 0.2191 | 0.8312 |
76
- | 0.6544 | 0.8167 | 1900 | 0.6722 | 0.2227 | 0.8134 |
77
- | 0.6335 | 0.8597 | 2000 | 0.6987 | 0.2226 | 0.8474 |
78
- | 0.6492 | 0.9026 | 2100 | 0.7139 | 0.2387 | 0.8671 |
79
- | 0.6421 | 0.9456 | 2200 | 0.7012 | 0.2272 | 0.8338 |
80
- | 0.6471 | 0.9886 | 2300 | 0.6729 | 0.2128 | 0.8109 |
81
- | 0.5907 | 1.0314 | 2400 | 0.6128 | 0.2005 | 0.7896 |
82
- | 0.6001 | 1.0744 | 2500 | 0.6159 | 0.1875 | 0.7557 |
83
- | 0.5542 | 1.1173 | 2600 | 0.6003 | 0.1934 | 0.7769 |
84
- | 0.5961 | 1.1603 | 2700 | 0.5932 | 0.1963 | 0.7637 |
85
- | 0.5484 | 1.2033 | 2800 | 0.5692 | 0.1817 | 0.7638 |
86
- | 0.5721 | 1.2463 | 2900 | 0.6071 | 0.1872 | 0.7542 |
87
- | 0.5649 | 1.2893 | 3000 | 0.6064 | 0.1922 | 0.7630 |
88
- | 0.5371 | 1.3323 | 3100 | 0.5933 | 0.1843 | 0.7480 |
89
- | 0.5648 | 1.3752 | 3200 | 0.5738 | 0.1841 | 0.7523 |
90
- | 0.5427 | 1.4182 | 3300 | 0.5694 | 0.1915 | 0.7708 |
91
- | 0.5727 | 1.4612 | 3400 | 0.5727 | 0.1814 | 0.7460 |
92
- | 0.5369 | 1.5042 | 3500 | 0.5221 | 0.1820 | 0.7626 |
93
- | 0.5316 | 1.5472 | 3600 | 0.5710 | 0.1855 | 0.7405 |
94
- | 0.5296 | 1.5902 | 3700 | 0.5806 | 0.1818 | 0.7321 |
95
- | 0.5144 | 1.6331 | 3800 | 0.5419 | 0.1830 | 0.7441 |
96
- | 0.5311 | 1.6761 | 3900 | 0.5550 | 0.1771 | 0.7454 |
97
- | 0.5219 | 1.7191 | 4000 | 0.5444 | 0.1778 | 0.7318 |
98
- | 0.5143 | 1.7621 | 4100 | 0.5493 | 0.1780 | 0.7353 |
99
- | 0.5208 | 1.8051 | 4200 | 0.5368 | 0.1742 | 0.7363 |
100
- | 0.5409 | 1.8481 | 4300 | 0.5391 | 0.1696 | 0.7140 |
101
- | 0.5174 | 1.8910 | 4400 | 0.5264 | 0.1717 | 0.7244 |
102
- | 0.4959 | 1.9340 | 4500 | 0.5377 | 0.1766 | 0.7208 |
103
- | 0.5066 | 1.9770 | 4600 | 0.5168 | 0.1680 | 0.7147 |
104
- | 0.5118 | 2.0198 | 4700 | 0.5715 | 0.1764 | 0.7367 |
105
- | 0.4433 | 2.0628 | 4800 | 0.5688 | 0.1770 | 0.7232 |
106
- | 0.4532 | 2.1057 | 4900 | 0.5759 | 0.1794 | 0.7311 |
107
- | 0.4442 | 2.1487 | 5000 | 0.5367 | 0.1707 | 0.7191 |
108
- | 0.4448 | 2.1917 | 5100 | 0.5413 | 0.1728 | 0.7237 |
109
- | 0.4407 | 2.2347 | 5200 | 0.5299 | 0.1721 | 0.7313 |
110
- | 0.4381 | 2.2777 | 5300 | 0.5562 | 0.1806 | 0.7216 |
111
- | 0.4512 | 2.3207 | 5400 | 0.5416 | 0.1691 | 0.7013 |
112
- | 0.4592 | 2.3636 | 5500 | 0.4989 | 0.1625 | 0.6943 |
113
- | 0.432 | 2.4066 | 5600 | 0.5308 | 0.1702 | 0.7078 |
114
- | 0.4352 | 2.4496 | 5700 | 0.5401 | 0.1649 | 0.7162 |
115
- | 0.4236 | 2.4926 | 5800 | 0.5522 | 0.1767 | 0.7306 |
116
- | 0.4189 | 2.5356 | 5900 | 0.5107 | 0.1677 | 0.7050 |
117
- | 0.4372 | 2.5786 | 6000 | 0.5152 | 0.1635 | 0.6918 |
118
- | 0.4162 | 2.6215 | 6100 | 0.4924 | 0.1569 | 0.6799 |
119
- | 0.4194 | 2.6645 | 6200 | 0.5009 | 0.1657 | 0.6914 |
120
- | 0.4145 | 2.7075 | 6300 | 0.4907 | 0.1576 | 0.6827 |
121
- | 0.427 | 2.7505 | 6400 | 0.5275 | 0.1631 | 0.6918 |
122
- | 0.4086 | 2.7935 | 6500 | 0.4925 | 0.1622 | 0.6958 |
123
- | 0.4071 | 2.8364 | 6600 | 0.4922 | 0.1583 | 0.6850 |
124
- | 0.4134 | 2.8794 | 6700 | 0.4879 | 0.1565 | 0.6869 |
125
- | 0.4263 | 2.9224 | 6800 | 0.4729 | 0.1538 | 0.6723 |
126
- | 0.3952 | 2.9654 | 6900 | 0.4931 | 0.1537 | 0.6738 |
127
- | 0.3888 | 3.0082 | 7000 | 0.4710 | 0.1477 | 0.6588 |
128
- | 0.3634 | 3.0511 | 7100 | 0.4371 | 0.1445 | 0.6564 |
129
- | 0.3441 | 3.0941 | 7200 | 0.4497 | 0.1500 | 0.6664 |
130
- | 0.3588 | 3.1371 | 7300 | 0.4629 | 0.1484 | 0.6605 |
131
- | 0.349 | 3.1801 | 7400 | 0.4547 | 0.1455 | 0.6544 |
132
- | 0.3708 | 3.2231 | 7500 | 0.4557 | 0.1499 | 0.6669 |
133
- | 0.3531 | 3.2661 | 7600 | 0.4844 | 0.1484 | 0.6506 |
134
- | 0.3533 | 3.3090 | 7700 | 0.4602 | 0.1491 | 0.6559 |
135
- | 0.3549 | 3.3520 | 7800 | 0.4651 | 0.1486 | 0.6540 |
136
- | 0.3523 | 3.3950 | 7900 | 0.4517 | 0.1462 | 0.6524 |
137
- | 0.3563 | 3.4380 | 8000 | 0.4568 | 0.1494 | 0.6541 |
138
- | 0.3585 | 3.4810 | 8100 | 0.4487 | 0.1490 | 0.6595 |
139
- | 0.3678 | 3.5240 | 8200 | 0.4416 | 0.1422 | 0.6346 |
140
- | 0.365 | 3.5669 | 8300 | 0.4595 | 0.1471 | 0.6530 |
141
- | 0.3672 | 3.6099 | 8400 | 0.4358 | 0.1423 | 0.6316 |
142
- | 0.3462 | 3.6529 | 8500 | 0.4378 | 0.1461 | 0.6414 |
143
- | 0.3769 | 3.6959 | 8600 | 0.4617 | 0.1493 | 0.6472 |
144
- | 0.3571 | 3.7389 | 8700 | 0.4403 | 0.1452 | 0.6377 |
145
- | 0.3457 | 3.7819 | 8800 | 0.4271 | 0.1407 | 0.6313 |
146
- | 0.3474 | 3.8248 | 8900 | 0.4280 | 0.1394 | 0.6232 |
147
- | 0.3582 | 3.8678 | 9000 | 0.4451 | 0.1440 | 0.6393 |
148
- | 0.3439 | 3.9108 | 9100 | 0.4309 | 0.1384 | 0.6247 |
149
- | 0.3408 | 3.9538 | 9200 | 0.4242 | 0.1402 | 0.6226 |
150
- | 0.3326 | 3.9968 | 9300 | 0.4273 | 0.1396 | 0.6246 |
151
- | 0.3016 | 4.0395 | 9400 | 0.4604 | 0.1446 | 0.6482 |
152
- | 0.3043 | 4.0825 | 9500 | 0.4306 | 0.1380 | 0.6228 |
153
- | 0.3082 | 4.1255 | 9600 | 0.4281 | 0.1416 | 0.6387 |
154
- | 0.3007 | 4.1685 | 9700 | 0.4570 | 0.1429 | 0.6386 |
155
- | 0.298 | 4.2115 | 9800 | 0.4263 | 0.1381 | 0.6282 |
156
- | 0.3004 | 4.2545 | 9900 | 0.4842 | 0.1444 | 0.6378 |
157
- | 0.2919 | 4.2974 | 10000 | 0.4386 | 0.1361 | 0.6211 |
158
- | 0.3049 | 4.3404 | 10100 | 0.4584 | 0.1436 | 0.6509 |
159
- | 0.3027 | 4.3834 | 10200 | 0.4373 | 0.1405 | 0.6410 |
160
- | 0.2991 | 4.4264 | 10300 | 0.4393 | 0.1409 | 0.6284 |
161
- | 0.2863 | 4.4694 | 10400 | 0.4329 | 0.1374 | 0.6178 |
162
- | 0.3033 | 4.5124 | 10500 | 0.4144 | 0.1376 | 0.6174 |
163
- | 0.305 | 4.5553 | 10600 | 0.4284 | 0.1404 | 0.6226 |
164
- | 0.2966 | 4.5983 | 10700 | 0.4212 | 0.1392 | 0.6254 |
165
- | 0.3031 | 4.6413 | 10800 | 0.4306 | 0.1364 | 0.6209 |
166
- | 0.2982 | 4.6843 | 10900 | 0.4324 | 0.1376 | 0.6308 |
167
- | 0.2901 | 4.7273 | 11000 | 0.4226 | 0.1352 | 0.6162 |
168
- | 0.2927 | 4.7703 | 11100 | 0.3942 | 0.1302 | 0.6110 |
169
- | 0.2833 | 4.8132 | 11200 | 0.3964 | 0.1296 | 0.6022 |
170
- | 0.278 | 4.8562 | 11300 | 0.4226 | 0.1342 | 0.6114 |
171
- | 0.2919 | 4.8992 | 11400 | 0.4194 | 0.1314 | 0.6069 |
172
- | 0.307 | 4.9422 | 11500 | 0.4079 | 0.1328 | 0.6110 |
173
- | 0.2831 | 4.9852 | 11600 | 0.4120 | 0.1304 | 0.5998 |
174
- | 0.2542 | 5.0279 | 11700 | 0.3995 | 0.1311 | 0.6033 |
175
- | 0.2439 | 5.0709 | 11800 | 0.4012 | 0.1290 | 0.5999 |
176
- | 0.231 | 5.1139 | 11900 | 0.4167 | 0.1321 | 0.6037 |
177
- | 0.2363 | 5.1569 | 12000 | 0.4083 | 0.1316 | 0.5957 |
178
- | 0.2441 | 5.1999 | 12100 | 0.4134 | 0.1314 | 0.6057 |
179
- | 0.2372 | 5.2429 | 12200 | 0.4077 | 0.1299 | 0.5977 |
180
- | 0.2621 | 5.2858 | 12300 | 0.4117 | 0.1315 | 0.5983 |
181
- | 0.2436 | 5.3288 | 12400 | 0.4146 | 0.1314 | 0.6064 |
182
- | 0.245 | 5.3718 | 12500 | 0.4080 | 0.1297 | 0.5936 |
183
- | 0.242 | 5.4148 | 12600 | 0.3986 | 0.1271 | 0.5971 |
184
- | 0.2427 | 5.4578 | 12700 | 0.3980 | 0.1257 | 0.5828 |
185
- | 0.2354 | 5.5008 | 12800 | 0.4076 | 0.1271 | 0.5882 |
186
- | 0.2386 | 5.5437 | 12900 | 0.4129 | 0.1297 | 0.6011 |
187
- | 0.2452 | 5.5867 | 13000 | 0.4083 | 0.1273 | 0.5902 |
188
- | 0.2446 | 5.6297 | 13100 | 0.4121 | 0.1302 | 0.6076 |
189
- | 0.2346 | 5.6727 | 13200 | 0.3906 | 0.1235 | 0.5829 |
190
- | 0.2402 | 5.7157 | 13300 | 0.3922 | 0.1254 | 0.5896 |
191
- | 0.2322 | 5.7587 | 13400 | 0.4023 | 0.1284 | 0.5958 |
192
- | 0.2501 | 5.8016 | 13500 | 0.4004 | 0.1256 | 0.5896 |
193
- | 0.256 | 5.8446 | 13600 | 0.4003 | 0.1298 | 0.5938 |
194
- | 0.2448 | 5.8876 | 13700 | 0.3964 | 0.1272 | 0.5909 |
195
- | 0.2498 | 5.9306 | 13800 | 0.3838 | 0.1249 | 0.5844 |
196
- | 0.2306 | 5.9736 | 13900 | 0.3833 | 0.1247 | 0.5841 |
197
- | 0.2185 | 6.0163 | 14000 | 0.3810 | 0.1216 | 0.5763 |
198
- | 0.1886 | 6.0593 | 14100 | 0.4003 | 0.1221 | 0.5722 |
199
- | 0.1928 | 6.1023 | 14200 | 0.3930 | 0.1220 | 0.5747 |
200
- | 0.2063 | 6.1453 | 14300 | 0.3865 | 0.1194 | 0.5664 |
201
- | 0.1926 | 6.1883 | 14400 | 0.3949 | 0.1210 | 0.5716 |
202
- | 0.2132 | 6.2312 | 14500 | 0.4062 | 0.1238 | 0.5784 |
203
- | 0.1993 | 6.2742 | 14600 | 0.3983 | 0.1221 | 0.5703 |
204
- | 0.2059 | 6.3172 | 14700 | 0.4001 | 0.1235 | 0.5740 |
205
- | 0.2004 | 6.3602 | 14800 | 0.4002 | 0.1205 | 0.5706 |
206
- | 0.1975 | 6.4032 | 14900 | 0.3898 | 0.1212 | 0.5679 |
207
- | 0.1839 | 6.4462 | 15000 | 0.3895 | 0.1170 | 0.5528 |
208
- | 0.2046 | 6.4891 | 15100 | 0.4025 | 0.1206 | 0.5647 |
209
- | 0.1967 | 6.5321 | 15200 | 0.4016 | 0.1195 | 0.5670 |
210
- | 0.1979 | 6.5751 | 15300 | 0.3940 | 0.1182 | 0.5600 |
211
- | 0.1944 | 6.6181 | 15400 | 0.3863 | 0.1183 | 0.5613 |
212
- | 0.1979 | 6.6611 | 15500 | 0.3897 | 0.1197 | 0.5589 |
213
- | 0.1911 | 6.7041 | 15600 | 0.3905 | 0.1156 | 0.5515 |
214
- | 0.2017 | 6.7470 | 15700 | 0.3779 | 0.1166 | 0.5571 |
215
- | 0.1925 | 6.7900 | 15800 | 0.3808 | 0.1183 | 0.5625 |
216
- | 0.2002 | 6.8330 | 15900 | 0.3766 | 0.1177 | 0.5562 |
217
- | 0.1922 | 6.8760 | 16000 | 0.3909 | 0.1187 | 0.5579 |
218
- | 0.197 | 6.9190 | 16100 | 0.3716 | 0.1161 | 0.5519 |
219
- | 0.2047 | 6.9620 | 16200 | 0.3779 | 0.1170 | 0.5550 |
220
- | 0.202 | 7.0047 | 16300 | 0.3857 | 0.1192 | 0.5586 |
221
- | 0.1676 | 7.0477 | 16400 | 0.3962 | 0.1194 | 0.5594 |
222
- | 0.1548 | 7.0907 | 16500 | 0.3981 | 0.1209 | 0.5686 |
223
- | 0.1703 | 7.1337 | 16600 | 0.3832 | 0.1158 | 0.5527 |
224
- | 0.1715 | 7.1767 | 16700 | 0.3784 | 0.1141 | 0.5496 |
225
- | 0.158 | 7.2196 | 16800 | 0.3849 | 0.1160 | 0.5547 |
226
- | 0.1638 | 7.2626 | 16900 | 0.3892 | 0.1156 | 0.5531 |
227
- | 0.1592 | 7.3056 | 17000 | 0.3814 | 0.1156 | 0.5484 |
228
- | 0.1619 | 7.3486 | 17100 | 0.3822 | 0.1151 | 0.5488 |
229
- | 0.1698 | 7.3916 | 17200 | 0.3677 | 0.1128 | 0.5378 |
230
- | 0.1538 | 7.4346 | 17300 | 0.3648 | 0.1125 | 0.5396 |
231
- | 0.1485 | 7.4775 | 17400 | 0.3858 | 0.1141 | 0.5412 |
232
- | 0.1463 | 7.5205 | 17500 | 0.3804 | 0.1125 | 0.5368 |
233
- | 0.1527 | 7.5635 | 17600 | 0.3751 | 0.1153 | 0.5481 |
234
- | 0.1538 | 7.6065 | 17700 | 0.3775 | 0.1119 | 0.5420 |
235
- | 0.1592 | 7.6495 | 17800 | 0.3816 | 0.1141 | 0.5455 |
236
- | 0.1588 | 7.6925 | 17900 | 0.3929 | 0.1167 | 0.5519 |
237
- | 0.1505 | 7.7354 | 18000 | 0.3779 | 0.1116 | 0.5380 |
238
- | 0.1478 | 7.7784 | 18100 | 0.3631 | 0.1103 | 0.5358 |
239
- | 0.1455 | 7.8214 | 18200 | 0.3775 | 0.1111 | 0.5380 |
240
- | 0.1468 | 7.8644 | 18300 | 0.3652 | 0.1106 | 0.5374 |
241
- | 0.1533 | 7.9074 | 18400 | 0.3684 | 0.1096 | 0.5338 |
242
- | 0.1537 | 7.9504 | 18500 | 0.3649 | 0.1114 | 0.5354 |
243
- | 0.1526 | 7.9933 | 18600 | 0.3641 | 0.1095 | 0.5304 |
244
- | 0.1236 | 8.0361 | 18700 | 0.4009 | 0.1135 | 0.5424 |
245
- | 0.1223 | 8.0791 | 18800 | 0.3958 | 0.1102 | 0.5377 |
246
- | 0.1386 | 8.1221 | 18900 | 0.3801 | 0.1088 | 0.5327 |
247
- | 0.1281 | 8.1651 | 19000 | 0.3892 | 0.1094 | 0.5355 |
248
- | 0.1324 | 8.2080 | 19100 | 0.3790 | 0.1093 | 0.5341 |
249
- | 0.1293 | 8.2510 | 19200 | 0.3810 | 0.1096 | 0.5403 |
250
- | 0.1238 | 8.2940 | 19300 | 0.3853 | 0.1088 | 0.5301 |
251
- | 0.1355 | 8.3370 | 19400 | 0.3915 | 0.1098 | 0.5322 |
252
- | 0.1222 | 8.3800 | 19500 | 0.3811 | 0.1086 | 0.5320 |
253
- | 0.1258 | 8.4230 | 19600 | 0.3920 | 0.1080 | 0.5276 |
254
- | 0.1209 | 8.4659 | 19700 | 0.3642 | 0.1068 | 0.5203 |
255
- | 0.1256 | 8.5089 | 19800 | 0.3714 | 0.1063 | 0.5231 |
256
- | 0.1213 | 8.5519 | 19900 | 0.3784 | 0.1062 | 0.5227 |
257
- | 0.1227 | 8.5949 | 20000 | 0.3655 | 0.1046 | 0.5187 |
258
- | 0.1097 | 8.6379 | 20100 | 0.3829 | 0.1055 | 0.5219 |
259
- | 0.1162 | 8.6809 | 20200 | 0.3693 | 0.1051 | 0.5225 |
260
- | 0.1173 | 8.7238 | 20300 | 0.3755 | 0.1054 | 0.5227 |
261
- | 0.1199 | 8.7668 | 20400 | 0.3675 | 0.1051 | 0.5167 |
262
- | 0.1203 | 8.8098 | 20500 | 0.3571 | 0.1039 | 0.5163 |
263
- | 0.1198 | 8.8528 | 20600 | 0.3645 | 0.1028 | 0.5091 |
264
- | 0.1215 | 8.8958 | 20700 | 0.3629 | 0.1030 | 0.5122 |
265
- | 0.1261 | 8.9387 | 20800 | 0.3519 | 0.1025 | 0.5136 |
266
- | 0.111 | 8.9817 | 20900 | 0.3633 | 0.1037 | 0.5141 |
267
- | 0.1108 | 9.0245 | 21000 | 0.3809 | 0.1033 | 0.5119 |
268
- | 0.1095 | 9.0675 | 21100 | 0.3689 | 0.1025 | 0.5094 |
269
- | 0.0993 | 9.1105 | 21200 | 0.3796 | 0.1027 | 0.5100 |
270
- | 0.1039 | 9.1534 | 21300 | 0.3741 | 0.1036 | 0.5149 |
271
- | 0.0981 | 9.1964 | 21400 | 0.3857 | 0.1031 | 0.5152 |
272
- | 0.0996 | 9.2394 | 21500 | 0.3793 | 0.1024 | 0.5126 |
273
- | 0.0991 | 9.2824 | 21600 | 0.3801 | 0.1024 | 0.5132 |
274
- | 0.0959 | 9.3254 | 21700 | 0.3819 | 0.1014 | 0.5105 |
275
- | 0.1009 | 9.3684 | 21800 | 0.3879 | 0.1023 | 0.5117 |
276
- | 0.0942 | 9.4113 | 21900 | 0.3898 | 0.1027 | 0.5127 |
277
- | 0.0908 | 9.4543 | 22000 | 0.3916 | 0.1023 | 0.5109 |
278
- | 0.0971 | 9.4973 | 22100 | 0.3891 | 0.1024 | 0.5115 |
279
- | 0.0923 | 9.5403 | 22200 | 0.3957 | 0.1023 | 0.5122 |
280
- | 0.0835 | 9.5833 | 22300 | 0.3866 | 0.1016 | 0.5092 |
281
- | 0.1 | 9.6263 | 22400 | 0.3859 | 0.1015 | 0.5067 |
282
- | 0.0945 | 9.6692 | 22500 | 0.3830 | 0.1016 | 0.5063 |
283
- | 0.0941 | 9.7122 | 22600 | 0.3809 | 0.1018 | 0.5045 |
284
- | 0.0973 | 9.7552 | 22700 | 0.3828 | 0.1012 | 0.5036 |
285
- | 0.0909 | 9.7982 | 22800 | 0.3850 | 0.1012 | 0.5071 |
286
- | 0.0901 | 9.8412 | 22900 | 0.3848 | 0.1009 | 0.5055 |
287
- | 0.0839 | 9.8842 | 23000 | 0.3854 | 0.1010 | 0.5051 |
288
- | 0.0927 | 9.9271 | 23100 | 0.3861 | 0.1013 | 0.5059 |
289
- | 0.0892 | 9.9701 | 23200 | 0.3854 | 0.1012 | 0.5053 |
290
 
291
 
292
  ### Framework versions
 
18
 
19
  This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on the None dataset.
20
  It achieves the following results on the evaluation set:
21
+ - Loss: 0.2559
22
+ - Cer: 0.0920
23
+ - Wer: 0.5172
24
 
25
  ## Model description
26
 
 
40
 
41
  The following hyperparameters were used during training:
42
  - learning_rate: 0.0003
43
+ - train_batch_size: 8
44
+ - eval_batch_size: 12
45
  - seed: 42
46
  - gradient_accumulation_steps: 2
47
+ - total_train_batch_size: 16
48
  - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
49
  - lr_scheduler_type: linear
50
+ - lr_scheduler_warmup_steps: 100
51
  - num_epochs: 10
52
  - mixed_precision_training: Native AMP
53
 
 
55
 
56
  | Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
57
  |:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
58
+ | 0.8242 | 0.1719 | 200 | 0.6052 | 0.1907 | 0.8410 |
59
+ | 0.5396 | 0.3438 | 400 | 0.4636 | 0.1535 | 0.7533 |
60
+ | 0.4706 | 0.5157 | 600 | 0.4237 | 0.1411 | 0.6953 |
61
+ | 0.4313 | 0.6876 | 800 | 0.3889 | 0.1342 | 0.6967 |
62
+ | 0.399 | 0.8595 | 1000 | 0.3817 | 0.1263 | 0.6548 |
63
+ | 0.3835 | 1.0309 | 1200 | 0.3536 | 0.1204 | 0.6379 |
64
+ | 0.4002 | 1.2028 | 1400 | 0.3461 | 0.1178 | 0.6223 |
65
+ | 0.3667 | 1.3747 | 1600 | 0.3403 | 0.1168 | 0.6230 |
66
+ | 0.3641 | 1.5466 | 1800 | 0.3356 | 0.1158 | 0.6277 |
67
+ | 0.3461 | 1.7185 | 2000 | 0.3271 | 0.1127 | 0.6118 |
68
+ | 0.3539 | 1.8904 | 2200 | 0.3223 | 0.1109 | 0.6007 |
69
+ | 0.3404 | 2.0619 | 2400 | 0.3188 | 0.1093 | 0.5941 |
70
+ | 0.3285 | 2.2338 | 2600 | 0.3115 | 0.1083 | 0.5927 |
71
+ | 0.3332 | 2.4057 | 2800 | 0.3093 | 0.1075 | 0.5888 |
72
+ | 0.3276 | 2.5776 | 3000 | 0.3062 | 0.1047 | 0.5783 |
73
+ | 0.3274 | 2.7495 | 3200 | 0.3033 | 0.1045 | 0.5749 |
74
+ | 0.3137 | 2.9214 | 3400 | 0.2981 | 0.1042 | 0.5717 |
75
+ | 0.3095 | 3.0928 | 3600 | 0.3001 | 0.1050 | 0.5807 |
76
+ | 0.3146 | 3.2647 | 3800 | 0.3041 | 0.1058 | 0.5788 |
77
+ | 0.3147 | 3.4366 | 4000 | 0.2922 | 0.1039 | 0.5865 |
78
+ | 0.2873 | 3.6085 | 4200 | 0.2905 | 0.1013 | 0.5628 |
79
+ | 0.2973 | 3.7804 | 4400 | 0.2887 | 0.1014 | 0.5590 |
80
+ | 0.3028 | 3.9523 | 4600 | 0.2853 | 0.1011 | 0.5583 |
81
+ | 0.2747 | 4.1238 | 4800 | 0.2881 | 0.0983 | 0.5490 |
82
+ | 0.2928 | 4.2957 | 5000 | 0.2897 | 0.1000 | 0.5556 |
83
+ | 0.2825 | 4.4676 | 5200 | 0.2872 | 0.0982 | 0.5492 |
84
+ | 0.2861 | 4.6394 | 5400 | 0.2820 | 0.0990 | 0.5535 |
85
+ | 0.277 | 4.8113 | 5600 | 0.2831 | 0.0986 | 0.5509 |
86
+ | 0.2827 | 4.9832 | 5800 | 0.2805 | 0.0970 | 0.5434 |
87
+ | 0.2695 | 5.1547 | 6000 | 0.2758 | 0.0970 | 0.5455 |
88
+ | 0.2696 | 5.3266 | 6200 | 0.2748 | 0.0962 | 0.5396 |
89
+ | 0.2834 | 5.4985 | 6400 | 0.2716 | 0.0966 | 0.5408 |
90
+ | 0.2786 | 5.6704 | 6600 | 0.2786 | 0.0970 | 0.5362 |
91
+ | 0.2741 | 5.8423 | 6800 | 0.2693 | 0.0948 | 0.5315 |
92
+ | 0.2816 | 6.0138 | 7000 | 0.2697 | 0.0952 | 0.5330 |
93
+ | 0.2587 | 6.1856 | 7200 | 0.2682 | 0.0951 | 0.5347 |
94
+ | 0.2703 | 6.3575 | 7400 | 0.2666 | 0.0940 | 0.5304 |
95
+ | 0.2503 | 6.5294 | 7600 | 0.2671 | 0.0949 | 0.5327 |
96
+ | 0.2656 | 6.7013 | 7800 | 0.2654 | 0.0944 | 0.5284 |
97
+ | 0.2565 | 6.8732 | 8000 | 0.2668 | 0.0935 | 0.5246 |
98
+ | 0.2518 | 7.0447 | 8200 | 0.2683 | 0.0932 | 0.5262 |
99
+ | 0.2477 | 7.2166 | 8400 | 0.2666 | 0.0930 | 0.5281 |
100
+ | 0.2575 | 7.3885 | 8600 | 0.2632 | 0.0932 | 0.5227 |
101
+ | 0.2523 | 7.5604 | 8800 | 0.2640 | 0.0932 | 0.5242 |
102
+ | 0.2383 | 7.7323 | 9000 | 0.2622 | 0.0928 | 0.5207 |
103
+ | 0.2366 | 7.9042 | 9200 | 0.2629 | 0.0931 | 0.5230 |
104
+ | 0.2381 | 8.0756 | 9400 | 0.2606 | 0.0926 | 0.5198 |
105
+ | 0.24 | 8.2475 | 9600 | 0.2609 | 0.0921 | 0.5171 |
106
+ | 0.2408 | 8.4194 | 9800 | 0.2590 | 0.0923 | 0.5185 |
107
+ | 0.2443 | 8.5913 | 10000 | 0.2575 | 0.0916 | 0.5171 |
108
+ | 0.251 | 8.7632 | 10200 | 0.2579 | 0.0919 | 0.5160 |
109
+ | 0.2418 | 8.9351 | 10400 | 0.2578 | 0.0915 | 0.5156 |
110
+ | 0.2382 | 9.1066 | 10600 | 0.2570 | 0.0912 | 0.5142 |
111
+ | 0.2342 | 9.2785 | 10800 | 0.2560 | 0.0915 | 0.5159 |
112
+ | 0.2297 | 9.4504 | 11000 | 0.2568 | 0.0917 | 0.5146 |
113
+ | 0.2365 | 9.6223 | 11200 | 0.2557 | 0.0917 | 0.5163 |
114
+ | 0.2275 | 9.7942 | 11400 | 0.2565 | 0.0918 | 0.5172 |
115
+ | 0.2436 | 9.9661 | 11600 | 0.2559 | 0.0920 | 0.5172 |
116
 
117
 
118
  ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3e00d605bcacacbeabc2f424fde4c0ba25b35c70deb2b7c313495ea4f6eb785b
3
  size 3858978032
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:10c9540b2687bd08ece13e01de106fb434470a613aa86ef29f686ee95a306062
3
  size 3858978032
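
The batch-size change in this commit can be sanity-checked against the logged step counts. The sketch below recomputes `total_train_batch_size` from `train_batch_size` and `gradient_accumulation_steps`, then compares optimizer steps per epoch taken from the final row of each results table. It assumes a single training device, which the model card does not state; the helper names are hypothetical and are not part of the training script.

```python
# Values copied from the README diff above.
# ASSUMPTION: one training device; the model card does not list a device count.
NUM_DEVICES = 1


def effective_batch_size(per_device: int, grad_accum: int,
                         num_devices: int = NUM_DEVICES) -> int:
    """Examples consumed per optimizer step."""
    return per_device * grad_accum * num_devices


def steps_per_epoch(step: int, epoch: float) -> float:
    """Optimizer steps per epoch, estimated from one logged (step, epoch) pair."""
    return step / epoch


# Old run: train_batch_size=4, gradient_accumulation_steps=2
old_total = effective_batch_size(4, 2)
# New run: train_batch_size=8, gradient_accumulation_steps=2
new_total = effective_batch_size(8, 2)

# Last logged row of each results table.
old_spe = steps_per_epoch(23200, 9.9701)
new_spe = steps_per_epoch(11600, 9.9661)

print(f"old total_train_batch_size: {old_total}")
print(f"new total_train_batch_size: {new_total}")
print(f"steps/epoch ratio (old/new): {old_spe / new_spe:.3f}")
```

Under the single-device assumption, the computed totals match the 8 and 16 reported in the card, and doubling the batch size roughly halves the number of steps per epoch, which agrees with the two tables.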