Model save

Browse files:
- README.md: +65 -239
- model.safetensors: +1 -1

README.md
CHANGED
@@ -18,9 +18,9 @@ should probably proofread and complete it, then remove this comment. -->
This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on the None dataset.
It achieves the following results on the evaluation set:
-- Loss: 0.
-- Cer: 0.
-- Wer: 0.

## Model description
@@ -40,14 +40,14 @@ More information needed

The following hyperparameters were used during training:
- learning_rate: 0.0003
-- train_batch_size:
-- eval_batch_size:
- seed: 42
- gradient_accumulation_steps: 2
-- total_train_batch_size:
- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: linear
-- lr_scheduler_warmup_steps:
- num_epochs: 10
- mixed_precision_training: Native AMP
@@ -55,238 +55,64 @@ The following hyperparameters were used during training:

| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
-| 0.4189 | 2.5356 | 5900 | 0.5107 | 0.1677 | 0.7050 |
-| 0.4372 | 2.5786 | 6000 | 0.5152 | 0.1635 | 0.6918 |
-| 0.4162 | 2.6215 | 6100 | 0.4924 | 0.1569 | 0.6799 |
-| 0.4194 | 2.6645 | 6200 | 0.5009 | 0.1657 | 0.6914 |
-| 0.4145 | 2.7075 | 6300 | 0.4907 | 0.1576 | 0.6827 |
-| 0.427 | 2.7505 | 6400 | 0.5275 | 0.1631 | 0.6918 |
-| 0.4086 | 2.7935 | 6500 | 0.4925 | 0.1622 | 0.6958 |
-| 0.4071 | 2.8364 | 6600 | 0.4922 | 0.1583 | 0.6850 |
-| 0.4134 | 2.8794 | 6700 | 0.4879 | 0.1565 | 0.6869 |
-| 0.4263 | 2.9224 | 6800 | 0.4729 | 0.1538 | 0.6723 |
-| 0.3952 | 2.9654 | 6900 | 0.4931 | 0.1537 | 0.6738 |
-| 0.3888 | 3.0082 | 7000 | 0.4710 | 0.1477 | 0.6588 |
-| 0.3634 | 3.0511 | 7100 | 0.4371 | 0.1445 | 0.6564 |
-| 0.3441 | 3.0941 | 7200 | 0.4497 | 0.1500 | 0.6664 |
-| 0.3588 | 3.1371 | 7300 | 0.4629 | 0.1484 | 0.6605 |
-| 0.349 | 3.1801 | 7400 | 0.4547 | 0.1455 | 0.6544 |
-| 0.3708 | 3.2231 | 7500 | 0.4557 | 0.1499 | 0.6669 |
-| 0.3531 | 3.2661 | 7600 | 0.4844 | 0.1484 | 0.6506 |
-| 0.3533 | 3.3090 | 7700 | 0.4602 | 0.1491 | 0.6559 |
-| 0.3549 | 3.3520 | 7800 | 0.4651 | 0.1486 | 0.6540 |
-| 0.3523 | 3.3950 | 7900 | 0.4517 | 0.1462 | 0.6524 |
-| 0.3563 | 3.4380 | 8000 | 0.4568 | 0.1494 | 0.6541 |
-| 0.3585 | 3.4810 | 8100 | 0.4487 | 0.1490 | 0.6595 |
-| 0.3678 | 3.5240 | 8200 | 0.4416 | 0.1422 | 0.6346 |
-| 0.365 | 3.5669 | 8300 | 0.4595 | 0.1471 | 0.6530 |
-| 0.3672 | 3.6099 | 8400 | 0.4358 | 0.1423 | 0.6316 |
-| 0.3462 | 3.6529 | 8500 | 0.4378 | 0.1461 | 0.6414 |
-| 0.3769 | 3.6959 | 8600 | 0.4617 | 0.1493 | 0.6472 |
-| 0.3571 | 3.7389 | 8700 | 0.4403 | 0.1452 | 0.6377 |
-| 0.3457 | 3.7819 | 8800 | 0.4271 | 0.1407 | 0.6313 |
-| 0.3474 | 3.8248 | 8900 | 0.4280 | 0.1394 | 0.6232 |
-| 0.3582 | 3.8678 | 9000 | 0.4451 | 0.1440 | 0.6393 |
-| 0.3439 | 3.9108 | 9100 | 0.4309 | 0.1384 | 0.6247 |
-| 0.3408 | 3.9538 | 9200 | 0.4242 | 0.1402 | 0.6226 |
-| 0.3326 | 3.9968 | 9300 | 0.4273 | 0.1396 | 0.6246 |
-| 0.3016 | 4.0395 | 9400 | 0.4604 | 0.1446 | 0.6482 |
-| 0.3043 | 4.0825 | 9500 | 0.4306 | 0.1380 | 0.6228 |
-| 0.3082 | 4.1255 | 9600 | 0.4281 | 0.1416 | 0.6387 |
-| 0.3007 | 4.1685 | 9700 | 0.4570 | 0.1429 | 0.6386 |
-| 0.298 | 4.2115 | 9800 | 0.4263 | 0.1381 | 0.6282 |
-| 0.3004 | 4.2545 | 9900 | 0.4842 | 0.1444 | 0.6378 |
-| 0.2919 | 4.2974 | 10000 | 0.4386 | 0.1361 | 0.6211 |
-| 0.3049 | 4.3404 | 10100 | 0.4584 | 0.1436 | 0.6509 |
-| 0.3027 | 4.3834 | 10200 | 0.4373 | 0.1405 | 0.6410 |
-| 0.2991 | 4.4264 | 10300 | 0.4393 | 0.1409 | 0.6284 |
-| 0.2863 | 4.4694 | 10400 | 0.4329 | 0.1374 | 0.6178 |
-| 0.3033 | 4.5124 | 10500 | 0.4144 | 0.1376 | 0.6174 |
-| 0.305 | 4.5553 | 10600 | 0.4284 | 0.1404 | 0.6226 |
-| 0.2966 | 4.5983 | 10700 | 0.4212 | 0.1392 | 0.6254 |
-| 0.3031 | 4.6413 | 10800 | 0.4306 | 0.1364 | 0.6209 |
-| 0.2982 | 4.6843 | 10900 | 0.4324 | 0.1376 | 0.6308 |
-| 0.2901 | 4.7273 | 11000 | 0.4226 | 0.1352 | 0.6162 |
-| 0.2927 | 4.7703 | 11100 | 0.3942 | 0.1302 | 0.6110 |
-| 0.2833 | 4.8132 | 11200 | 0.3964 | 0.1296 | 0.6022 |
-| 0.278 | 4.8562 | 11300 | 0.4226 | 0.1342 | 0.6114 |
-| 0.2919 | 4.8992 | 11400 | 0.4194 | 0.1314 | 0.6069 |
-| 0.307 | 4.9422 | 11500 | 0.4079 | 0.1328 | 0.6110 |
-| 0.2831 | 4.9852 | 11600 | 0.4120 | 0.1304 | 0.5998 |
-| 0.2542 | 5.0279 | 11700 | 0.3995 | 0.1311 | 0.6033 |
-| 0.2439 | 5.0709 | 11800 | 0.4012 | 0.1290 | 0.5999 |
-| 0.231 | 5.1139 | 11900 | 0.4167 | 0.1321 | 0.6037 |
-| 0.2363 | 5.1569 | 12000 | 0.4083 | 0.1316 | 0.5957 |
-| 0.2441 | 5.1999 | 12100 | 0.4134 | 0.1314 | 0.6057 |
-| 0.2372 | 5.2429 | 12200 | 0.4077 | 0.1299 | 0.5977 |
-| 0.2621 | 5.2858 | 12300 | 0.4117 | 0.1315 | 0.5983 |
-| 0.2436 | 5.3288 | 12400 | 0.4146 | 0.1314 | 0.6064 |
-| 0.245 | 5.3718 | 12500 | 0.4080 | 0.1297 | 0.5936 |
-| 0.242 | 5.4148 | 12600 | 0.3986 | 0.1271 | 0.5971 |
-| 0.2427 | 5.4578 | 12700 | 0.3980 | 0.1257 | 0.5828 |
-| 0.2354 | 5.5008 | 12800 | 0.4076 | 0.1271 | 0.5882 |
-| 0.2386 | 5.5437 | 12900 | 0.4129 | 0.1297 | 0.6011 |
-| 0.2452 | 5.5867 | 13000 | 0.4083 | 0.1273 | 0.5902 |
-| 0.2446 | 5.6297 | 13100 | 0.4121 | 0.1302 | 0.6076 |
-| 0.2346 | 5.6727 | 13200 | 0.3906 | 0.1235 | 0.5829 |
-| 0.2402 | 5.7157 | 13300 | 0.3922 | 0.1254 | 0.5896 |
-| 0.2322 | 5.7587 | 13400 | 0.4023 | 0.1284 | 0.5958 |
-| 0.2501 | 5.8016 | 13500 | 0.4004 | 0.1256 | 0.5896 |
-| 0.256 | 5.8446 | 13600 | 0.4003 | 0.1298 | 0.5938 |
-| 0.2448 | 5.8876 | 13700 | 0.3964 | 0.1272 | 0.5909 |
-| 0.2498 | 5.9306 | 13800 | 0.3838 | 0.1249 | 0.5844 |
-| 0.2306 | 5.9736 | 13900 | 0.3833 | 0.1247 | 0.5841 |
-| 0.2185 | 6.0163 | 14000 | 0.3810 | 0.1216 | 0.5763 |
-| 0.1886 | 6.0593 | 14100 | 0.4003 | 0.1221 | 0.5722 |
-| 0.1928 | 6.1023 | 14200 | 0.3930 | 0.1220 | 0.5747 |
-| 0.2063 | 6.1453 | 14300 | 0.3865 | 0.1194 | 0.5664 |
-| 0.1926 | 6.1883 | 14400 | 0.3949 | 0.1210 | 0.5716 |
-| 0.2132 | 6.2312 | 14500 | 0.4062 | 0.1238 | 0.5784 |
-| 0.1993 | 6.2742 | 14600 | 0.3983 | 0.1221 | 0.5703 |
-| 0.2059 | 6.3172 | 14700 | 0.4001 | 0.1235 | 0.5740 |
-| 0.2004 | 6.3602 | 14800 | 0.4002 | 0.1205 | 0.5706 |
-| 0.1975 | 6.4032 | 14900 | 0.3898 | 0.1212 | 0.5679 |
-| 0.1839 | 6.4462 | 15000 | 0.3895 | 0.1170 | 0.5528 |
-| 0.2046 | 6.4891 | 15100 | 0.4025 | 0.1206 | 0.5647 |
-| 0.1967 | 6.5321 | 15200 | 0.4016 | 0.1195 | 0.5670 |
-| 0.1979 | 6.5751 | 15300 | 0.3940 | 0.1182 | 0.5600 |
-| 0.1944 | 6.6181 | 15400 | 0.3863 | 0.1183 | 0.5613 |
-| 0.1979 | 6.6611 | 15500 | 0.3897 | 0.1197 | 0.5589 |
-| 0.1911 | 6.7041 | 15600 | 0.3905 | 0.1156 | 0.5515 |
-| 0.2017 | 6.7470 | 15700 | 0.3779 | 0.1166 | 0.5571 |
-| 0.1925 | 6.7900 | 15800 | 0.3808 | 0.1183 | 0.5625 |
-| 0.2002 | 6.8330 | 15900 | 0.3766 | 0.1177 | 0.5562 |
-| 0.1922 | 6.8760 | 16000 | 0.3909 | 0.1187 | 0.5579 |
-| 0.197 | 6.9190 | 16100 | 0.3716 | 0.1161 | 0.5519 |
-| 0.2047 | 6.9620 | 16200 | 0.3779 | 0.1170 | 0.5550 |
-| 0.202 | 7.0047 | 16300 | 0.3857 | 0.1192 | 0.5586 |
-| 0.1676 | 7.0477 | 16400 | 0.3962 | 0.1194 | 0.5594 |
-| 0.1548 | 7.0907 | 16500 | 0.3981 | 0.1209 | 0.5686 |
-| 0.1703 | 7.1337 | 16600 | 0.3832 | 0.1158 | 0.5527 |
-| 0.1715 | 7.1767 | 16700 | 0.3784 | 0.1141 | 0.5496 |
-| 0.158 | 7.2196 | 16800 | 0.3849 | 0.1160 | 0.5547 |
-| 0.1638 | 7.2626 | 16900 | 0.3892 | 0.1156 | 0.5531 |
-| 0.1592 | 7.3056 | 17000 | 0.3814 | 0.1156 | 0.5484 |
-| 0.1619 | 7.3486 | 17100 | 0.3822 | 0.1151 | 0.5488 |
-| 0.1698 | 7.3916 | 17200 | 0.3677 | 0.1128 | 0.5378 |
-| 0.1538 | 7.4346 | 17300 | 0.3648 | 0.1125 | 0.5396 |
-| 0.1485 | 7.4775 | 17400 | 0.3858 | 0.1141 | 0.5412 |
-| 0.1463 | 7.5205 | 17500 | 0.3804 | 0.1125 | 0.5368 |
-| 0.1527 | 7.5635 | 17600 | 0.3751 | 0.1153 | 0.5481 |
-| 0.1538 | 7.6065 | 17700 | 0.3775 | 0.1119 | 0.5420 |
-| 0.1592 | 7.6495 | 17800 | 0.3816 | 0.1141 | 0.5455 |
-| 0.1588 | 7.6925 | 17900 | 0.3929 | 0.1167 | 0.5519 |
-| 0.1505 | 7.7354 | 18000 | 0.3779 | 0.1116 | 0.5380 |
-| 0.1478 | 7.7784 | 18100 | 0.3631 | 0.1103 | 0.5358 |
-| 0.1455 | 7.8214 | 18200 | 0.3775 | 0.1111 | 0.5380 |
-| 0.1468 | 7.8644 | 18300 | 0.3652 | 0.1106 | 0.5374 |
-| 0.1533 | 7.9074 | 18400 | 0.3684 | 0.1096 | 0.5338 |
-| 0.1537 | 7.9504 | 18500 | 0.3649 | 0.1114 | 0.5354 |
-| 0.1526 | 7.9933 | 18600 | 0.3641 | 0.1095 | 0.5304 |
-| 0.1236 | 8.0361 | 18700 | 0.4009 | 0.1135 | 0.5424 |
-| 0.1223 | 8.0791 | 18800 | 0.3958 | 0.1102 | 0.5377 |
-| 0.1386 | 8.1221 | 18900 | 0.3801 | 0.1088 | 0.5327 |
-| 0.1281 | 8.1651 | 19000 | 0.3892 | 0.1094 | 0.5355 |
-| 0.1324 | 8.2080 | 19100 | 0.3790 | 0.1093 | 0.5341 |
-| 0.1293 | 8.2510 | 19200 | 0.3810 | 0.1096 | 0.5403 |
-| 0.1238 | 8.2940 | 19300 | 0.3853 | 0.1088 | 0.5301 |
-| 0.1355 | 8.3370 | 19400 | 0.3915 | 0.1098 | 0.5322 |
-| 0.1222 | 8.3800 | 19500 | 0.3811 | 0.1086 | 0.5320 |
-| 0.1258 | 8.4230 | 19600 | 0.3920 | 0.1080 | 0.5276 |
-| 0.1209 | 8.4659 | 19700 | 0.3642 | 0.1068 | 0.5203 |
-| 0.1256 | 8.5089 | 19800 | 0.3714 | 0.1063 | 0.5231 |
-| 0.1213 | 8.5519 | 19900 | 0.3784 | 0.1062 | 0.5227 |
-| 0.1227 | 8.5949 | 20000 | 0.3655 | 0.1046 | 0.5187 |
-| 0.1097 | 8.6379 | 20100 | 0.3829 | 0.1055 | 0.5219 |
-| 0.1162 | 8.6809 | 20200 | 0.3693 | 0.1051 | 0.5225 |
-| 0.1173 | 8.7238 | 20300 | 0.3755 | 0.1054 | 0.5227 |
-| 0.1199 | 8.7668 | 20400 | 0.3675 | 0.1051 | 0.5167 |
-| 0.1203 | 8.8098 | 20500 | 0.3571 | 0.1039 | 0.5163 |
-| 0.1198 | 8.8528 | 20600 | 0.3645 | 0.1028 | 0.5091 |
-| 0.1215 | 8.8958 | 20700 | 0.3629 | 0.1030 | 0.5122 |
-| 0.1261 | 8.9387 | 20800 | 0.3519 | 0.1025 | 0.5136 |
-| 0.111 | 8.9817 | 20900 | 0.3633 | 0.1037 | 0.5141 |
-| 0.1108 | 9.0245 | 21000 | 0.3809 | 0.1033 | 0.5119 |
-| 0.1095 | 9.0675 | 21100 | 0.3689 | 0.1025 | 0.5094 |
-| 0.0993 | 9.1105 | 21200 | 0.3796 | 0.1027 | 0.5100 |
-| 0.1039 | 9.1534 | 21300 | 0.3741 | 0.1036 | 0.5149 |
-| 0.0981 | 9.1964 | 21400 | 0.3857 | 0.1031 | 0.5152 |
-| 0.0996 | 9.2394 | 21500 | 0.3793 | 0.1024 | 0.5126 |
-| 0.0991 | 9.2824 | 21600 | 0.3801 | 0.1024 | 0.5132 |
-| 0.0959 | 9.3254 | 21700 | 0.3819 | 0.1014 | 0.5105 |
-| 0.1009 | 9.3684 | 21800 | 0.3879 | 0.1023 | 0.5117 |
-| 0.0942 | 9.4113 | 21900 | 0.3898 | 0.1027 | 0.5127 |
-| 0.0908 | 9.4543 | 22000 | 0.3916 | 0.1023 | 0.5109 |
-| 0.0971 | 9.4973 | 22100 | 0.3891 | 0.1024 | 0.5115 |
-| 0.0923 | 9.5403 | 22200 | 0.3957 | 0.1023 | 0.5122 |
-| 0.0835 | 9.5833 | 22300 | 0.3866 | 0.1016 | 0.5092 |
-| 0.1 | 9.6263 | 22400 | 0.3859 | 0.1015 | 0.5067 |
-| 0.0945 | 9.6692 | 22500 | 0.3830 | 0.1016 | 0.5063 |
-| 0.0941 | 9.7122 | 22600 | 0.3809 | 0.1018 | 0.5045 |
-| 0.0973 | 9.7552 | 22700 | 0.3828 | 0.1012 | 0.5036 |
-| 0.0909 | 9.7982 | 22800 | 0.3850 | 0.1012 | 0.5071 |
-| 0.0901 | 9.8412 | 22900 | 0.3848 | 0.1009 | 0.5055 |
-| 0.0839 | 9.8842 | 23000 | 0.3854 | 0.1010 | 0.5051 |
-| 0.0927 | 9.9271 | 23100 | 0.3861 | 0.1013 | 0.5059 |
-| 0.0892 | 9.9701 | 23200 | 0.3854 | 0.1012 | 0.5053 |

### Framework versions
This model is a fine-tuned version of [facebook/mms-1b-all](https://huggingface.co/facebook/mms-1b-all) on the None dataset.
It achieves the following results on the evaluation set:
+- Loss: 0.2559
+- Cer: 0.0920
+- Wer: 0.5172

## Model description
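The Cer and Wer figures above are character- and word-error rates: edit distance between hypothesis and reference, divided by reference length. A minimal self-contained sketch of that computation (an illustration only, not the evaluation code used for this run, which on the Hugging Face stack typically comes from the `evaluate`/`jiwer` packages):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (single-row DP)."""
    d = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, d[0] = d[0], i
        for j, h in enumerate(hyp, 1):
            # min of deletion, insertion, substitution/match
            prev, d[j] = d[j], min(d[j] + 1, d[j - 1] + 1, prev + (r != h))
    return d[-1]

def wer(ref: str, hyp: str) -> float:
    """Word error rate: edit distance over words / number of reference words."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance over characters / reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)

# e.g. wer("a b c", "a x c") -> 1/3 (one substituted word out of three)
```

For CTC speech models such as MMS, CER is usually the more stable metric early in training, which matches the tables in this card: CER falls well below WER throughout.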
The following hyperparameters were used during training:
- learning_rate: 0.0003
+- train_batch_size: 8
+- eval_batch_size: 12
- seed: 42
- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 10
- mixed_precision_training: Native AMP
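The batch-size and scheduler entries above can be cross-checked with a small arithmetic sketch. This is not the training script; it only reproduces how the Trainer derives total_train_batch_size and what a linear warmup/decay schedule looks like, assuming a single device and taking 11600 (the last step in the new training log) as the total step count:

```python
# Sketch only: arithmetic implied by the hyperparameter list above.
# Assumptions: one device; total_steps = 11600, the last logged step.

def total_train_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Effective batch size as reported by the HF Trainer."""
    return per_device * grad_accum * num_devices

def linear_schedule_lr(step: int, base_lr: float = 3e-4,
                       warmup_steps: int = 100, total_steps: int = 11600) -> float:
    """Linear warmup to base_lr over warmup_steps, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(total_train_batch_size(8, 2))  # 16, matching total_train_batch_size above
```

With gradient_accumulation_steps of 2, gradients from two per-device batches of 8 are summed before each optimizer step, which is why the reported effective batch size is 16.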

| Training Loss | Epoch | Step | Validation Loss | Cer | Wer |
|:-------------:|:------:|:-----:|:---------------:|:------:|:------:|
+| 0.8242 | 0.1719 | 200 | 0.6052 | 0.1907 | 0.8410 |
+| 0.5396 | 0.3438 | 400 | 0.4636 | 0.1535 | 0.7533 |
+| 0.4706 | 0.5157 | 600 | 0.4237 | 0.1411 | 0.6953 |
+| 0.4313 | 0.6876 | 800 | 0.3889 | 0.1342 | 0.6967 |
+| 0.399 | 0.8595 | 1000 | 0.3817 | 0.1263 | 0.6548 |
+| 0.3835 | 1.0309 | 1200 | 0.3536 | 0.1204 | 0.6379 |
+| 0.4002 | 1.2028 | 1400 | 0.3461 | 0.1178 | 0.6223 |
+| 0.3667 | 1.3747 | 1600 | 0.3403 | 0.1168 | 0.6230 |
+| 0.3641 | 1.5466 | 1800 | 0.3356 | 0.1158 | 0.6277 |
+| 0.3461 | 1.7185 | 2000 | 0.3271 | 0.1127 | 0.6118 |
+| 0.3539 | 1.8904 | 2200 | 0.3223 | 0.1109 | 0.6007 |
+| 0.3404 | 2.0619 | 2400 | 0.3188 | 0.1093 | 0.5941 |
+| 0.3285 | 2.2338 | 2600 | 0.3115 | 0.1083 | 0.5927 |
+| 0.3332 | 2.4057 | 2800 | 0.3093 | 0.1075 | 0.5888 |
+| 0.3276 | 2.5776 | 3000 | 0.3062 | 0.1047 | 0.5783 |
+| 0.3274 | 2.7495 | 3200 | 0.3033 | 0.1045 | 0.5749 |
+| 0.3137 | 2.9214 | 3400 | 0.2981 | 0.1042 | 0.5717 |
+| 0.3095 | 3.0928 | 3600 | 0.3001 | 0.1050 | 0.5807 |
+| 0.3146 | 3.2647 | 3800 | 0.3041 | 0.1058 | 0.5788 |
+| 0.3147 | 3.4366 | 4000 | 0.2922 | 0.1039 | 0.5865 |
+| 0.2873 | 3.6085 | 4200 | 0.2905 | 0.1013 | 0.5628 |
+| 0.2973 | 3.7804 | 4400 | 0.2887 | 0.1014 | 0.5590 |
+| 0.3028 | 3.9523 | 4600 | 0.2853 | 0.1011 | 0.5583 |
+| 0.2747 | 4.1238 | 4800 | 0.2881 | 0.0983 | 0.5490 |
+| 0.2928 | 4.2957 | 5000 | 0.2897 | 0.1000 | 0.5556 |
+| 0.2825 | 4.4676 | 5200 | 0.2872 | 0.0982 | 0.5492 |
+| 0.2861 | 4.6394 | 5400 | 0.2820 | 0.0990 | 0.5535 |
+| 0.277 | 4.8113 | 5600 | 0.2831 | 0.0986 | 0.5509 |
+| 0.2827 | 4.9832 | 5800 | 0.2805 | 0.0970 | 0.5434 |
+| 0.2695 | 5.1547 | 6000 | 0.2758 | 0.0970 | 0.5455 |
+| 0.2696 | 5.3266 | 6200 | 0.2748 | 0.0962 | 0.5396 |
+| 0.2834 | 5.4985 | 6400 | 0.2716 | 0.0966 | 0.5408 |
+| 0.2786 | 5.6704 | 6600 | 0.2786 | 0.0970 | 0.5362 |
+| 0.2741 | 5.8423 | 6800 | 0.2693 | 0.0948 | 0.5315 |
+| 0.2816 | 6.0138 | 7000 | 0.2697 | 0.0952 | 0.5330 |
+| 0.2587 | 6.1856 | 7200 | 0.2682 | 0.0951 | 0.5347 |
+| 0.2703 | 6.3575 | 7400 | 0.2666 | 0.0940 | 0.5304 |
+| 0.2503 | 6.5294 | 7600 | 0.2671 | 0.0949 | 0.5327 |
+| 0.2656 | 6.7013 | 7800 | 0.2654 | 0.0944 | 0.5284 |
+| 0.2565 | 6.8732 | 8000 | 0.2668 | 0.0935 | 0.5246 |
+| 0.2518 | 7.0447 | 8200 | 0.2683 | 0.0932 | 0.5262 |
+| 0.2477 | 7.2166 | 8400 | 0.2666 | 0.0930 | 0.5281 |
+| 0.2575 | 7.3885 | 8600 | 0.2632 | 0.0932 | 0.5227 |
+| 0.2523 | 7.5604 | 8800 | 0.2640 | 0.0932 | 0.5242 |
+| 0.2383 | 7.7323 | 9000 | 0.2622 | 0.0928 | 0.5207 |
+| 0.2366 | 7.9042 | 9200 | 0.2629 | 0.0931 | 0.5230 |
+| 0.2381 | 8.0756 | 9400 | 0.2606 | 0.0926 | 0.5198 |
+| 0.24 | 8.2475 | 9600 | 0.2609 | 0.0921 | 0.5171 |
+| 0.2408 | 8.4194 | 9800 | 0.2590 | 0.0923 | 0.5185 |
+| 0.2443 | 8.5913 | 10000 | 0.2575 | 0.0916 | 0.5171 |
+| 0.251 | 8.7632 | 10200 | 0.2579 | 0.0919 | 0.5160 |
+| 0.2418 | 8.9351 | 10400 | 0.2578 | 0.0915 | 0.5156 |
+| 0.2382 | 9.1066 | 10600 | 0.2570 | 0.0912 | 0.5142 |
+| 0.2342 | 9.2785 | 10800 | 0.2560 | 0.0915 | 0.5159 |
+| 0.2297 | 9.4504 | 11000 | 0.2568 | 0.0917 | 0.5146 |
+| 0.2365 | 9.6223 | 11200 | 0.2557 | 0.0917 | 0.5163 |
+| 0.2275 | 9.7942 | 11400 | 0.2565 | 0.0918 | 0.5172 |
+| 0.2436 | 9.9661 | 11600 | 0.2559 | 0.0920 | 0.5172 |

### Framework versions
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
version https://git-lfs.github.com/spec/v1
-oid sha256:
size 3858978032
version https://git-lfs.github.com/spec/v1
+oid sha256:10c9540b2687bd08ece13e01de106fb434470a613aa86ef29f686ee95a306062
size 3858978032