Model save

Files changed (4)

README.md +67 -37
config.json +4 -4
model.safetensors +1 -1
training_args.bin +2 -2

README.md CHANGED

@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 135.5781
-- Wer: 0.0427
 ## Model description
@@ -37,49 +37,79 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 8e-05
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: constant
-- num_epochs: 30
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Wer    |
-|:-------------:|:-----:|:----:|:---------------:|:------:|
-| 1518.9794     | 1.0   | 168  | 1000.8097       | 0.4552 |
-| 1028.341      | 2.0   | 336  | 779.6321        | 0.3719 |
-| 820.663       | 3.0   | 504  | 659.3882        | 0.2988 |
-| 718.5459      | 4.0   | 672  | 516.9663        | 0.2303 |
-| 606.0076      | 5.0   | 840  | 421.7630        | 0.1998 |
-| 535.3864      | 6.0   | 1008 | 478.5203        | 0.2051 |
-| 466.5894      | 7.0   | 1176 | 440.6602        | 0.1739 |
-| 432.4227      | 8.0   | 1344 | 294.9408        | 0.1323 |
-| 373.2972      | 9.0   | 1512 | 261.7947        | 0.1122 |
-| 355.762       | 10.0  | 1680 | 315.7706        | 0.1300 |
-| 314.1884      | 11.0  | 1848 | 350.8355        | 0.1181 |
-| 288.5761      | 12.0  | 2016 | 310.4185        | 0.1201 |
-| 291.121       | 13.0  | 2184 | 255.1681        | 0.0876 |
-| 236.6435      | 14.0  | 2352 | 240.4549        | 0.0873 |
-| 219.9664      | 15.0  | 2520 | 237.7248        | 0.0922 |
-| 201.4798      | 16.0  | 2688 | 162.6640        | 0.0619 |
-| 199.2839      | 17.0  | 2856 | 232.3585        | 0.0800 |
-| 194.1537      | 18.0  | 3024 | 215.3707        | 0.0772 |
-| 177.814       | 19.0  | 3192 | 171.7732        | 0.0589 |
-| 166.7409      | 20.0  | 3360 | 166.2487        | 0.0597 |
-| 169.7996      | 21.0  | 3528 | 135.5038        | 0.0546 |
-| 153.8049      | 22.0  | 3696 | 150.6883        | 0.0518 |
-| 143.4673      | 23.0  | 3864 | 179.0132        | 0.0541 |
-| 147.514       | 24.0  | 4032 | 131.5579        | 0.0419 |
-| 138.0108      | 25.0  | 4200 | 154.8247        | 0.0493 |
-| 142.9634      | 26.0  | 4368 | 165.7421        | 0.0586 |
-| 123.1378      | 27.0  | 4536 | 160.7584        | 0.0472 |
-| 129.9836      | 28.0  | 4704 | 104.9703        | 0.0366 |
-| 113.1207      | 29.0  | 4872 | 172.9598        | 0.0490 |
-| 110.3937      | 30.0  | 5040 | 135.5781        | 0.0427 |
 ### Framework versions

 This model is a fine-tuned version of [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 217.4995
+- Wer: 0.0488
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 0.0001
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- num_epochs: 60
 - mixed_precision_training: Native AMP
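
The switch from a constant to a cosine schedule means the learning rate decays toward zero over the run instead of staying fixed. Below is a minimal sketch of that decay, assuming no warmup (the diff does not show a warmup setting) and using the 13200 total steps and 220 steps per epoch reported in the training results. The function name `cosine_lr` is illustrative and not taken from the training script.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: optional linear warmup, then cosine decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


BASE_LR = 1e-4         # learning_rate from the new hyperparameters
TOTAL_STEPS = 13200    # last step in the new results table
STEPS_PER_EPOCH = 220  # step count after epoch 1 in the new results table

for epoch in (0, 15, 30, 45, 60):
    lr = cosine_lr(epoch * STEPS_PER_EPOCH, TOTAL_STEPS, BASE_LR)
    print(f"epoch {epoch:2d}: lr = {lr:.2e}")
```

Under these assumptions the rate is about 8.5e-05 at epoch 15, 5e-05 at epoch 30, and 1.5e-05 at epoch 45, which is consistent with the validation metrics barely moving over the last few epochs.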
 ### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Wer    |
+|:-------------:|:-----:|:-----:|:---------------:|:------:|
+| 1472.3185     | 1.0   | 220   | 996.5219        | 0.4717 |
+| 988.7808      | 2.0   | 440   | 742.4960        | 0.3192 |
+| 758.6794      | 3.0   | 660   | 726.9413        | 0.2945 |
+| 601.6771      | 4.0   | 880   | 525.1385        | 0.2309 |
+| 517.6635      | 5.0   | 1100  | 579.2085        | 0.2188 |
+| 454.2735      | 6.0   | 1320  | 436.7187        | 0.1770 |
+| 374.8431      | 7.0   | 1540  | 509.1526        | 0.1835 |
+| 320.3781      | 8.0   | 1760  | 405.3188        | 0.1558 |
+| 305.5322      | 9.0   | 1980  | 305.2030        | 0.1185 |
+| 295.8101      | 10.0  | 2200  | 348.2143        | 0.1369 |
+| 252.8921      | 11.0  | 2420  | 295.6262        | 0.1027 |
+| 225.9937      | 12.0  | 2640  | 338.8898        | 0.1057 |
+| 186.0668      | 13.0  | 2860  | 188.0684        | 0.0760 |
+| 216.4071      | 14.0  | 3080  | 297.2408        | 0.0911 |
+| 188.2282      | 15.0  | 3300  | 297.4695        | 0.1007 |
+| 180.0134      | 16.0  | 3520  | 319.2300        | 0.0958 |
+| 158.7572      | 17.0  | 3740  | 231.6492        | 0.0735 |
+| 151.4667      | 18.0  | 3960  | 390.5204        | 0.1033 |
+| 134.8128      | 19.0  | 4180  | 302.3186        | 0.0873 |
+| 128.7475      | 20.0  | 4400  | 254.5555        | 0.0792 |
+| 143.8669      | 21.0  | 4620  | 292.1411        | 0.0843 |
+| 103.34        | 22.0  | 4840  | 418.9893        | 0.0903 |
+| 122.2672      | 23.0  | 5060  | 364.5801        | 0.0954 |
+| 97.266        | 24.0  | 5280  | 285.9376        | 0.0756 |
+| 95.7884       | 25.0  | 5500  | 171.5240        | 0.0543 |
+| 103.1276      | 26.0  | 5720  | 261.3536        | 0.0662 |
+| 78.8389       | 27.0  | 5940  | 293.9692        | 0.0776 |
+| 92.2303       | 28.0  | 6160  | 332.2821        | 0.0717 |
+| 81.0719       | 29.0  | 6380  | 283.8300        | 0.0640 |
+| 81.7117       | 30.0  | 6600  | 194.0315        | 0.0512 |
+| 82.8297       | 31.0  | 6820  | 268.2238        | 0.0624 |
+| 66.1784       | 32.0  | 7040  | 172.1759        | 0.0429 |
+| 85.3869       | 33.0  | 7260  | 218.6479        | 0.0585 |
+| 66.2284       | 34.0  | 7480  | 249.1338        | 0.0596 |
+| 65.4866       | 35.0  | 7700  | 193.9302        | 0.0478 |
+| 59.2717       | 36.0  | 7920  | 178.9640        | 0.0464 |
+| 57.6191       | 37.0  | 8140  | 177.6202        | 0.0480 |
+| 50.2033       | 38.0  | 8360  | 174.2884        | 0.0405 |
+| 62.3403       | 39.0  | 8580  | 222.0836        | 0.0519 |
+| 64.9          | 40.0  | 8800  | 184.1204        | 0.0433 |
+| 48.9848       | 41.0  | 9020  | 159.7262        | 0.0399 |
+| 53.4291       | 42.0  | 9240  | 169.9478        | 0.0393 |
+| 64.6475       | 43.0  | 9460  | 214.1101        | 0.0531 |
+| 56.4478       | 44.0  | 9680  | 176.6035        | 0.0458 |
+| 46.7233       | 45.0  | 9900  | 224.5517        | 0.0504 |
+| 49.9667       | 46.0  | 10120 | 190.4445        | 0.0472 |
+| 46.9387       | 47.0  | 10340 | 206.6020        | 0.0486 |
+| 42.602        | 48.0  | 10560 | 196.2499        | 0.0470 |
+| 42.0918       | 49.0  | 10780 | 177.2713        | 0.0460 |
+| 44.1689       | 50.0  | 11000 | 186.2561        | 0.0466 |
+| 41.0818       | 51.0  | 11220 | 212.5681        | 0.0508 |
+| 37.013        | 52.0  | 11440 | 207.1802        | 0.0484 |
+| 36.6795       | 53.0  | 11660 | 205.6143        | 0.0468 |
+| 49.4436       | 54.0  | 11880 | 210.2148        | 0.0486 |
+| 47.5393       | 55.0  | 12100 | 216.7866        | 0.0498 |
+| 47.0014       | 56.0  | 12320 | 216.6146        | 0.0490 |
+| 52.2467       | 57.0  | 12540 | 218.1209        | 0.0494 |
+| 51.5985       | 58.0  | 12760 | 217.3818        | 0.0490 |
+| 51.9654       | 59.0  | 12980 | 217.6073        | 0.0490 |
+| 40.1137       | 60.0  | 13200 | 217.4995        | 0.0488 |
 ### Framework versions
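
The reported evaluation result (Loss 217.4995, Wer 0.0488) is the last row of the new table, not its best row, and it is higher than the previous run's final Wer of 0.0427. The sketch below scans a hand-copied excerpt of the new table for the lowest value of each metric. The `rows` list and the `best_by` helper are illustrative names, not part of the training script.

```python
# Excerpt of the new results table: (epoch, validation_loss, wer).
rows = [
    (25, 171.5240, 0.0543),
    (32, 172.1759, 0.0429),
    (38, 174.2884, 0.0405),
    (41, 159.7262, 0.0399),
    (42, 169.9478, 0.0393),
    (50, 186.2561, 0.0466),
    (60, 217.4995, 0.0488),
]


def best_by(rows, column):
    """Return the row with the smallest value in the given column index."""
    return min(rows, key=lambda row: row[column])


best_loss = best_by(rows, 1)
best_wer = best_by(rows, 2)
final = rows[-1]

print(f"lowest validation loss: epoch {best_loss[0]} ({best_loss[1]})")
print(f"lowest WER:             epoch {best_wer[0]} ({best_wer[2]})")
print(f"final checkpoint:       epoch {final[0]} (WER {final[2]})")
```

The same scan over the full table gives the same rows: lowest Wer 0.0393 at epoch 42 and lowest validation loss 159.7262 at epoch 41. If the run used the `transformers` `Trainer`, setting `load_best_model_at_end=True` with `metric_for_best_model="wer"` and `greater_is_better=False` is one way to keep such a checkpoint; whether this commit did so is not visible in the diff.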

config.json CHANGED

@@ -9,7 +9,7 @@
   "architectures": [
     "Wav2Vec2ForCTC"
   ],
-  "attention_dropout": 0.15,
   "bos_token_id": 1,
   "classifier_proj_size": 256,
   "codevector_dim": 256,
@@ -50,12 +50,12 @@
   "feat_extract_activation": "gelu",
   "feat_extract_dropout": 0.0,
   "feat_extract_norm": "group",
-  "feat_proj_dropout": 0.15,
   "feat_quantizer_dropout": 0.0,
-  "final_dropout": 0.15,
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
-  "hidden_dropout": 0.15,
   "hidden_dropout_prob": 0.1,
   "hidden_size": 768,
   "initializer_range": 0.02,

   "architectures": [
     "Wav2Vec2ForCTC"
   ],
+  "attention_dropout": 0.1,
   "bos_token_id": 1,
   "classifier_proj_size": 256,
   "codevector_dim": 256,
   "feat_extract_activation": "gelu",
   "feat_extract_dropout": 0.0,
   "feat_extract_norm": "group",
+  "feat_proj_dropout": 0.1,
   "feat_quantizer_dropout": 0.0,
+  "final_dropout": 0.1,
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
+  "hidden_dropout": 0.1,
   "hidden_dropout_prob": 0.1,
   "hidden_size": 768,
   "initializer_range": 0.02,

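The config change lowers four dropout rates from 0.15 to 0.1 and leaves `hidden_dropout_prob` untouched. A small sketch for extracting such changes from two config dictionaries follows; the dictionaries hold only the keys visible in this diff, and `changed_keys` is a hypothetical helper name.

```python
def changed_keys(old: dict, new: dict) -> dict:
    """Map each key whose value differs to an (old_value, new_value) pair."""
    keys = sorted(old.keys() | new.keys())
    return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}


# Dropout-related entries copied from the two sides of the diff.
old_config = {
    "attention_dropout": 0.15,
    "feat_proj_dropout": 0.15,
    "final_dropout": 0.15,
    "hidden_dropout": 0.15,
    "hidden_dropout_prob": 0.1,
}
new_config = {
    "attention_dropout": 0.1,
    "feat_proj_dropout": 0.1,
    "final_dropout": 0.1,
    "hidden_dropout": 0.1,
    "hidden_dropout_prob": 0.1,
}

for key, (before, after) in changed_keys(old_config, new_config).items():
    print(f"{key}: {before} -> {after}")
```

With `transformers`, values like these are commonly passed as keyword overrides when loading the base checkpoint, for example `Wav2Vec2ForCTC.from_pretrained(..., attention_dropout=0.1)`; that this run set them that way is an assumption.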
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b8fe0a83b85190e9c5df256386b63f109df3c6d5bc2d19218bec38e8582c02ab
 size 377611120

 version https://git-lfs.github.com/spec/v1
+oid sha256:95cc569bad2e2e6c1ac4216a9ac78ae92fffa33ab8a87c442f257d3d58466dde
 size 377611120
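
The `model.safetensors` entry is a Git LFS pointer: the `oid` is the SHA-256 of the real file and `size` is its length in bytes, so an unchanged size with a new `oid` means the weights changed but the tensor layout did not. A minimal sketch for checking a downloaded file against a pointer follows; `parse_lfs_pointer` and `matches_pointer` are illustrative names, and the local path in the usage comment is hypothetical.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Check a local file's size and SHA-256 against a parsed pointer."""
    data_path = Path(path)
    if data_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with data_path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# New pointer for model.safetensors, copied from the diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:95cc569bad2e2e6c1ac4216a9ac78ae92fffa33ab8a87c442f257d3d58466dde
size 377611120
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
# Usage (hypothetical local path): matches_pointer("model.safetensors", pointer)
```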

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6cb190c3f93a398db49cf968b1143a0a2d6ea9ba39aea61f61608d6d31dfaa5a
-size 5176

 version https://git-lfs.github.com/spec/v1
+oid sha256:c9442254a5e10f72e35377f72189cac35be46ef05f53ff1f21623bdb0926e4b3
+size 5240