Model save

Files changed (4)

README.md +33 -33
config.json +4 -4
model.safetensors +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,8 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 75.0289
-- Wer: 0.0338
 ## Model description
@@ -37,7 +37,7 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 7e-05
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
@@ -50,36 +50,36 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Wer    |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
-| 1580.0143     | 1.0   | 165  | 894.1667        | 0.4445 |
-| 1064.4354     | 2.0   | 330  | 644.3724        | 0.3276 |
-| 869.033       | 3.0   | 495  | 618.4804        | 0.2878 |
-| 765.606       | 4.0   | 660  | 467.2128        | 0.2274 |
-| 648.5554      | 5.0   | 825  | 465.9539        | 0.2182 |
-| 590.8423      | 6.0   | 990  | 417.7243        | 0.2012 |
-| 500.8294      | 7.0   | 1155 | 360.0778        | 0.1614 |
-| 468.3419      | 8.0   | 1320 | 423.7473        | 0.1842 |
-| 438.408       | 9.0   | 1485 | 343.4666        | 0.1572 |
-| 405.389       | 10.0  | 1650 | 272.7067        | 0.1332 |
-| 338.7186      | 11.0  | 1815 | 239.3145        | 0.1133 |
-| 335.6381      | 12.0  | 1980 | 183.9102        | 0.0918 |
-| 327.5601      | 13.0  | 2145 | 175.3572        | 0.0827 |
-| 255.2483      | 14.0  | 2310 | 175.2945        | 0.0848 |
-| 257.8507      | 15.0  | 2475 | 149.6469        | 0.0733 |
-| 221.9156      | 16.0  | 2640 | 177.4545        | 0.0756 |
-| 232.816       | 17.0  | 2805 | 226.1758        | 0.0984 |
-| 213.0188      | 18.0  | 2970 | 178.6724        | 0.0756 |
-| 207.679       | 19.0  | 3135 | 131.9617        | 0.0534 |
-| 198.4934      | 20.0  | 3300 | 132.1202        | 0.0484 |
-| 177.8431      | 21.0  | 3465 | 86.7172         | 0.0447 |
-| 178.1199      | 22.0  | 3630 | 93.3159         | 0.0447 |
-| 178.4063      | 23.0  | 3795 | 103.9663        | 0.0463 |
-| 170.2629      | 24.0  | 3960 | 96.0694         | 0.0385 |
-| 147.0691      | 25.0  | 4125 | 90.9803         | 0.0361 |
-| 139.5575      | 26.0  | 4290 | 89.6164         | 0.0385 |
-| 141.0887      | 27.0  | 4455 | 64.7452         | 0.0311 |
-| 138.4675      | 28.0  | 4620 | 87.2011         | 0.0385 |
-| 137.3686      | 29.0  | 4785 | 60.8403         | 0.0277 |
-| 108.6982      | 30.0  | 4950 | 75.0289         | 0.0338 |
 ### Framework versions

 This model is a fine-tuned version of [facebook/wav2vec2-base-960h](https://huggingface.co/facebook/wav2vec2-base-960h) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 135.5781
+- Wer: 0.0427
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 8e-05
 - train_batch_size: 32
 - eval_batch_size: 32
 - seed: 42
 | Training Loss | Epoch | Step | Validation Loss | Wer    |
 |:-------------:|:-----:|:----:|:---------------:|:------:|
+| 1518.9794     | 1.0   | 168  | 1000.8097       | 0.4552 |
+| 1028.341      | 2.0   | 336  | 779.6321        | 0.3719 |
+| 820.663       | 3.0   | 504  | 659.3882        | 0.2988 |
+| 718.5459      | 4.0   | 672  | 516.9663        | 0.2303 |
+| 606.0076      | 5.0   | 840  | 421.7630        | 0.1998 |
+| 535.3864      | 6.0   | 1008 | 478.5203        | 0.2051 |
+| 466.5894      | 7.0   | 1176 | 440.6602        | 0.1739 |
+| 432.4227      | 8.0   | 1344 | 294.9408        | 0.1323 |
+| 373.2972      | 9.0   | 1512 | 261.7947        | 0.1122 |
+| 355.762       | 10.0  | 1680 | 315.7706        | 0.1300 |
+| 314.1884      | 11.0  | 1848 | 350.8355        | 0.1181 |
+| 288.5761      | 12.0  | 2016 | 310.4185        | 0.1201 |
+| 291.121       | 13.0  | 2184 | 255.1681        | 0.0876 |
+| 236.6435      | 14.0  | 2352 | 240.4549        | 0.0873 |
+| 219.9664      | 15.0  | 2520 | 237.7248        | 0.0922 |
+| 201.4798      | 16.0  | 2688 | 162.6640        | 0.0619 |
+| 199.2839      | 17.0  | 2856 | 232.3585        | 0.0800 |
+| 194.1537      | 18.0  | 3024 | 215.3707        | 0.0772 |
+| 177.814       | 19.0  | 3192 | 171.7732        | 0.0589 |
+| 166.7409      | 20.0  | 3360 | 166.2487        | 0.0597 |
+| 169.7996      | 21.0  | 3528 | 135.5038        | 0.0546 |
+| 153.8049      | 22.0  | 3696 | 150.6883        | 0.0518 |
+| 143.4673      | 23.0  | 3864 | 179.0132        | 0.0541 |
+| 147.514       | 24.0  | 4032 | 131.5579        | 0.0419 |
+| 138.0108      | 25.0  | 4200 | 154.8247        | 0.0493 |
+| 142.9634      | 26.0  | 4368 | 165.7421        | 0.0586 |
+| 123.1378      | 27.0  | 4536 | 160.7584        | 0.0472 |
+| 129.9836      | 28.0  | 4704 | 104.9703        | 0.0366 |
+| 113.1207      | 29.0  | 4872 | 172.9598        | 0.0490 |
+| 110.3937      | 30.0  | 5040 | 135.5781        | 0.0427 |
 ### Framework versions
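
Note on the updated results table: the reported evaluation numbers (Loss 135.5781, Wer 0.0427) match the epoch 30 row, while the lowest WER on the new side of the table is at epoch 28. A minimal sketch of that comparison follows, using a handful of rows copied from the table above. The `rows` list and the `best_by_wer` helper are illustrative names, not part of this repository. Whether the saved weights come from the last or the best epoch depends on training arguments that this diff does not show (for example `load_best_model_at_end` in `transformers.TrainingArguments`).

```python
# Rows copied from the new (+) side of the README table:
# (epoch, validation loss, WER). Only a subset is listed to keep this short.
rows = [
    (24, 131.5579, 0.0419),
    (27, 160.7584, 0.0472),
    (28, 104.9703, 0.0366),
    (29, 172.9598, 0.0490),
    (30, 135.5781, 0.0427),
]


def best_by_wer(table_rows):
    """Return the row with the lowest word error rate (hypothetical helper)."""
    return min(table_rows, key=lambda row: row[2])


best = best_by_wer(rows)
final = rows[-1]  # the last epoch, which is what the model card reports

print(f"best epoch {best[0]}: WER {best[2]:.4f}")
print(f"final epoch {final[0]}: WER {final[2]:.4f}")
```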

config.json CHANGED

@@ -9,7 +9,7 @@
   "architectures": [
     "Wav2Vec2ForCTC"
   ],
-  "attention_dropout": 0.2,
   "bos_token_id": 1,
   "classifier_proj_size": 256,
   "codevector_dim": 256,
@@ -50,12 +50,12 @@
   "feat_extract_activation": "gelu",
   "feat_extract_dropout": 0.0,
   "feat_extract_norm": "group",
-  "feat_proj_dropout": 0.2,
   "feat_quantizer_dropout": 0.0,
-  "final_dropout": 0.2,
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
-  "hidden_dropout": 0.2,
   "hidden_dropout_prob": 0.1,
   "hidden_size": 768,
   "initializer_range": 0.02,

   "architectures": [
     "Wav2Vec2ForCTC"
   ],
+  "attention_dropout": 0.15,
   "bos_token_id": 1,
   "classifier_proj_size": 256,
   "codevector_dim": 256,
   "feat_extract_activation": "gelu",
   "feat_extract_dropout": 0.0,
   "feat_extract_norm": "group",
+  "feat_proj_dropout": 0.15,
   "feat_quantizer_dropout": 0.0,
+  "final_dropout": 0.15,
   "gradient_checkpointing": false,
   "hidden_act": "gelu",
+  "hidden_dropout": 0.15,
   "hidden_dropout_prob": 0.1,
   "hidden_size": 768,
   "initializer_range": 0.02,

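Note on the config change: four dropout settings move from 0.2 to 0.15, and `hidden_dropout_prob` stays at 0.1. The sketch below lists the changed keys from the two fragments shown in the diff. The dictionaries hold only the keys visible above, and `changed_keys` is a hypothetical helper written for this note, not a library function.

```python
# Dropout-related keys as they appear on the old (-) and new (+) sides
# of the config.json diff. Other keys are omitted.
old_config = {
    "attention_dropout": 0.2,
    "feat_proj_dropout": 0.2,
    "final_dropout": 0.2,
    "hidden_dropout": 0.2,
    "hidden_dropout_prob": 0.1,
}
new_config = {
    "attention_dropout": 0.15,
    "feat_proj_dropout": 0.15,
    "final_dropout": 0.15,
    "hidden_dropout": 0.15,
    "hidden_dropout_prob": 0.1,
}


def changed_keys(old, new):
    """Map each key present in both dicts to (old, new) when the values differ."""
    return {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}


changes = changed_keys(old_config, new_config)
for key, (before, after) in sorted(changes.items()):
    print(f"{key}: {before} -> {after}")
```
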
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b4ec641f9b257ef7c8c09b29da1143c80b592ba9a8a28c40fa31b9596c1919e3
 size 377611120

 version https://git-lfs.github.com/spec/v1
+oid sha256:b8fe0a83b85190e9c5df256386b63f109df3c6d5bc2d19218bec38e8582c02ab
 size 377611120

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e6b2cba4971a81e5703a54274bd33925e85207e462105f2ec00de6d59969abe3
 size 5176

 version https://git-lfs.github.com/spec/v1
+oid sha256:6cb190c3f93a398db49cf968b1143a0a2d6ea9ba39aea61f61608d6d31dfaa5a
 size 5176