Training finished, upload final model

Files changed (5)

README.md CHANGED

@@ -16,8 +16,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1515
-- Wer: inf
 ## Model description
@@ -50,18 +51,18 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch   | Step  | Validation Loss | Wer |
-|:-------------:|:-------:|:-----:|:---------------:|:---:|
-| 1.834         | 1.6507  | 1000  | 1.9117          | inf |
-| 0.9955        | 3.3006  | 2000  | 1.2766          | inf |
-| 0.7584        | 4.9513  | 3000  | 1.1081          | inf |
-| 0.5473        | 6.6012  | 4000  | 1.0569          | inf |
-| 0.4191        | 8.2510  | 5000  | 1.0568          | inf |
-| 0.3167        | 9.9017  | 6000  | 1.0609          | inf |
-| 0.2322        | 11.5516 | 7000  | 1.0933          | inf |
-| 0.1913        | 13.2015 | 8000  | 1.1227          | inf |
-| 0.1424        | 14.8522 | 9000  | 1.1418          | inf |
-| 0.1615        | 16.5021 | 10000 | 1.1515          | inf |
 ### Framework versions

 This model was trained from scratch on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.1519
+- Wer: 96.7751
+- Cer: 49.6435
 ## Model description
 ### Training results
+| Training Loss | Epoch   | Step  | Validation Loss | Wer     | Cer     |
+|:-------------:|:-------:|:-----:|:---------------:|:-------:|:-------:|
+| 1.8339        | 1.6507  | 1000  | 1.9115          | 99.6794 | 93.6205 |
+| 0.9948        | 3.3006  | 2000  | 1.2763          | 97.3503 | 59.4213 |
+| 0.7577        | 4.9513  | 3000  | 1.1085          | 96.6431 | 53.3468 |
+| 0.5464        | 6.6012  | 4000  | 1.0575          | 95.4927 | 48.2507 |
+| 0.4182        | 8.2510  | 5000  | 1.0574          | 96.2376 | 47.2929 |
+| 0.3164        | 9.9017  | 6000  | 1.0616          | 96.3885 | 49.4417 |
+| 0.2319        | 11.5516 | 7000  | 1.0929          | 96.2565 | 49.5535 |
+| 0.1899        | 13.2015 | 8000  | 1.1223          | 97.2749 | 48.6737 |
+| 0.1425        | 14.8522 | 9000  | 1.1422          | 96.6148 | 48.4484 |
+| 0.161         | 16.5021 | 10000 | 1.1519          | 96.7751 | 49.6435 |
 ### Framework versions
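
The README change replaces a `Wer` of `inf` with finite values and adds a `Cer` column. The diff does not say how either metric is computed, so the following is only a minimal sketch of the standard edit-distance definitions. It assumes the table values are percentages, that words are split on whitespace, and that spaces are ignored for CER. The helper names (`edit_distance`, `error_rate`, `wer`, `cer`) are hypothetical. The sketch also shows one way an `inf` can arise: a reference with zero words makes the denominator zero. Whether that was the cause in the earlier run is not stated in the diff.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rate(ref_tokens, hyp_tokens):
    """Edit errors per reference token, as a percentage (assumed scale)."""
    errors = edit_distance(ref_tokens, hyp_tokens)
    if len(ref_tokens) == 0:
        # Zero-length reference: the ratio is undefined, reported here as inf.
        return 0.0 if errors == 0 else float("inf")
    return 100.0 * errors / len(ref_tokens)


def wer(reference, hypothesis):
    # Assumption: words are whitespace-separated.
    return error_rate(reference.split(), hypothesis.split())


def cer(reference, hypothesis):
    # Assumption: spaces are not counted as characters.
    return error_rate(list(reference.replace(" ", "")),
                      list(hypothesis.replace(" ", "")))
```

With these definitions a single wrong letter costs a whole word in WER but only one character in CER, which is consistent with the new table reporting a much lower `Cer` than `Wer` at every step.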

all_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 16.50206440957886,
     "total_flos": 1.949150849531904e+19,
-    "train_loss": 0.8644906110763549,
-    "train_runtime": 22776.5193,
-    "train_samples_per_second": 21.074,
-    "train_steps_per_second": 0.439
 }

 {
     "epoch": 16.50206440957886,
     "total_flos": 1.949150849531904e+19,
+    "train_loss": 0.8641470371723176,
+    "train_runtime": 21435.9254,
+    "train_samples_per_second": 22.392,
+    "train_steps_per_second": 0.467
 }
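
The throughput fields can be cross-checked against the runtime. The sketch below assumes the run covered 10,000 optimizer steps, taken from the last row of the Training results table; the function names are hypothetical. Under that assumption, dividing the step count by `train_runtime` reproduces the reported `train_steps_per_second` for both the old and the new file.

```python
TOTAL_STEPS = 10_000  # assumption: last step listed in the Training results table


def steps_per_second(runtime_s, total_steps=TOTAL_STEPS):
    """Steps per second, rounded to three decimals like the JSON values."""
    return round(total_steps / runtime_s, 3)


def relative_runtime_change(old_runtime_s, new_runtime_s):
    """Fractional change in runtime; negative means the new run was faster."""
    return (new_runtime_s - old_runtime_s) / old_runtime_s


old_runtime, new_runtime = 22776.5193, 21435.9254
print(steps_per_second(old_runtime))   # compare with the old train_steps_per_second
print(steps_per_second(new_runtime))   # compare with the new train_steps_per_second
print(f"{relative_runtime_change(old_runtime, new_runtime):.1%}")
```

The results, 0.439 and 0.467, match the diff, and the runtime change works out to roughly a 5.9% reduction.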

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f1172332b4f4cf2ce6af9c26f710924116b9939c47b97977e6e6ac30b8704561
 size 223144592

 version https://git-lfs.github.com/spec/v1
+oid sha256:a1695d6ce8bd5b7d2355edbbd0898111b45ff3435a1ab08b8365805bc3f3ad00
 size 223144592
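
In a Git LFS pointer, the `oid` is the SHA-256 digest of the real file's contents and `size` is its length in bytes, so a downloaded `model.safetensors` can be checked against the pointer. The following is a minimal sketch using only the standard library; the function names and the local path passed to them are hypothetical.

```python
import hashlib
import os


def lfs_oid(path, chunk_size=1 << 20):
    """Return the file's digest in the 'sha256:<hex>' form used by LFS pointers."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so a large weights file is never loaded whole.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def matches_pointer(path, expected_oid, expected_size):
    """True if the local file has the size and digest recorded in the pointer."""
    return (os.path.getsize(path) == expected_size
            and lfs_oid(path) == expected_oid)
```

For the new revision, the call would be `matches_pointer(path, "sha256:a1695d6ce8bd5b7d2355edbbd0898111b45ff3435a1ab08b8365805bc3f3ad00", 223144592)`, where `path` is wherever the file was saved locally. Because the size is the same in both revisions, only the digest distinguishes the old weights from the new ones.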

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
     "epoch": 16.50206440957886,
     "total_flos": 1.949150849531904e+19,
-    "train_loss": 0.8644906110763549,
-    "train_runtime": 22776.5193,
-    "train_samples_per_second": 21.074,
-    "train_steps_per_second": 0.439
 }

 {
     "epoch": 16.50206440957886,
     "total_flos": 1.949150849531904e+19,
+    "train_loss": 0.8641470371723176,
+    "train_runtime": 21435.9254,
+    "train_samples_per_second": 22.392,
+    "train_steps_per_second": 0.467
 }

trainer_state.json CHANGED

The diff for this file is too large to render.