End of training
- README.md +29 -104
- model.safetensors +1 -1
README.md
CHANGED
|
@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
|
|
| 14 |
|
| 15 |
This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
|
| 16 |
It achieves the following results on the evaluation set:
|
| 17 |
-
- Loss: 0.
|
| 18 |
|
| 19 |
## Model description
|
| 20 |
|
|
@@ -39,112 +39,37 @@ The following hyperparameters were used during training:
|
|
| 39 |
- seed: 42
|
| 40 |
- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
|
| 41 |
- lr_scheduler_type: linear
|
| 42 |
-
- num_epochs:
|
| 43 |
|
| 44 |
### Training results
|
| 45 |
|
| 46 |
-
| Training Loss | Epoch | Step | Validation Loss |
|
| 47 |
-
|:-------------:|:-----:|:----:|:---------------:|
|
| 48 |
-
| 0.
|
| 49 |
-
| 0.
|
| 50 |
-
| 0.
|
| 51 |
-
| 0.
|
| 52 |
-
| 0.
|
| 53 |
-
| 0.
|
| 54 |
-
| 0.
|
| 55 |
-
| 0.
|
| 56 |
-
| 0.
|
| 57 |
-
| 0.
|
| 58 |
-
| 0.
|
| 59 |
-
| 0.
|
| 60 |
-
| 0.
|
| 61 |
-
| 0.
|
| 62 |
-
| 0.
|
| 63 |
-
| 0.
|
| 64 |
-
| 0.
|
| 65 |
-
| 0.
|
| 66 |
-
| 0.
|
| 67 |
-
| 0.
|
| 68 |
-
| 0.
|
| 69 |
-
| 0.
|
| 70 |
-
| 0.
|
| 71 |
-
| 0.
|
| 72 |
-
| 0.
|
| 73 |
-
| 0.019 | 26.0 | 52 | 0.1203 |
|
| 74 |
-
| 0.0451 | 27.0 | 54 | 0.1404 |
|
| 75 |
-
| 0.0117 | 28.0 | 56 | 0.1622 |
|
| 76 |
-
| 0.0392 | 29.0 | 58 | 0.1449 |
|
| 77 |
-
| 0.0258 | 30.0 | 60 | 0.1155 |
|
| 78 |
-
| 0.0077 | 31.0 | 62 | 0.1328 |
|
| 79 |
-
| 0.0101 | 32.0 | 64 | 0.1560 |
|
| 80 |
-
| 0.0035 | 33.0 | 66 | 0.1782 |
|
| 81 |
-
| 0.0069 | 34.0 | 68 | 0.1546 |
|
| 82 |
-
| 0.0387 | 35.0 | 70 | 0.1098 |
|
| 83 |
-
| 0.0519 | 36.0 | 72 | 0.1074 |
|
| 84 |
-
| 0.0195 | 37.0 | 74 | 0.1232 |
|
| 85 |
-
| 0.0434 | 38.0 | 76 | 0.1861 |
|
| 86 |
-
| 0.0204 | 39.0 | 78 | 0.1761 |
|
| 87 |
-
| 0.0035 | 40.0 | 80 | 0.1410 |
|
| 88 |
-
| 0.0143 | 41.0 | 82 | 0.1214 |
|
| 89 |
-
| 0.025 | 42.0 | 84 | 0.1252 |
|
| 90 |
-
| 0.0299 | 43.0 | 86 | 0.1141 |
|
| 91 |
-
| 0.0093 | 44.0 | 88 | 0.1385 |
|
| 92 |
-
| 0.0436 | 45.0 | 90 | 0.1980 |
|
| 93 |
-
| 0.0267 | 46.0 | 92 | 0.1837 |
|
| 94 |
-
| 0.0074 | 47.0 | 94 | 0.1228 |
|
| 95 |
-
| 0.0332 | 48.0 | 96 | 0.0988 |
|
| 96 |
-
| 0.0408 | 49.0 | 98 | 0.0990 |
|
| 97 |
-
| 0.0336 | 50.0 | 100 | 0.1037 |
|
| 98 |
-
| 0.0053 | 51.0 | 102 | 0.1642 |
|
| 99 |
-
| 0.0684 | 52.0 | 104 | 0.2185 |
|
| 100 |
-
| 0.0595 | 53.0 | 106 | 0.1987 |
|
| 101 |
-
| 0.0131 | 54.0 | 108 | 0.1363 |
|
| 102 |
-
| 0.0222 | 55.0 | 110 | 0.1014 |
|
| 103 |
-
| 0.0364 | 56.0 | 112 | 0.0998 |
|
| 104 |
-
| 0.0208 | 57.0 | 114 | 0.1108 |
|
| 105 |
-
| 0.0124 | 58.0 | 116 | 0.1468 |
|
| 106 |
-
| 0.0025 | 59.0 | 118 | 0.1517 |
|
| 107 |
-
| 0.0026 | 60.0 | 120 | 0.1379 |
|
| 108 |
-
| 0.0231 | 61.0 | 122 | 0.1150 |
|
| 109 |
-
| 0.0307 | 62.0 | 124 | 0.0980 |
|
| 110 |
-
| 0.0265 | 63.0 | 126 | 0.0980 |
|
| 111 |
-
| 0.0273 | 64.0 | 128 | 0.1126 |
|
| 112 |
-
| 0.005 | 65.0 | 130 | 0.1476 |
|
| 113 |
-
| 0.0106 | 66.0 | 132 | 0.1501 |
|
| 114 |
-
| 0.0046 | 67.0 | 134 | 0.1307 |
|
| 115 |
-
| 0.0194 | 68.0 | 136 | 0.1038 |
|
| 116 |
-
| 0.0219 | 69.0 | 138 | 0.0962 |
|
| 117 |
-
| 0.0222 | 70.0 | 140 | 0.1026 |
|
| 118 |
-
| 0.0083 | 71.0 | 142 | 0.1166 |
|
| 119 |
-
| 0.0246 | 72.0 | 144 | 0.1447 |
|
| 120 |
-
| 0.0386 | 73.0 | 146 | 0.1429 |
|
| 121 |
-
| 0.0297 | 74.0 | 148 | 0.1260 |
|
| 122 |
-
| 0.009 | 75.0 | 150 | 0.1061 |
|
| 123 |
-
| 0.0124 | 76.0 | 152 | 0.0980 |
|
| 124 |
-
| 0.0069 | 77.0 | 154 | 0.1042 |
|
| 125 |
-
| 0.0123 | 78.0 | 156 | 0.1096 |
|
| 126 |
-
| 0.0209 | 79.0 | 158 | 0.1221 |
|
| 127 |
-
| 0.0169 | 80.0 | 160 | 0.1210 |
|
| 128 |
-
| 0.0027 | 81.0 | 162 | 0.1101 |
|
| 129 |
-
| 0.0141 | 82.0 | 164 | 0.1073 |
|
| 130 |
-
| 0.0132 | 83.0 | 166 | 0.1051 |
|
| 131 |
-
| 0.0069 | 84.0 | 168 | 0.0983 |
|
| 132 |
-
| 0.0203 | 85.0 | 170 | 0.1027 |
|
| 133 |
-
| 0.0056 | 86.0 | 172 | 0.1063 |
|
| 134 |
-
| 0.0078 | 87.0 | 174 | 0.1102 |
|
| 135 |
-
| 0.0139 | 88.0 | 176 | 0.1143 |
|
| 136 |
-
| 0.0114 | 89.0 | 178 | 0.1109 |
|
| 137 |
-
| 0.0046 | 90.0 | 180 | 0.1053 |
|
| 138 |
-
| 0.0072 | 91.0 | 182 | 0.1040 |
|
| 139 |
-
| 0.0075 | 92.0 | 184 | 0.1060 |
|
| 140 |
-
| 0.0055 | 93.0 | 186 | 0.1051 |
|
| 141 |
-
| 0.0019 | 94.0 | 188 | 0.1074 |
|
| 142 |
-
| 0.0035 | 95.0 | 190 | 0.1107 |
|
| 143 |
-
| 0.0037 | 96.0 | 192 | 0.1118 |
|
| 144 |
-
| 0.0125 | 97.0 | 194 | 0.1134 |
|
| 145 |
-
| 0.009 | 98.0 | 196 | 0.1125 |
|
| 146 |
-
| 0.005 | 99.0 | 198 | 0.1112 |
|
| 147 |
-
| 0.0033 | 100.0 | 200 | 0.1108 |
|
| 148 |
|
| 149 |
|
| 150 |
### Framework versions
|
|
|
|
| 14 |
|
| 15 |
This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
|
| 16 |
It achieves the following results on the evaluation set:
|
| 17 |
+
- Loss: 0.0537
|
| 18 |
|
| 19 |
## Model description
|
| 20 |
|
|
|
|
| 39 |
- seed: 42
|
| 40 |
- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
|
| 41 |
- lr_scheduler_type: linear
|
| 42 |
+
- num_epochs: 25
|
| 43 |
|
| 44 |
### Training results
|
| 45 |
|
| 46 |
+
| Training Loss | Epoch | Step | Validation Loss |
|
| 47 |
+
|:-------------:|:-----:|:------:|:---------------:|
|
| 48 |
+
| 0.1919 | 1.0 | 4350 | 0.0718 |
|
| 49 |
+
| 0.0072 | 2.0 | 8700 | 0.0734 |
|
| 50 |
+
| 0.0009 | 3.0 | 13050 | 0.0603 |
|
| 51 |
+
| 0.049 | 4.0 | 17400 | 0.0770 |
|
| 52 |
+
| 0.0496 | 5.0 | 21750 | 0.0554 |
|
| 53 |
+
| 0.001 | 6.0 | 26100 | 0.0565 |
|
| 54 |
+
| 0.0027 | 7.0 | 30450 | 0.0561 |
|
| 55 |
+
| 0.0041 | 8.0 | 34800 | 0.0607 |
|
| 56 |
+
| 0.0273 | 9.0 | 39150 | 0.0565 |
|
| 57 |
+
| 0.0344 | 10.0 | 43500 | 0.0580 |
|
| 58 |
+
| 0.0246 | 11.0 | 47850 | 0.0557 |
|
| 59 |
+
| 0.0187 | 12.0 | 52200 | 0.0624 |
|
| 60 |
+
| 0.0828 | 13.0 | 56550 | 0.0523 |
|
| 61 |
+
| 0.059 | 14.0 | 60900 | 0.0537 |
|
| 62 |
+
| 0.2687 | 15.0 | 65250 | 0.0561 |
|
| 63 |
+
| 0.0593 | 16.0 | 69600 | 0.0565 |
|
| 64 |
+
| 0.0015 | 17.0 | 73950 | 0.0541 |
|
| 65 |
+
| 0.0023 | 18.0 | 78300 | 0.0558 |
|
| 66 |
+
| 0.0001 | 19.0 | 82650 | 0.0532 |
|
| 67 |
+
| 0.0026 | 20.0 | 87000 | 0.0547 |
|
| 68 |
+
| 0.0339 | 21.0 | 91350 | 0.0543 |
|
| 69 |
+
| 0.001 | 22.0 | 95700 | 0.0567 |
|
| 70 |
+
| 0.0553 | 23.0 | 100050 | 0.0545 |
|
| 71 |
+
| 0.0038 | 24.0 | 104400 | 0.0536 |
|
| 72 |
+
| 0.012 | 25.0 | 108750 | 0.0537 |
|
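The updated training-results table can be checked with a few lines of Python. The sketch below is illustrative only: the validation losses and the 4350 steps per epoch are copied from the added rows of the diff, and the variable names are hypothetical, not part of the training script. It finds the epoch with the lowest validation loss and compares it with the final-epoch value, which appears to be the figure reported as the evaluation loss (0.0537).

```python
# Validation loss per epoch (epochs 1..25), copied from the added table rows.
val_loss = [
    0.0718, 0.0734, 0.0603, 0.0770, 0.0554,
    0.0565, 0.0561, 0.0607, 0.0565, 0.0580,
    0.0557, 0.0624, 0.0523, 0.0537, 0.0561,
    0.0565, 0.0541, 0.0558, 0.0532, 0.0547,
    0.0543, 0.0567, 0.0545, 0.0536, 0.0537,
]

# The Step column grows by 4350 per epoch in the table.
steps_per_epoch = 4350

# Index of the smallest validation loss; epochs are 1-based.
best_idx = min(range(len(val_loss)), key=val_loss.__getitem__)
best_epoch = best_idx + 1
best_step = best_epoch * steps_per_epoch
best_loss = val_loss[best_idx]

# Last row of the table, i.e. the state at the end of training.
final_loss = val_loss[-1]
total_steps = len(val_loss) * steps_per_epoch

print(f"best: epoch {best_epoch}, step {best_step}, loss {best_loss}")
print(f"final: epoch {len(val_loss)}, step {total_steps}, loss {final_loss}")
```

On these numbers the lowest validation loss falls at epoch 13 rather than at the last epoch, so the reported evaluation loss describes the final weights and not the best point on the curve.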
| 73 |
|
| 74 |
|
| 75 |
### Framework versions
|
model.safetensors
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
size 11213056
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:71e4e0f44a3581c024ffce11a2f776b97dd399603f1801b43ffc195ae4fb8054
|
| 3 |
size 11213056
|
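The `model.safetensors` entry in this commit is a Git LFS pointer (version, `oid sha256:...`, `size`), not the weights themselves. As a minimal sketch, assuming the real file has been downloaded locally, the pointer can be parsed and compared against the file's size and SHA-256 digest. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and belong to no library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    data = Path(path).read_bytes()
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer contents as shown on the added side of the diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:71e4e0f44a3581c024ffce11a2f776b97dd399603f1801b43ffc195ae4fb8054
size 11213056
"""

pointer = parse_lfs_pointer(POINTER)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local copy would then indicate whether the download corresponds to this commit. The function reads the whole file into memory, which is acceptable for an 11 MB file but would need chunked hashing for larger checkpoints.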