quantumLeopard/calculator_model_test

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0081
 ## Model description
@@ -40,51 +40,52 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 40
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 2.3082        | 1.0   | 13   | 1.5231          |
-| 1.1642        | 2.0   | 26   | 0.6529          |
-| 0.5406        | 3.0   | 39   | 0.4543          |
-| 0.4265        | 4.0   | 52   | 0.3722          |
-| 0.3552        | 5.0   | 65   | 0.3185          |
-| 0.3062        | 6.0   | 78   | 0.2691          |
-| 0.2728        | 7.0   | 91   | 0.2339          |
-| 0.2413        | 8.0   | 104  | 0.2233          |
-| 0.2198        | 9.0   | 117  | 0.1933          |
-| 0.1998        | 10.0  | 130  | 0.1740          |
-| 0.1837        | 11.0  | 143  | 0.1533          |
-| 0.1708        | 12.0  | 156  | 0.1555          |
-| 0.1567        | 13.0  | 169  | 0.1211          |
-| 0.1362        | 14.0  | 182  | 0.1083          |
-| 0.1219        | 15.0  | 195  | 0.0929          |
-| 0.1065        | 16.0  | 208  | 0.0694          |
-| 0.0861        | 17.0  | 221  | 0.0588          |
-| 0.0690        | 18.0  | 234  | 0.0387          |
-| 0.0567        | 19.0  | 247  | 0.0314          |
-| 0.0504        | 20.0  | 260  | 0.0283          |
-| 0.0448        | 21.0  | 273  | 0.0296          |
-| 0.0388        | 22.0  | 286  | 0.0217          |
-| 0.0339        | 23.0  | 299  | 0.0192          |
-| 0.0304        | 24.0  | 312  | 0.0181          |
-| 0.0294        | 25.0  | 325  | 0.0158          |
-| 0.0279        | 26.0  | 338  | 0.0138          |
-| 0.0249        | 27.0  | 351  | 0.0134          |
-| 0.0234        | 28.0  | 364  | 0.0133          |
-| 0.0219        | 29.0  | 377  | 0.0113          |
-| 0.0204        | 30.0  | 390  | 0.0108          |
-| 0.0191        | 31.0  | 403  | 0.0100          |
-| 0.0181        | 32.0  | 416  | 0.0097          |
-| 0.0179        | 33.0  | 429  | 0.0097          |
-| 0.0169        | 34.0  | 442  | 0.0092          |
-| 0.0166        | 35.0  | 455  | 0.0085          |
-| 0.0161        | 36.0  | 468  | 0.0086          |
-| 0.0156        | 37.0  | 481  | 0.0082          |
-| 0.0155        | 38.0  | 494  | 0.0081          |
-| 0.0149        | 39.0  | 507  | 0.0081          |
-| 0.0149        | 40.0  | 520  | 0.0081          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0079
 ## Model description
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 40
+- mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 2.3081        | 1.0   | 13   | 1.5233          |
+| 1.1667        | 2.0   | 26   | 0.6557          |
+| 0.5408        | 3.0   | 39   | 0.4523          |
+| 0.4243        | 4.0   | 52   | 0.3792          |
+| 0.3582        | 5.0   | 65   | 0.3195          |
+| 0.3071        | 6.0   | 78   | 0.2719          |
+| 0.2676        | 7.0   | 91   | 0.2468          |
+| 0.2490        | 8.0   | 104  | 0.2108          |
+| 0.2189        | 9.0   | 117  | 0.1874          |
+| 0.1949        | 10.0  | 130  | 0.1616          |
+| 0.1875        | 11.0  | 143  | 0.1690          |
+| 0.1665        | 12.0  | 156  | 0.1327          |
+| 0.1458        | 13.0  | 169  | 0.1161          |
+| 0.1285        | 14.0  | 182  | 0.1057          |
+| 0.1146        | 15.0  | 195  | 0.0887          |
+| 0.1019        | 16.0  | 208  | 0.0703          |
+| 0.0837        | 17.0  | 221  | 0.0496          |
+| 0.0697        | 18.0  | 234  | 0.0432          |
+| 0.0585        | 19.0  | 247  | 0.0365          |
+| 0.0496        | 20.0  | 260  | 0.0258          |
+| 0.0427        | 21.0  | 273  | 0.0254          |
+| 0.0391        | 22.0  | 286  | 0.0223          |
+| 0.0344        | 23.0  | 299  | 0.0216          |
+| 0.0318        | 24.0  | 312  | 0.0167          |
+| 0.0284        | 25.0  | 325  | 0.0155          |
+| 0.0263        | 26.0  | 338  | 0.0139          |
+| 0.0246        | 27.0  | 351  | 0.0126          |
+| 0.0230        | 28.0  | 364  | 0.0128          |
+| 0.0214        | 29.0  | 377  | 0.0115          |
+| 0.0206        | 30.0  | 390  | 0.0103          |
+| 0.0194        | 31.0  | 403  | 0.0096          |
+| 0.0184        | 32.0  | 416  | 0.0098          |
+| 0.0180        | 33.0  | 429  | 0.0104          |
+| 0.0171        | 34.0  | 442  | 0.0089          |
+| 0.0163        | 35.0  | 455  | 0.0084          |
+| 0.0158        | 36.0  | 468  | 0.0082          |
+| 0.0155        | 37.0  | 481  | 0.0081          |
+| 0.0150        | 38.0  | 494  | 0.0081          |
+| 0.0147        | 39.0  | 507  | 0.0080          |
+| 0.0148        | 40.0  | 520  | 0.0079          |
 ### Framework versions

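The README hunk above lists `lr_scheduler_type: linear` and `num_epochs: 40`, and the results table advances 13 steps per epoch up to step 520. A minimal sketch of how a linear schedule would scale the learning rate over those steps is given here. The base learning rate and any warmup setting fall outside the visible hunk, so `warmup_steps=0` is an assumption and only the multiplicative factor is computed. `linear_lr_factor` is a hypothetical helper, not a library function.

```python
def linear_lr_factor(step: int, total_steps: int, warmup_steps: int = 0) -> float:
    """Multiplier applied to the base learning rate at a given optimizer step.

    Ramps linearly from 0 to 1 during warmup, then decays linearly to 0
    at total_steps. warmup_steps=0 is an assumption; the diff does not show it.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


STEPS_PER_EPOCH = 13  # from the Step column: 13, 26, 39, ...
NUM_EPOCHS = 40       # from the hyperparameter list
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

for epoch in (1, 10, 20, 30, 40):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d}  step {step:3d}  lr factor {linear_lr_factor(step, TOTAL_STEPS):.3f}")
```

Under these assumptions the factor reaches zero at step 520, which would be consistent with the validation loss flattening over the last few epochs in both versions of the table.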
model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f06118d1dd00376438b2dde149e4bf3352a4662309577c8b38a092eb11d30355
 size 31232228

 version https://git-lfs.github.com/spec/v1
+oid sha256:58632e0b5c962252ccded59a5ce92b5629644a041ec1d5da1ac3797791d0764a
 size 31232228
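
The `model.safetensors` change above touches only a Git LFS pointer: the `oid` differs while `size` stays at 31232228 bytes. A minimal sketch for checking a downloaded file against such a pointer follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is left to the caller.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the space-separated key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file has the size and SHA-256 digest named in the pointer."""
    if pointer["hash_algo"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['hash_algo']}")
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so a large weights file is never fully held in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text copied from the new side of the diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:58632e0b5c962252ccded59a5ce92b5629644a041ec1d5da1ac3797791d0764a
size 31232228
"""

pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["hash_algo"], pointer["size"])
```

Calling `matches_pointer(Path("model.safetensors"), pointer)` on a local copy would then confirm whether it is the file this commit refers to. An unchanged size with a new digest is what one would expect when weights of the same shape and dtype are retrained.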