ka24j13/calculator_model_test

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.1482
 ## Model description
@@ -39,27 +39,97 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 10
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 1.9124        | 1.0   | 5    | 1.7523          |
-| 1.5949        | 2.0   | 10   | 1.4888          |
-| 1.4837        | 3.0   | 15   | 1.4655          |
-| 1.4517        | 4.0   | 20   | 1.4177          |
-| 1.4030        | 5.0   | 25   | 1.3661          |
-| 1.3564        | 6.0   | 30   | 1.3174          |
-| 1.3069        | 7.0   | 35   | 1.2498          |
-| 1.2406        | 8.0   | 40   | 1.1865          |
-| 1.1917        | 9.0   | 45   | 1.1629          |
-| 1.1738        | 10.0  | 50   | 1.1482          |
 ### Framework versions
 - Transformers 5.0.0
-- Pytorch 2.10.0+cpu
 - Datasets 4.0.0
 - Tokenizers 0.22.2

 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.2400
 ## Model description
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- num_epochs: 80
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 1.8223        | 1.0   | 6    | 2.6713          |
+| 1.6974        | 2.0   | 12   | 1.2700          |
+| 1.1555        | 3.0   | 18   | 1.0111          |
+| 0.9334        | 4.0   | 24   | 0.9271          |
+| 0.8302        | 5.0   | 30   | 0.8386          |
+| 0.8144        | 6.0   | 36   | 0.6687          |
+| 0.7410        | 7.0   | 42   | 0.8035          |
+| 1.0508        | 8.0   | 48   | 0.7194          |
+| 0.6932        | 9.0   | 54   | 0.6786          |
+| 0.7005        | 10.0  | 60   | 0.6282          |
+| 0.6896        | 11.0  | 66   | 0.7197          |
+| 0.7646        | 12.0  | 72   | 1.0102          |
+| 0.7867        | 13.0  | 78   | 0.7615          |
+| 0.6609        | 14.0  | 84   | 0.5590          |
+| 0.6228        | 15.0  | 90   | 0.5399          |
+| 0.5934        | 16.0  | 96   | 0.6468          |
+| 0.6700        | 17.0  | 102  | 0.9275          |
+| 0.7554        | 18.0  | 108  | 0.5375          |
+| 0.6135        | 19.0  | 114  | 0.4792          |
+| 0.5655        | 20.0  | 120  | 0.5007          |
+| 0.5590        | 21.0  | 126  | 0.4746          |
+| 0.5327        | 22.0  | 132  | 0.5993          |
+| 0.5752        | 23.0  | 138  | 0.4929          |
+| 0.5441        | 24.0  | 144  | 0.5178          |
+| 0.5788        | 25.0  | 150  | 0.6241          |
+| 0.6247        | 26.0  | 156  | 0.4842          |
+| 0.5505        | 27.0  | 162  | 0.4867          |
+| 0.5455        | 28.0  | 168  | 0.4462          |
+| 0.5289        | 29.0  | 174  | 0.5937          |
+| 0.5987        | 30.0  | 180  | 0.6013          |
+| 0.5828        | 31.0  | 186  | 0.5909          |
+| 0.6133        | 32.0  | 192  | 0.4737          |
+| 0.5151        | 33.0  | 198  | 0.5884          |
+| 0.5772        | 34.0  | 204  | 0.4821          |
+| 0.5179        | 35.0  | 210  | 0.4324          |
+| 0.4850        | 36.0  | 216  | 0.4085          |
+| 0.4696        | 37.0  | 222  | 0.4039          |
+| 0.4619        | 38.0  | 228  | 0.5007          |
+| 0.5363        | 39.0  | 234  | 0.5064          |
+| 0.5657        | 40.0  | 240  | 0.4818          |
+| 0.5086        | 41.0  | 246  | 0.4906          |
+| 0.4999        | 42.0  | 252  | 0.5442          |
+| 0.5849        | 43.0  | 258  | 0.3945          |
+| 0.5278        | 44.0  | 264  | 0.4150          |
+| 0.4555        | 45.0  | 270  | 0.3989          |
+| 0.4777        | 46.0  | 276  | 0.4117          |
+| 0.4824        | 47.0  | 282  | 0.3651          |
+| 0.4818        | 48.0  | 288  | 0.3574          |
+| 0.4731        | 49.0  | 294  | 0.3521          |
+| 0.4505        | 50.0  | 300  | 0.3882          |
+| 0.4570        | 51.0  | 306  | 0.3543          |
+| 0.4322        | 52.0  | 312  | 0.3370          |
+| 0.4381        | 53.0  | 318  | 0.3251          |
+| 0.3960        | 54.0  | 324  | 0.3653          |
+| 0.4062        | 55.0  | 330  | 0.3998          |
+| 0.4386        | 56.0  | 336  | 0.3577          |
+| 0.4498        | 57.0  | 342  | 0.3895          |
+| 0.4408        | 58.0  | 348  | 0.3248          |
+| 0.3978        | 59.0  | 354  | 0.3223          |
+| 0.3909        | 60.0  | 360  | 0.3173          |
+| 0.3818        | 61.0  | 366  | 0.2892          |
+| 0.4036        | 62.0  | 372  | 0.2931          |
+| 0.3909        | 63.0  | 378  | 0.2990          |
+| 0.3724        | 64.0  | 384  | 0.2945          |
+| 0.3845        | 65.0  | 390  | 0.3036          |
+| 0.3685        | 66.0  | 396  | 0.3095          |
+| 0.3648        | 67.0  | 402  | 0.3112          |
+| 0.3925        | 68.0  | 408  | 0.2939          |
+| 0.3693        | 69.0  | 414  | 0.2720          |
+| 0.3661        | 70.0  | 420  | 0.2579          |
+| 0.3563        | 71.0  | 426  | 0.2672          |
+| 0.3550        | 72.0  | 432  | 0.2657          |
+| 0.3509        | 73.0  | 438  | 0.2525          |
+| 0.3293        | 74.0  | 444  | 0.2487          |
+| 0.3554        | 75.0  | 450  | 0.2458          |
+| 0.3365        | 76.0  | 456  | 0.2447          |
+| 0.3331        | 77.0  | 462  | 0.2475          |
+| 0.3538        | 78.0  | 468  | 0.2416          |
+| 0.3264        | 79.0  | 474  | 0.2418          |
+| 0.3456        | 80.0  | 480  | 0.2400          |
 ### Framework versions
 - Transformers 5.0.0
+- Pytorch 2.10.0+cu128
 - Datasets 4.0.0
 - Tokenizers 0.22.2
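
For readers comparing the two runs, here is a minimal sketch of how a `linear` learning-rate schedule scales the base rate over the new 80-epoch run. It assumes 6 optimizer steps per epoch (read from the Step column of the new results table) and zero warmup steps; the warmup setting and the base learning rate are not visible in this hunk, so only the multiplier is computed. The function name `linear_lr_multiplier` is hypothetical and is not part of the Transformers API.

```python
def linear_lr_multiplier(step: int, total_steps: int, warmup_steps: int = 0) -> float:
    """Return the factor applied to the base learning rate at a given step.

    Linear warmup from 0 to 1 over `warmup_steps`, then linear decay to 0
    at `total_steps`. warmup_steps=0 is an assumption, not taken from the card.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


# Values read from the new results table: 6 steps per epoch, 80 epochs.
STEPS_PER_EPOCH = 6
NUM_EPOCHS = 80
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 480, the last Step value in the table

if __name__ == "__main__":
    for epoch in (1, 20, 40, 60, 80):
        step = epoch * STEPS_PER_EPOCH
        factor = linear_lr_multiplier(step, TOTAL_STEPS)
        print(f"epoch {epoch:>2}  step {step:>3}  lr multiplier {factor:.4f}")
```

Under these assumptions the multiplier reaches zero at step 480, which would fit the validation loss flattening out over the last few epochs of the new table, though the table alone does not establish the cause.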