dsuwala/calculator_model_test

@@ -14,7 +14,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5990
 ## Model description
@@ -40,51 +40,52 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 40
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 3.5011        | 1.0   | 5    | 2.9098          |
-| 2.5679        | 2.0   | 10   | 2.0979          |
-| 1.9533        | 3.0   | 15   | 1.7263          |
-| 1.6643        | 4.0   | 20   | 1.5940          |
-| 1.5773        | 5.0   | 25   | 1.5335          |
-| 1.5508        | 6.0   | 30   | 1.5127          |
-| 1.5290        | 7.0   | 35   | 1.4828          |
-| 1.4819        | 8.0   | 40   | 1.4347          |
-| 1.4571        | 9.0   | 45   | 1.5285          |
-| 1.4630        | 10.0  | 50   | 1.4265          |
-| 1.3942        | 11.0  | 55   | 1.3409          |
-| 1.3506        | 12.0  | 60   | 1.3209          |
-| 1.3223        | 13.0  | 65   | 1.2914          |
-| 1.2856        | 14.0  | 70   | 1.2379          |
-| 1.2325        | 15.0  | 75   | 1.1640          |
-| 1.2030        | 16.0  | 80   | 1.1480          |
-| 1.1457        | 17.0  | 85   | 1.0848          |
-| 1.1101        | 18.0  | 90   | 1.0746          |
-| 1.0606        | 19.0  | 95   | 0.9998          |
-| 1.0178        | 20.0  | 100  | 0.9548          |
-| 0.9794        | 21.0  | 105  | 0.9146          |
-| 0.9373        | 22.0  | 110  | 0.8770          |
-| 0.9062        | 23.0  | 115  | 0.8525          |
-| 0.8708        | 24.0  | 120  | 0.8255          |
-| 0.8462        | 25.0  | 125  | 0.7827          |
-| 0.8156        | 26.0  | 130  | 0.7522          |
-| 0.7904        | 27.0  | 135  | 0.7324          |
-| 0.7911        | 28.0  | 140  | 0.7292          |
-| 0.7757        | 29.0  | 145  | 0.7036          |
-| 0.7602        | 30.0  | 150  | 0.7177          |
-| 0.7413        | 31.0  | 155  | 0.6796          |
-| 0.7222        | 32.0  | 160  | 0.6597          |
-| 0.7103        | 33.0  | 165  | 0.6559          |
-| 0.6977        | 34.0  | 170  | 0.6391          |
-| 0.6959        | 35.0  | 175  | 0.6305          |
-| 0.6853        | 36.0  | 180  | 0.6193          |
-| 0.6737        | 37.0  | 185  | 0.6142          |
-| 0.6696        | 38.0  | 190  | 0.6065          |
-| 0.6666        | 39.0  | 195  | 0.6015          |
-| 0.6598        | 40.0  | 200  | 0.5990          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.0681
 ## Model description
 - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 40
+- mixed_precision_training: Native AMP
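As a rough illustration of what `lr_scheduler_type: linear` means over this run, the sketch below computes a linearly decaying learning rate across the 240 optimizer steps shown in the new results table (40 epochs at 6 steps each). The base learning rate and the warmup length are not visible in this hunk, so `BASE_LR` and `warmup_steps` are hypothetical placeholders, and `linear_lr` is an illustrative helper rather than the library's own function.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


STEPS_PER_EPOCH = 6                  # from the new table: step 6 at epoch 1.0
TOTAL_STEPS = 40 * STEPS_PER_EPOCH   # num_epochs: 40, i.e. 240 steps
BASE_LR = 1e-3                       # hypothetical value, not shown in this diff

for epoch in (0, 10, 20, 30, 40):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:2d}  step {step:3d}  lr {linear_lr(step, TOTAL_STEPS, BASE_LR):.6f}")
```

With no warmup, the rate is halved by the midpoint of training and reaches zero at the final step, which would be consistent with the small loss changes in the last few epochs.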
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 3.0932        | 1.0   | 6    | 2.3284          |
+| 2.0708        | 2.0   | 12   | 1.7597          |
+| 1.6188        | 3.0   | 18   | 1.4260          |
+| 1.2984        | 4.0   | 24   | 1.1225          |
+| 1.0571        | 5.0   | 30   | 0.9371          |
+| 0.9006        | 6.0   | 36   | 0.8191          |
+| 0.8187        | 7.0   | 42   | 0.7260          |
+| 0.7281        | 8.0   | 48   | 0.6762          |
+| 0.6804        | 9.0   | 54   | 0.6348          |
+| 0.6251        | 10.0  | 60   | 0.5609          |
+| 0.5701        | 11.0  | 66   | 0.5090          |
+| 0.5325        | 12.0  | 72   | 0.4674          |
+| 0.4973        | 13.0  | 78   | 0.4407          |
+| 0.4619        | 14.0  | 84   | 0.4090          |
+| 0.4408        | 15.0  | 90   | 0.3996          |
+| 0.4311        | 16.0  | 96   | 0.4260          |
+| 0.4237        | 17.0  | 102  | 0.3490          |
+| 0.3734        | 18.0  | 108  | 0.3225          |
+| 0.3387        | 19.0  | 114  | 0.2895          |
+| 0.3111        | 20.0  | 120  | 0.2506          |
+| 0.2790        | 21.0  | 126  | 0.2317          |
+| 0.2652        | 22.0  | 132  | 0.2102          |
+| 0.2521        | 23.0  | 138  | 0.1889          |
+| 0.2293        | 24.0  | 144  | 0.1697          |
+| 0.2031        | 25.0  | 150  | 0.1413          |
+| 0.1844        | 26.0  | 156  | 0.1269          |
+| 0.1856        | 27.0  | 162  | 0.1358          |
+| 0.1787        | 28.0  | 168  | 0.1104          |
+| 0.1549        | 29.0  | 174  | 0.1175          |
+| 0.1644        | 30.0  | 180  | 0.1034          |
+| 0.1406        | 31.0  | 186  | 0.0931          |
+| 0.1379        | 32.0  | 192  | 0.0888          |
+| 0.1299        | 33.0  | 198  | 0.0916          |
+| 0.1265        | 34.0  | 204  | 0.0809          |
+| 0.1216        | 35.0  | 210  | 0.0747          |
+| 0.1160        | 36.0  | 216  | 0.0725          |
+| 0.1106        | 37.0  | 222  | 0.0707          |
+| 0.1092        | 38.0  | 228  | 0.0686          |
+| 0.1076        | 39.0  | 234  | 0.0689          |
+| 0.1113        | 40.0  | 240  | 0.0681          |
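The two result tables in this diff can be compared with a few lines of arithmetic. The sketch below uses only numbers copied from the tables (final step and final validation loss of each run, plus the per-epoch validation losses of the new run); the variable and function names are illustrative.

```python
# Final rows of the removed (old) and added (new) tables.
old_run = {"epochs": 40, "final_step": 200, "final_val_loss": 0.5990}
new_run = {"epochs": 40, "final_step": 240, "final_val_loss": 0.0681}


def steps_per_epoch(run: dict) -> int:
    """Optimizer steps per epoch, derived from the last table row."""
    return run["final_step"] // run["epochs"]


# Relative reduction in final validation loss between the two runs.
relative_drop = (old_run["final_val_loss"] - new_run["final_val_loss"]) / old_run["final_val_loss"]

# Validation loss per epoch for the new run (epochs 1 to 40).
new_val_losses = [
    2.3284, 1.7597, 1.4260, 1.1225, 0.9371, 0.8191, 0.7260, 0.6762, 0.6348, 0.5609,
    0.5090, 0.4674, 0.4407, 0.4090, 0.3996, 0.4260, 0.3490, 0.3225, 0.2895, 0.2506,
    0.2317, 0.2102, 0.1889, 0.1697, 0.1413, 0.1269, 0.1358, 0.1104, 0.1175, 0.1034,
    0.0931, 0.0888, 0.0916, 0.0809, 0.0747, 0.0725, 0.0707, 0.0686, 0.0689, 0.0681,
]

# Epochs where validation loss went up relative to the previous epoch.
rising_epochs = [
    epoch
    for epoch, (prev, cur) in enumerate(zip(new_val_losses, new_val_losses[1:]), start=2)
    if cur > prev
]

# Epoch with the lowest validation loss (1-indexed).
best_epoch = new_val_losses.index(min(new_val_losses)) + 1

print("steps/epoch:", steps_per_epoch(old_run), "->", steps_per_epoch(new_run))
print(f"relative drop in final validation loss: {relative_drop:.1%}")
print("epochs with rising validation loss:", rising_epochs)
print("best epoch:", best_epoch)
```

The change from 5 to 6 steps per epoch suggests the training set size or effective batch size differs between the two runs, though the diff does not say which.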
 ### Framework versions