ninagroot/Baby-Llama-58M-RUN1

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 12.3739
 ## Model description
@@ -46,46 +46,46 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 81.6769       | 1.0   | 2    | 74.8633         |
-| 81.1451       | 2.0   | 4    | 73.7956         |
-| 78.7609       | 3.0   | 6    | 71.9935         |
-| 79.1356       | 4.0   | 8    | 69.4148         |
-| 75.726        | 5.0   | 10   | 67.8374         |
-| 74.2979       | 6.0   | 12   | 64.3771         |
-| 70.3903       | 7.0   | 14   | 61.1100         |
-| 67.5033       | 8.0   | 16   | 58.1597         |
-| 64.8955       | 9.0   | 18   | 55.2518         |
-| 61.2792       | 10.0  | 20   | 52.1664         |
-| 57.5665       | 11.0  | 22   | 48.9584         |
-| 54.0972       | 12.0  | 24   | 45.8081         |
-| 50.2098       | 13.0  | 26   | 42.8455         |
-| 48.9371       | 14.0  | 28   | 40.1582         |
-| 45.2235       | 15.0  | 30   | 37.7302         |
-| 44.1405       | 16.0  | 32   | 35.5237         |
-| 41.0789       | 17.0  | 34   | 33.5662         |
-| 40.2006       | 18.0  | 36   | 31.8106         |
-| 38.5898       | 19.0  | 38   | 30.1508         |
-| 36.2422       | 20.0  | 40   | 28.5076         |
-| 34.6463       | 21.0  | 42   | 26.5191         |
-| 30.7565       | 22.0  | 44   | 24.9482         |
-| 29.6666       | 23.0  | 46   | 23.8793         |
-| 27.6733       | 24.0  | 48   | 22.8973         |
-| 25.9126       | 25.0  | 50   | 21.6442         |
-| 25.2859       | 26.0  | 52   | 20.4439         |
-| 24.0265       | 27.0  | 54   | 19.7371         |
-| 21.8765       | 28.0  | 56   | 18.4843         |
-| 20.4426       | 29.0  | 58   | 17.2997         |
-| 18.7842       | 30.0  | 60   | 16.1685         |
-| 17.7504       | 31.0  | 62   | 15.4688         |
-| 16.5791       | 32.0  | 64   | 15.0343         |
-| 16.1571       | 33.0  | 66   | 14.1040         |
-| 15.0651       | 34.0  | 68   | 13.7322         |
-| 14.0418       | 35.0  | 70   | 13.2421         |
-| 13.6841       | 36.0  | 72   | 12.8765         |
-| 13.3316       | 37.0  | 74   | 12.5740         |
-| 13.3591       | 38.0  | 76   | 12.5028         |
-| 13.0756       | 39.0  | 78   | 12.4223         |
-| 13.0233       | 40.0  | 80   | 12.3739         |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 12.7407
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 81.6068       | 1.0   | 2    | 75.8719         |
+| 81.1403       | 2.0   | 4    | 74.6429         |
+| 78.4746       | 3.0   | 6    | 72.5604         |
+| 78.6147       | 4.0   | 8    | 69.6859         |
+| 75.1485       | 5.0   | 10   | 67.9944         |
+| 73.5182       | 6.0   | 12   | 64.5075         |
+| 69.6393       | 7.0   | 14   | 61.3852         |
+| 66.9895       | 8.0   | 16   | 58.5262         |
+| 64.4746       | 9.0   | 18   | 55.6940         |
+| 60.8097       | 10.0  | 20   | 52.6993         |
+| 57.1714       | 11.0  | 22   | 49.5786         |
+| 53.8474       | 12.0  | 24   | 46.5081         |
+| 49.9873       | 13.0  | 26   | 43.6358         |
+| 48.7366       | 14.0  | 28   | 41.0406         |
+| 45.0539       | 15.0  | 30   | 38.7263         |
+| 44.0504       | 16.0  | 32   | 36.6352         |
+| 40.9533       | 17.0  | 34   | 34.6685         |
+| 39.9931       | 18.0  | 36   | 32.7875         |
+| 38.116        | 19.0  | 38   | 30.8567         |
+| 35.4181       | 20.0  | 40   | 28.9705         |
+| 34.0383       | 21.0  | 42   | 27.4282         |
+| 30.7991       | 22.0  | 44   | 26.4171         |
+| 29.8348       | 23.0  | 46   | 24.9225         |
+| 27.9282       | 24.0  | 48   | 23.9103         |
+| 25.8511       | 25.0  | 50   | 22.9495         |
+| 25.1711       | 26.0  | 52   | 21.5530         |
+| 24.2361       | 27.0  | 54   | 20.5871         |
+| 21.9294       | 28.0  | 56   | 19.0727         |
+| 20.435        | 29.0  | 58   | 18.0482         |
+| 18.682        | 30.0  | 60   | 17.0037         |
+| 17.4144       | 31.0  | 62   | 16.0468         |
+| 16.4872       | 32.0  | 64   | 15.2828         |
+| 16.2417       | 33.0  | 66   | 14.6359         |
+| 15.1244       | 34.0  | 68   | 14.1234         |
+| 14.0602       | 35.0  | 70   | 13.5799         |
+| 13.7722       | 36.0  | 72   | 13.3509         |
+| 13.377        | 37.0  | 74   | 12.9960         |
+| 13.4091       | 38.0  | 76   | 12.8183         |
+| 13.1398       | 39.0  | 78   | 12.7614         |
+| 13.1002       | 40.0  | 80   | 12.7407         |
 ### Framework versions
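
To make the change in the README tables easier to read, the following minimal sketch compares the final-epoch (epoch 40, step 80) losses before and after this commit. The four numbers are copied from the tables in this diff; the dictionary and function names are illustrative and not part of the repository.

```python
# Final-epoch values copied from the removed (-) and added (+) table rows.
old_final = {"train": 13.0233, "val": 12.3739}
new_final = {"train": 13.1002, "val": 12.7407}


def delta(old: float, new: float) -> tuple[float, float]:
    """Return the absolute and relative change, computed as new minus old."""
    diff = new - old
    return diff, diff / old


val_abs, val_rel = delta(old_final["val"], new_final["val"])
train_abs, train_rel = delta(old_final["train"], new_final["train"])

# A positive sign means the loss reported after this commit is higher.
print(f"validation loss: {val_abs:+.4f} ({val_rel:+.2%})")
print(f"training loss:   {train_abs:+.4f} ({train_rel:+.2%})")
```

Both deltas come out positive, so the run recorded by this commit ends with a slightly higher loss than the one it replaces.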

model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6a1714b9a7d37f2a284c5fd482259328e6821ab9e76ec376a2a7644c9d1cf168
 size 217819016

 version https://git-lfs.github.com/spec/v1
+oid sha256:b95b266c906f8a10c9a06b7ef00f57ba328d35b02576511cc7d51b8ef129717f
 size 217819016

training_args.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0551b202cc608f1cc3f396331916badd2c01ef6cd77ea9422de79e6fdc202a56
 size 4984

 version https://git-lfs.github.com/spec/v1
+oid sha256:6d49873d6da6e46a5804e20834fe83e4d20fbe84a83bb526ea7410b0ff377414
 size 4984
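
The two binary files above are stored as Git LFS pointers, where `oid sha256:` is the SHA-256 digest of the real file and `size` is its length in bytes. The sketch below shows one way to check a downloaded file against such a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the pointer text is the new `model.safetensors` pointer copied from this diff.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.strip().split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; skip hashing when the size differs
    h = hashlib.sha256()
    with path.open("rb") as f:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# Pointer for the new model.safetensors, copied from the diff above.
new_pointer = parse_lfs_pointer(
    """version https://git-lfs.github.com/spec/v1
oid sha256:b95b266c906f8a10c9a06b7ef00f57ba328d35b02576511cc7d51b8ef129717f
size 217819016
"""
)
```

With a local copy of the weights, `matches_pointer(Path("model.safetensors"), new_pointer)` would return `True` only if the file is byte-identical to the one this commit references. Both files keep the same `size` across the commit, so the digest is the only field that distinguishes the old and new versions.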