ninagroot/Baby-Llama-58M-RUN1

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.9790
+- Loss: 3.9805
 ## Model description
@@ -46,26 +46,26 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 205.6138      | 1.0   | 69   | 165.0209        |
-| 140.5155      | 2.0   | 138  | 108.3043        |
-| 71.1999       | 3.0   | 207  | 48.5052         |
-| 26.2745       | 4.0   | 276  | 20.7425         |
-| 13.2665       | 5.0   | 345  | 12.1278         |
-| 9.6075        | 6.0   | 414  | 8.5203          |
-| 6.7498        | 7.0   | 483  | 6.8251          |
-| 6.1128        | 8.0   | 552  | 6.3816          |
-| 5.5661        | 9.0   | 621  | 5.6632          |
-| 4.8771        | 10.0  | 690  | 5.3599          |
-| 4.6769        | 11.0  | 759  | 5.0607          |
-| 4.2186        | 12.0  | 828  | 4.7932          |
-| 3.9824        | 13.0  | 897  | 4.4486          |
-| 3.8089        | 14.0  | 966  | 4.2846          |
-| 3.6713        | 15.0  | 1035 | 4.1736          |
-| 3.4087        | 16.0  | 1104 | 4.1207          |
-| 3.4155        | 17.0  | 1173 | 4.0485          |
-| 3.2099        | 18.0  | 1242 | 4.0001          |
-| 3.2476        | 19.0  | 1311 | 3.9819          |
-| 3.2992        | 20.0  | 1380 | 3.9790          |
+| 208.1439      | 1.0   | 69   | 168.7479        |
+| 137.7666      | 2.0   | 138  | 104.4204        |
+| 64.4054       | 3.0   | 207  | 42.3502         |
+| 26.5661       | 4.0   | 276  | 19.2662         |
+| 14.7544       | 5.0   | 345  | 12.7249         |
+| 10.2813       | 6.0   | 414  | 8.5354          |
+| 6.9142        | 7.0   | 483  | 7.3827          |
+| 6.1554        | 8.0   | 552  | 6.4836          |
+| 5.3557        | 9.0   | 621  | 5.5994          |
+| 4.8551        | 10.0  | 690  | 5.4054          |
+| 4.7462        | 11.0  | 759  | 4.9582          |
+| 4.1657        | 12.0  | 828  | 4.7667          |
+| 4.0338        | 13.0  | 897  | 4.4520          |
+| 3.8436        | 14.0  | 966  | 4.2957          |
+| 3.6859        | 15.0  | 1035 | 4.2060          |
+| 3.4503        | 16.0  | 1104 | 4.0957          |
+| 3.4381        | 17.0  | 1173 | 4.0400          |
+| 3.2315        | 18.0  | 1242 | 4.0068          |
+| 3.2559        | 19.0  | 1311 | 3.9848          |
+| 3.3044        | 20.0  | 1380 | 3.9805          |
 ### Framework versions

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6e1af789ab67b927e9857138eaf00c67a67ee0caf68b7ce064d21d7bda7371e9
+oid sha256:9925d491ca799897f70b48245759259ad5d64b36d4b976076a474865f3e75683
 size 185517896

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:583ec0d9fd1108d00756e1c3b69ddbec684f2f03713c12642e1c41db6c04a6f0
+oid sha256:d72472d90ccb6353e373291e39e33b128c9244c2d501fc745a44c17b455cf604
 size 4792
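
Both binary files are stored as Git LFS pointers, so the diff shows only a changed SHA-256 `oid` while `size` stays the same. The following is a minimal sketch, not part of this commit, of how a downloaded file could be checked against such a pointer. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is whatever the reader has downloaded.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file (spec v1)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo!r}")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer: dict) -> bool:
    """Return True if the file at `path` has the size and SHA-256 the pointer records."""
    p = Path(path)
    if p.stat().st_size != pointer["size"]:
        return False  # cheap check first, avoids hashing a wrong-sized file
    h = hashlib.sha256()
    with p.open("rb") as f:
        # Hash in 1 MiB chunks so large weight files are not read into memory at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == pointer["digest"]


# The new-side pointer for model.safetensors, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9925d491ca799897f70b48245759259ad5d64b36d4b976076a474865f3e75683
size 185517896
"""
```

With a local copy of the weights, `matches_pointer("model.safetensors", parse_lfs_pointer(NEW_POINTER))` would indicate whether that copy corresponds to this commit rather than the previous one.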