ninagroot/Baby-Llama-58M-RUN1

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.9569
 ## Model description
@@ -46,26 +46,26 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 207.4139      | 1.0   | 69   | 168.7495        |
-| 140.1234      | 2.0   | 138  | 105.7544        |
-| 65.5354       | 3.0   | 207  | 45.8237         |
-| 25.9459       | 4.0   | 276  | 19.2743         |
-| 14.1729       | 5.0   | 345  | 11.7973         |
-| 9.9299        | 6.0   | 414  | 8.2180          |
-| 6.8093        | 7.0   | 483  | 6.8497          |
-| 6.1741        | 8.0   | 552  | 6.4197          |
-| 5.4877        | 9.0   | 621  | 5.6851          |
-| 4.7765        | 10.0  | 690  | 5.4365          |
-| 4.6208        | 11.0  | 759  | 5.0201          |
-| 4.146         | 12.0  | 828  | 4.8232          |
-| 3.9427        | 13.0  | 897  | 4.4196          |
-| 3.746         | 14.0  | 966  | 4.2562          |
-| 3.6516        | 15.0  | 1035 | 4.1581          |
-| 3.4029        | 16.0  | 1104 | 4.0782          |
-| 3.3875        | 17.0  | 1173 | 4.0212          |
-| 3.1863        | 18.0  | 1242 | 3.9801          |
-| 3.2367        | 19.0  | 1311 | 3.9602          |
-| 3.2766        | 20.0  | 1380 | 3.9569          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.9790
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 205.6138      | 1.0   | 69   | 165.0209        |
+| 140.5155      | 2.0   | 138  | 108.3043        |
+| 71.1999       | 3.0   | 207  | 48.5052         |
+| 26.2745       | 4.0   | 276  | 20.7425         |
+| 13.2665       | 5.0   | 345  | 12.1278         |
+| 9.6075        | 6.0   | 414  | 8.5203          |
+| 6.7498        | 7.0   | 483  | 6.8251          |
+| 6.1128        | 8.0   | 552  | 6.3816          |
+| 5.5661        | 9.0   | 621  | 5.6632          |
+| 4.8771        | 10.0  | 690  | 5.3599          |
+| 4.6769        | 11.0  | 759  | 5.0607          |
+| 4.2186        | 12.0  | 828  | 4.7932          |
+| 3.9824        | 13.0  | 897  | 4.4486          |
+| 3.8089        | 14.0  | 966  | 4.2846          |
+| 3.6713        | 15.0  | 1035 | 4.1736          |
+| 3.4087        | 16.0  | 1104 | 4.1207          |
+| 3.4155        | 17.0  | 1173 | 4.0485          |
+| 3.2099        | 18.0  | 1242 | 4.0001          |
+| 3.2476        | 19.0  | 1311 | 3.9819          |
+| 3.2992        | 20.0  | 1380 | 3.9790          |
 ### Framework versions
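
To put the change in the final evaluation loss in context, here is a minimal sketch that compares the epoch 20 validation losses from the two sides of the diff. It assumes the reported loss is a mean per-token cross-entropy in nats, which the model card does not state, so the perplexity figures are illustrative only. The variable names are hypothetical.

```python
import math

# Final (epoch 20) validation losses copied from the two tables in the diff.
old_final_loss = 3.9569  # removed side
new_final_loss = 3.9790  # added side

# Positive delta means the new run ends with a higher (worse) loss.
delta = new_final_loss - old_final_loss

# Assumption: loss is mean per-token cross-entropy in nats,
# in which case exp(loss) is the perplexity.
old_ppl = math.exp(old_final_loss)
new_ppl = math.exp(new_final_loss)

print(f"loss delta: {delta:+.4f}")
print(f"perplexity (under the stated assumption): {old_ppl:.2f} -> {new_ppl:.2f}")
```

The difference between the two runs is small relative to the epoch-to-epoch changes in either table, and the diff gives no information about why the runs differ.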

model.safetensors

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:30f1bd31251eacb162c62b163981446fb3b2d4eaaa6053668d1b918b7b5fa5a0
 size 185517896

 version https://git-lfs.github.com/spec/v1
+oid sha256:6e1af789ab67b927e9857138eaf00c67a67ee0caf68b7ce064d21d7bda7371e9
 size 185517896

training_args.bin

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8e1383968255addb16ba6b2f8442175b7cab5a71cfa956f933a3f957418af4f0
 size 4792

 version https://git-lfs.github.com/spec/v1
+oid sha256:583ec0d9fd1108d00756e1c3b69ddbec684f2f03713c12642e1c41db6c04a6f0
 size 4792
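
The two binary files are stored as Git LFS pointers, so the diff shows only a new `oid` with an unchanged `size`. As a rough sketch of how such a pointer could be checked against a downloaded file, the snippet below parses the three-line pointer format shown above and compares the hash and size. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of Git LFS or any Hugging Face library.

```python
import hashlib
from pathlib import Path

# Pointer text copied from the added side of the training_args.bin diff.
NEW_TRAINING_ARGS_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:583ec0d9fd1108d00756e1c3b69ddbec684f2f03713c12642e1c41db6c04a6f0
size 4792
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and hash."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded at once.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_TRAINING_ARGS_POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the file, `matches_pointer(Path("training_args.bin"), pointer)` would report whether it corresponds to this commit; the path here is only an example.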