ninagroot/Baby-Llama-58M-RUN1

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.9610
 ## Model description
@@ -46,26 +46,26 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 206.943       | 1.0   | 69   | 166.3778        |
-| 138.4555      | 2.0   | 138  | 104.0753        |
-| 66.034        | 3.0   | 207  | 43.8374         |
-| 27.1436       | 4.0   | 276  | 20.7827         |
-| 13.8693       | 5.0   | 345  | 11.6380         |
-| 9.7743        | 6.0   | 414  | 8.3680          |
-| 6.9283        | 7.0   | 483  | 7.0591          |
-| 6.0877        | 8.0   | 552  | 6.2705          |
-| 5.4184        | 9.0   | 621  | 5.6714          |
-| 4.759         | 10.0  | 690  | 5.4118          |
-| 4.7849        | 11.0  | 759  | 4.9545          |
-| 4.1408        | 12.0  | 828  | 4.7373          |
-| 3.9824        | 13.0  | 897  | 4.4636          |
-| 3.7767        | 14.0  | 966  | 4.2870          |
-| 3.6403        | 15.0  | 1035 | 4.1945          |
-| 3.4008        | 16.0  | 1104 | 4.0799          |
-| 3.4039        | 17.0  | 1173 | 4.0271          |
-| 3.1972        | 18.0  | 1242 | 3.9771          |
-| 3.2312        | 19.0  | 1311 | 3.9653          |
-| 3.2737        | 20.0  | 1380 | 3.9610          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.9707
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 205.7347      | 1.0   | 69   | 164.9866        |
+| 140.7988      | 2.0   | 138  | 105.9197        |
+| 69.569        | 3.0   | 207  | 46.6930         |
+| 28.052        | 4.0   | 276  | 19.8943         |
+| 14.7501       | 5.0   | 345  | 11.8347         |
+| 10.0078       | 6.0   | 414  | 8.8358          |
+| 6.8621        | 7.0   | 483  | 6.8726          |
+| 6.2461        | 8.0   | 552  | 6.4684          |
+| 5.4379        | 9.0   | 621  | 5.6002          |
+| 4.8584        | 10.0  | 690  | 5.3592          |
+| 4.652         | 11.0  | 759  | 5.0464          |
+| 4.2405        | 12.0  | 828  | 4.6742          |
+| 3.9809        | 13.0  | 897  | 4.3925          |
+| 3.7987        | 14.0  | 966  | 4.2740          |
+| 3.6593        | 15.0  | 1035 | 4.1871          |
+| 3.4527        | 16.0  | 1104 | 4.1033          |
+| 3.4028        | 17.0  | 1173 | 4.0354          |
+| 3.2057        | 18.0  | 1242 | 3.9949          |
+| 3.2595        | 19.0  | 1311 | 3.9728          |
+| 3.2917        | 20.0  | 1380 | 3.9707          |
 ### Framework versions
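
Review note: the two training runs in this hunk differ only slightly, and the final evaluation loss moves from 3.9610 to 3.9707. A minimal sketch for comparing the per-epoch validation loss of the removed and added table rows follows. The helper name `parse_rows` and the `sample` string are illustrative assumptions, not part of the repository. The sample holds only the last two epochs copied from the hunk above.

```python
def parse_rows(diff_text, sign):
    """Return {epoch: validation_loss} for table rows prefixed with `sign`."""
    rows = {}
    for line in diff_text.splitlines():
        if not line.startswith(sign + "|"):
            continue  # not a removed/added table row
        cells = [c.strip() for c in line[1:].strip().strip("|").split("|")]
        try:
            epoch, val_loss = float(cells[1]), float(cells[3])
        except (IndexError, ValueError):
            continue  # skip header or separator rows
        rows[epoch] = val_loss
    return rows


# Last two epochs of each run, copied from the diff (hypothetical sample input).
sample = """\
-| 3.2312        | 19.0  | 1311 | 3.9653          |
-| 3.2737        | 20.0  | 1380 | 3.9610          |
+| 3.2595        | 19.0  | 1311 | 3.9728          |
+| 3.2917        | 20.0  | 1380 | 3.9707          |
"""

old = parse_rows(sample, "-")
new = parse_rows(sample, "+")

# Positive delta means the new run has a higher validation loss at that epoch.
deltas = {epoch: round(new[epoch] - old[epoch], 4) for epoch in old if epoch in new}
for epoch, delta in sorted(deltas.items()):
    print(f"epoch {epoch:g}: validation loss changed by {delta:+.4f}")
```

Feeding the full hunk text into `parse_rows` instead of `sample` would give the comparison for all 20 epochs.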

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:01eb3ea6123ab73ab833a043e5bbd18b75537f98fbf3b8199e01af9c7ba5b5d9
 size 185517896

 version https://git-lfs.github.com/spec/v1
+oid sha256:7e635cde993d1b74d9570f588cc5fb277b168d39bbca59069c890601007f494c
 size 185517896

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:3db52b5e60d07faf1e8ade2c260e9b62dbf50e2875c44732e551e5a9fe8c58ac
 size 4792

 version https://git-lfs.github.com/spec/v1
+oid sha256:4e2f5447f468a9cd43798831fadb6635e45910c2eb3f636c4e6469fef24b0e91
 size 4792
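
Review note: both binary files are stored as Git LFS pointers, so the diff only shows a changed `oid sha256:` line while `size` stays the same. A minimal sketch for checking a downloaded file against such a pointer follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, and the sketch assumes the pointer uses the `sha256` oid type shown above.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.strip().partition(" ")
        if key:
            fields[key] = value
    return fields


def matches_pointer(path, pointer_text):
    """Return True if the file at `path` has the size and SHA-256 named in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algo, _, expected = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")

    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in 1 MiB chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)

    return size == int(fields["size"]) and digest.hexdigest() == expected
```

Calling `matches_pointer("model.safetensors", pointer_text)` with the new pointer contents would confirm that a local copy corresponds to this commit rather than the previous one, assuming the file was downloaded to that path.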