ninagroot/Baby-Llama-58M-RUN1

@@ -4,7 +4,6 @@ tags:
 model-index:
 - name: Baby-Llama-58M
   results: []
-pipeline_tag: text-generation
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -14,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.9707
 ## Model description
@@ -47,26 +46,26 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 205.7347      | 1.0   | 69   | 164.9866        |
-| 140.7988      | 2.0   | 138  | 105.9197        |
-| 69.569        | 3.0   | 207  | 46.6930         |
-| 28.052        | 4.0   | 276  | 19.8943         |
-| 14.7501       | 5.0   | 345  | 11.8347         |
-| 10.0078       | 6.0   | 414  | 8.8358          |
-| 6.8621        | 7.0   | 483  | 6.8726          |
-| 6.2461        | 8.0   | 552  | 6.4684          |
-| 5.4379        | 9.0   | 621  | 5.6002          |
-| 4.8584        | 10.0  | 690  | 5.3592          |
-| 4.652         | 11.0  | 759  | 5.0464          |
-| 4.2405        | 12.0  | 828  | 4.6742          |
-| 3.9809        | 13.0  | 897  | 4.3925          |
-| 3.7987        | 14.0  | 966  | 4.2740          |
-| 3.6593        | 15.0  | 1035 | 4.1871          |
-| 3.4527        | 16.0  | 1104 | 4.1033          |
-| 3.4028        | 17.0  | 1173 | 4.0354          |
-| 3.2057        | 18.0  | 1242 | 3.9949          |
-| 3.2595        | 19.0  | 1311 | 3.9728          |
-| 3.2917        | 20.0  | 1380 | 3.9707          |
 ### Framework versions
@@ -74,4 +73,4 @@ The following hyperparameters were used during training:
 - Transformers 4.37.2
 - Pytorch 2.1.2+cu121
 - Datasets 2.16.1
-- Tokenizers 0.15.0

 model-index:
 - name: Baby-Llama-58M
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.9569
 ## Model description
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 207.4139      | 1.0   | 69   | 168.7495        |
+| 140.1234      | 2.0   | 138  | 105.7544        |
+| 65.5354       | 3.0   | 207  | 45.8237         |
+| 25.9459       | 4.0   | 276  | 19.2743         |
+| 14.1729       | 5.0   | 345  | 11.7973         |
+| 9.9299        | 6.0   | 414  | 8.2180          |
+| 6.8093        | 7.0   | 483  | 6.8497          |
+| 6.1741        | 8.0   | 552  | 6.4197          |
+| 5.4877        | 9.0   | 621  | 5.6851          |
+| 4.7765        | 10.0  | 690  | 5.4365          |
+| 4.6208        | 11.0  | 759  | 5.0201          |
+| 4.146         | 12.0  | 828  | 4.8232          |
+| 3.9427        | 13.0  | 897  | 4.4196          |
+| 3.746         | 14.0  | 966  | 4.2562          |
+| 3.6516        | 15.0  | 1035 | 4.1581          |
+| 3.4029        | 16.0  | 1104 | 4.0782          |
+| 3.3875        | 17.0  | 1173 | 4.0212          |
+| 3.1863        | 18.0  | 1242 | 3.9801          |
+| 3.2367        | 19.0  | 1311 | 3.9602          |
+| 3.2766        | 20.0  | 1380 | 3.9569          |
 ### Framework versions
 - Transformers 4.37.2
 - Pytorch 2.1.2+cu121
 - Datasets 2.16.1
+- Tokenizers 0.15.0
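
To put the change in final evaluation loss (3.9707 to 3.9569) in perspective, the loss can be converted to perplexity. This is only a sketch under an assumption the card does not confirm: that the reported loss is a mean per-token cross-entropy in nats. The early-epoch values above 100 suggest the logged quantity may include other terms, so treat the numbers as illustrative.

```python
import math

# Final evaluation losses copied from the old and new versions of the card.
old_loss = 3.9707
new_loss = 3.9569


def perplexity(loss: float) -> float:
    """Perplexity for a mean per-token cross-entropy given in nats (assumed)."""
    return math.exp(loss)


old_ppl = perplexity(old_loss)
new_ppl = perplexity(new_loss)

# Relative change in perplexity between the two runs.
relative_change = (new_ppl - old_ppl) / old_ppl

print(f"old perplexity: {old_ppl:.2f}")
print(f"new perplexity: {new_ppl:.2f}")
print(f"relative change: {relative_change:+.2%}")
```

Because perplexity is the exponential of the loss, the relative change depends only on the loss difference, not on the absolute values.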

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7e635cde993d1b74d9570f588cc5fb277b168d39bbca59069c890601007f494c
 size 185517896

 version https://git-lfs.github.com/spec/v1
+oid sha256:30f1bd31251eacb162c62b163981446fb3b2d4eaaa6053668d1b918b7b5fa5a0
 size 185517896

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4e2f5447f468a9cd43798831fadb6635e45910c2eb3f636c4e6469fef24b0e91
 size 4792

 version https://git-lfs.github.com/spec/v1
+oid sha256:8e1383968255addb16ba6b2f8442175b7cab5a71cfa956f933a3f957418af4f0
 size 4792