ninagroot/Llama-360Mtest

Files changed (5)

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.4045
+- Loss: 3.8269
 ## Model description
@@ -48,14 +48,14 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 7.4576        | 0.99  | 59   | 5.4054          |
-| 3.927         | 2.0   | 119  | 3.9632          |
-| 2.8202        | 2.99  | 178  | 3.6557          |
-| 1.9108        | 4.0   | 238  | 3.4939          |
-| 1.3277        | 4.99  | 297  | 3.4565          |
-| 0.8717        | 6.0   | 357  | 3.4181          |
-| 0.5598        | 6.99  | 416  | 3.4027          |
-| 0.4604        | 7.93  | 472  | 3.4045          |
+| 2.9655        | 1.0   | 254  | 3.7658          |
+| 1.7363        | 2.0   | 509  | 3.5948          |
+| 0.9884        | 3.0   | 763  | 3.6899          |
+| 0.6078        | 4.0   | 1018 | 3.7297          |
+| 0.3566        | 5.0   | 1272 | 3.7598          |
+| 0.2182        | 6.0   | 1527 | 3.8006          |
+| 0.1633        | 7.0   | 1781 | 3.8219          |
+| 0.1284        | 7.98  | 2032 | 3.8269          |
 ### Framework versions
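
The new results table can be read programmatically. The following is a minimal sketch, not part of the commit: it copies the added rows by hand into a list (the name `rows` and the tuple layout are assumptions made for illustration) and reports which logged epoch has the lowest validation loss and how far the final checkpoint sits above it.

```python
# (epoch, step, training_loss, validation_loss), copied by hand from the
# "+" rows of the README.md table in this diff. Nothing here is read from the repo.
rows = [
    (1.0, 254, 2.9655, 3.7658),
    (2.0, 509, 1.7363, 3.5948),
    (3.0, 763, 0.9884, 3.6899),
    (4.0, 1018, 0.6078, 3.7297),
    (5.0, 1272, 0.3566, 3.7598),
    (6.0, 1527, 0.2182, 3.8006),
    (7.0, 1781, 0.1633, 3.8219),
    (7.98, 2032, 0.1284, 3.8269),
]

# Row with the smallest validation loss, and the last logged row.
best = min(rows, key=lambda r: r[3])
final = rows[-1]

# Difference between the final and the lowest validation loss.
gap = final[3] - best[3]

print(f"lowest validation loss {best[3]} at epoch {best[0]} (step {best[1]})")
print(f"final validation loss {final[3]}, {gap:.4f} above the minimum")
```

The sketch only restates numbers already in the table; it makes no claim about which checkpoint the uploaded `model.safetensors` corresponds to.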

config.json CHANGED

@@ -10,7 +10,7 @@
   "hidden_size": 1024,
   "initializer_range": 0.02,
   "intermediate_size": 3072,
-  "max_position_embeddings": 256,
+  "max_position_embeddings": 60,
   "model_type": "llama",
   "num_attention_heads": 8,
   "num_hidden_layers": 24,

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b419164dc271226dca8b2eb7ddfda6791f97b0dad4f38b7505bd781adfee7dd9
+oid sha256:dd82abd3133a184ba4773b3c6a75792665145cf367d5308fd529d07d7da1e95c
 size 1570992472
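
The `model.safetensors` entry in this diff is a Git LFS pointer, not the weights themselves, so the only visible change is the `oid`. As a rough sketch (the helper `parse_lfs_pointer` and the variable names are hypothetical, not part of any library), the two pointer versions can be parsed and compared like this:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file ("key value" per line) into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid line has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the "-" and "+" sides of the diff.
old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:b419164dc271226dca8b2eb7ddfda6791f97b0dad4f38b7505bd781adfee7dd9
size 1570992472
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:dd82abd3133a184ba4773b3c6a75792665145cf367d5308fd529d07d7da1e95c
size 1570992472
""")

# A different digest with the same byte size means the file contents
# changed while the file length stayed the same.
weights_changed = old["digest"] != new["digest"]
same_size = old["size"] == new["size"]
print(weights_changed, same_size)
```

The same check applies to the `training_args.bin` pointer, whose size is also unchanged across the commit.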

runs/Apr02_15-38-36_gcn31.local.snellius.surf.nl/events.out.tfevents.1712065125.gcn31.local.snellius.surf.nl.1031690.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:8c018d3b09d6e0ad22d7c9740d885b02610070f3f51d08c8a1741bdc0717eb07
+size 28371

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2dff12963512f842bbd1b0d3dd9a75580e2b1d112c698330839007a214f43af6
+oid sha256:6e192133b205882b76e8d67aa6c28e4a7aece21d9df363db42b30a69b6cd47b7
 size 4984