ninagroot/Llama-360Mtest

Files changed (11)

README.md +10 -42
added_tokens.json +3 -0
merges.txt +0 -0
model.safetensors +1 -1
runs/Apr18_11-29-14_gcn64.local.snellius.surf.nl/events.out.tfevents.1713432565.gcn64.local.snellius.surf.nl.3701576.0 +3 -0
runs/Apr18_11-29-42_gcn29.local.snellius.surf.nl/events.out.tfevents.1713432590.gcn29.local.snellius.surf.nl.424529.0 +3 -0
special_tokens_map.json +6 -0
tokenizer.json +0 -0
tokenizer_config.json +44 -0
training_args.bin +1 -1
vocab.json +0 -0

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 5.6562
+- Loss: 5.1779
 ## Model description
@@ -41,53 +41,21 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 50
-- num_epochs: 40
+- num_epochs: 8
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 8.3864        | 0.98  | 7    | 8.4700          |
-| 7.2228        | 1.96  | 14   | 7.6525          |
-| 6.2754        | 2.95  | 21   | 6.9672          |
-| 5.444         | 3.93  | 28   | 6.3528          |
-| 4.6376        | 4.91  | 35   | 5.8872          |
-| 3.7271        | 5.89  | 42   | 5.4730          |
-| 3.211         | 6.88  | 49   | 5.2839          |
-| 2.563         | 8.0   | 57   | 5.1826          |
-| 1.9961        | 8.98  | 64   | 5.1621          |
-| 1.4468        | 9.96  | 71   | 5.2455          |
-| 1.0269        | 10.95 | 78   | 5.3081          |
-| 0.7106        | 11.93 | 85   | 5.2484          |
-| 0.4967        | 12.91 | 92   | 5.3469          |
-| 0.3478        | 13.89 | 99   | 5.3402          |
-| 0.2494        | 14.88 | 106  | 5.4144          |
-| 0.1696        | 16.0  | 114  | 5.4190          |
-| 0.1245        | 16.98 | 121  | 5.4780          |
-| 0.0799        | 17.96 | 128  | 5.5194          |
-| 0.0618        | 18.95 | 135  | 5.5302          |
-| 0.0375        | 19.93 | 142  | 5.5205          |
-| 0.032         | 20.91 | 149  | 5.5534          |
-| 0.0275        | 21.89 | 156  | 5.5555          |
-| 0.0218        | 22.88 | 163  | 5.6052          |
-| 0.0196        | 24.0  | 171  | 5.6138          |
-| 0.0203        | 24.98 | 178  | 5.6179          |
-| 0.018         | 25.96 | 185  | 5.6200          |
-| 0.0189        | 26.95 | 192  | 5.6299          |
-| 0.0181        | 27.93 | 199  | 5.6347          |
-| 0.016         | 28.91 | 206  | 5.6402          |
-| 0.018         | 29.89 | 213  | 5.6432          |
-| 0.016         | 30.88 | 220  | 5.6474          |
-| 0.0166        | 32.0  | 228  | 5.6500          |
-| 0.0169        | 32.98 | 235  | 5.6515          |
-| 0.0166        | 33.96 | 242  | 5.6531          |
-| 0.0159        | 34.95 | 249  | 5.6547          |
-| 0.0164        | 35.93 | 256  | 5.6556          |
-| 0.0159        | 36.91 | 263  | 5.6561          |
-| 0.0144        | 37.89 | 270  | 5.6562          |
-| 0.0142        | 38.88 | 277  | 5.6562          |
-| 0.016         | 39.3  | 280  | 5.6562          |
+| 8.4314        | 0.98  | 7    | 8.5119          |
+| 7.2356        | 1.96  | 14   | 7.6465          |
+| 6.2886        | 2.95  | 21   | 6.9648          |
+| 5.4784        | 3.93  | 28   | 6.3689          |
+| 4.6903        | 4.91  | 35   | 5.8731          |
+| 3.7605        | 5.89  | 42   | 5.4457          |
+| 3.1642        | 6.88  | 49   | 5.2128          |
+| 2.5642        | 7.86  | 56   | 5.1779          |
 ### Framework versions
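
In the removed 40-epoch table, validation loss bottoms out near epoch 9 and then climbs while training loss keeps falling, which is consistent with the cut from 40 epochs to 8. As a minimal sketch of how that turning point can be read off the logged values (the helper name `best_checkpoint` is hypothetical and not part of this commit; the list holds a subset of rows copied from the removed table):

```python
# (epoch, validation_loss) pairs copied from the removed 40-epoch table.
# Only a subset of rows is listed; the last entry is the final logged row.
old_run = [
    (0.98, 8.4700), (1.96, 7.6525), (2.95, 6.9672), (3.93, 6.3528),
    (4.91, 5.8872), (5.89, 5.4730), (6.88, 5.2839), (8.0, 5.1826),
    (8.98, 5.1621), (9.96, 5.2455), (10.95, 5.3081), (39.3, 5.6562),
]


def best_checkpoint(history):
    """Return the (epoch, validation_loss) pair with the lowest validation loss."""
    return min(history, key=lambda pair: pair[1])


best_epoch, best_loss = best_checkpoint(old_run)
print(f"lowest validation loss {best_loss:.4f} at epoch {best_epoch}")
```

The new 8-epoch run ends at a validation loss of 5.1779, close to the old run's minimum of 5.1621 and well below its final 5.6562.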

added_tokens.json ADDED

	@@ -0,0 +1,3 @@

+{
+  "<|endoftext|>": 12198
+}

merges.txt ADDED

The diff for this file is too large to render.

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5b95ad33b4c61898cfeefbb8d0b7ca9fc5382ef0d8973ded4d712bd15e7caa49
+oid sha256:01d6d12c843483a0a4db6355a59e50ded6fe4efdd47de1ec7ef6bc007e12c82f
 size 1408774432
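
The `model.safetensors` diff touches only a Git LFS pointer file: the `oid` changes while `size` stays the same, so the weights were replaced by a file of identical byte length. A minimal sketch of comparing two such pointers (the function name `parse_lfs_pointer` is an assumption for illustration, not a Git LFS API):

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is a byte count
    return fields


# Pointer contents copied from the old and new sides of the diff.
old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:5b95ad33b4c61898cfeefbb8d0b7ca9fc5382ef0d8973ded4d712bd15e7caa49
size 1408774432
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:01d6d12c843483a0a4db6355a59e50ded6fe4efdd47de1ec7ef6bc007e12c82f
size 1408774432
""")

content_changed = old_pointer["oid"] != new_pointer["oid"]
same_size = old_pointer["size"] == new_pointer["size"]
print(f"content changed: {content_changed}, same size: {same_size}")
```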

runs/Apr18_11-29-14_gcn64.local.snellius.surf.nl/events.out.tfevents.1713432565.gcn64.local.snellius.surf.nl.3701576.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:3ff796cf33b2d0dd70fd49288da73f9a8131df27463fb5d065d7c7fa8905c1f7
+size 18629

runs/Apr18_11-29-42_gcn29.local.snellius.surf.nl/events.out.tfevents.1713432590.gcn29.local.snellius.surf.nl.424529.0 ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:570b9f2dfcc75adfe9c8b38453fe7cf0d3bc6960463b4a61c8dd191aa5a9e58f
+size 6276

special_tokens_map.json ADDED

	@@ -0,0 +1,6 @@

+{
+  "bos_token": "<s>",
+  "eos_token": "</s>",
+  "pad_token": "<pad>",
+  "unk_token": "<|endoftext|>"
+}

tokenizer.json ADDED

The diff for this file is too large to render.

tokenizer_config.json ADDED

	@@ -0,0 +1,44 @@

+{
+  "add_prefix_space": false,
+  "added_tokens_decoder": {
+    "0": {
+      "content": "<pad>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "1": {
+      "content": "<s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "2": {
+      "content": "</s>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    },
+    "12198": {
+      "content": "<|endoftext|>",
+      "lstrip": false,
+      "normalized": false,
+      "rstrip": false,
+      "single_word": false,
+      "special": true
+    }
+  },
+  "bos_token": "<s>",
+  "clean_up_tokenization_spaces": true,
+  "eos_token": "</s>",
+  "model_max_length": 128,
+  "pad_token": "<pad>",
+  "tokenizer_class": "GPT2Tokenizer",
+  "unk_token": "<|endoftext|>"
+}
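
The special tokens named in `special_tokens_map.json` should each have an entry in `added_tokens_decoder`. The sketch below cross-checks the two files using values copied from this commit. The helper `special_token_ids` is hypothetical, and only the `content` field of each decoder entry is kept:

```python
# Values copied from special_tokens_map.json in this commit.
special_tokens_map = {
    "bos_token": "<s>",
    "eos_token": "</s>",
    "pad_token": "<pad>",
    "unk_token": "<|endoftext|>",
}

# id -> "content" field, copied from added_tokens_decoder in tokenizer_config.json.
added_tokens_decoder = {
    "0": "<pad>",
    "1": "<s>",
    "2": "</s>",
    "12198": "<|endoftext|>",
}


def special_token_ids(token_map, decoder):
    """Map each special-token role to its id; raises KeyError if a token is missing."""
    content_to_id = {content: int(idx) for idx, content in decoder.items()}
    return {role: content_to_id[token] for role, token in token_map.items()}


ids = special_token_ids(special_tokens_map, added_tokens_decoder)
print(ids)
```

Here `unk_token` resolves to id 12198, the same id that `added_tokens.json` assigns to `<|endoftext|>`.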

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b13f80659633a7cbcc67700eb6a8ea06b718482efadb08dd4e23b49b007f61b6
+oid sha256:48cba8e54c48f085335f15c64978dc87d647a46edb2b350432a13b072d3fcc1e
 size 4984

vocab.json ADDED

The diff for this file is too large to render.