ninagroot/Llama-360Mtest

Files changed (7)

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.1915
 ## Model description
@@ -33,31 +33,25 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 1
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 8
-- total_train_batch_size: 8
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 300
-- num_epochs: 10
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 7.042         | 0.99  | 44   | 6.5393          |
-| 5.6951        | 1.99  | 88   | 5.4631          |
-| 4.7481        | 2.98  | 132  | 4.6474          |
-| 4.1761        | 4.0   | 177  | 4.4405          |
-| 3.6565        | 4.99  | 221  | 4.3291          |
-| 3.5648        | 5.99  | 265  | 4.2850          |
-| 3.3644        | 6.98  | 309  | 4.2276          |
-| 3.1299        | 8.0   | 354  | 4.2050          |
-| 2.5705        | 8.99  | 398  | 4.2062          |
-| 2.1843        | 9.94  | 440  | 4.1915          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 8.3245
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
+- train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 8
+- total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 300
+- num_epochs: 4
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| No log        | 0.89  | 2    | 8.5737          |
+| No log        | 1.78  | 4    | 8.5252          |
+| No log        | 2.67  | 6    | 8.4412          |
+| No log        | 3.56  | 8    | 8.3245          |
 ### Framework versions
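
The `total_train_batch_size` values on both sides of the README diff are consistent with the per-device batch size multiplied by the gradient accumulation steps. A minimal sketch of that arithmetic follows; the function name and the `num_devices` parameter are hypothetical, and a single training device is assumed because the README does not state the device count.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step.

    num_devices=1 is an assumption; the model card does not report it.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the removed (-) and added (+) lines of the README diff.
old_total = effective_batch_size(per_device_batch_size=1, gradient_accumulation_steps=8)
new_total = effective_batch_size(per_device_batch_size=16, gradient_accumulation_steps=8)

print(old_total, new_total)  # 8 128
```

Both results match the `total_train_batch_size` entries reported in the diff (8 before, 128 after).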

config.json CHANGED

@@ -10,7 +10,7 @@
   "hidden_size": 1024,
   "initializer_range": 0.02,
   "intermediate_size": 3072,
-  "max_position_embeddings": 200,
   "model_type": "llama",
   "num_attention_heads": 8,
   "num_hidden_layers": 24,

   "hidden_size": 1024,
   "initializer_range": 0.02,
   "intermediate_size": 3072,
+  "max_position_embeddings": 256,
   "model_type": "llama",
   "num_attention_heads": 8,
   "num_hidden_layers": 24,

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:75005945ce406a49db1b8b4b9aef520f33dd864987fe79a782192a532bf8d76b
 size 1344172280

 version https://git-lfs.github.com/spec/v1
+oid sha256:c4e7e2fdbdb2c9d26bec5fe70b1e741eb30888fa00a93c82d8fd5fdbeb7c94a1
 size 1344172280
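
The `model.safetensors` entry is a Git LFS pointer file rather than the weights themselves: a few `key value` lines giving the spec version, the content hash, and the size in bytes. The sketch below parses the old and new pointers from this diff and compares them. The helper name `parse_lfs_pointer` is hypothetical, and the parser handles only the three keys that appear here.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "oid_algo": algo,
        "oid": digest,
        "size": int(fields["size"]),
    }


old_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:75005945ce406a49db1b8b4b9aef520f33dd864987fe79a782192a532bf8d76b
size 1344172280
""")
new_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:c4e7e2fdbdb2c9d26bec5fe70b1e741eb30888fa00a93c82d8fd5fdbeb7c94a1
size 1344172280
""")

content_changed = old_pointer["oid"] != new_pointer["oid"]
size_unchanged = old_pointer["size"] == new_pointer["size"]
print(content_changed, size_unchanged)  # True True
```

A changed hash with an identical byte size is what one would expect when the weights are retrained but the tensor shapes stay the same, although the pointer alone does not prove that.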

runs/Mar22_11-42-50_gcn28.local.snellius.surf.nl/events.out.tfevents.1711104184.gcn28.local.snellius.surf.nl.3104221.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:d9b34146abf395d823ae4cda1a6ec400ec4651ac7209ac60ffa9413f01c06747
+size 5731

runs/Mar22_11-44-51_gcn14.local.snellius.surf.nl/events.out.tfevents.1711104302.gcn14.local.snellius.surf.nl.856670.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:578eb156726fc0e7e7211b9c6e5b6e5a6ba390bfdcc60d66a2fc502d9fc13e17
+size 5731

tokenizer_config.json CHANGED

@@ -37,7 +37,7 @@
   "bos_token": "<s>",
   "clean_up_tokenization_spaces": true,
   "eos_token": "</s>",
-  "model_max_length": 100,
   "pad_token": "<pad>",
   "tokenizer_class": "GPT2Tokenizer",
   "unk_token": "<|endoftext|>"

   "bos_token": "<s>",
   "clean_up_tokenization_spaces": true,
   "eos_token": "</s>",
+  "model_max_length": 128,
   "pad_token": "<pad>",
   "tokenizer_class": "GPT2Tokenizer",
   "unk_token": "<|endoftext|>"
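
This commit raises the tokenizer's `model_max_length` (100 to 128) together with the model's `max_position_embeddings` (200 to 256, in `config.json`). A common sanity check, sketched below with plain dicts, is that the tokenizer never produces sequences longer than the model's position limit. The function name is hypothetical, and the check reflects a general convention rather than anything this repository states.

```python
def tokenizer_fits_model(model_config: dict, tokenizer_config: dict) -> bool:
    """True if the tokenizer's maximum length is within the model's position limit."""
    return tokenizer_config["model_max_length"] <= model_config["max_position_embeddings"]


# Only the fields changed in this commit are reproduced here.
old_ok = tokenizer_fits_model({"max_position_embeddings": 200}, {"model_max_length": 100})
new_ok = tokenizer_fits_model({"max_position_embeddings": 256}, {"model_max_length": 128})

print(old_ok, new_ok)  # True True
```

In both the old and new configurations the tokenizer limit is half the model's position limit, so the check passes either way.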

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a8087606d28ebab0c1abebab447a0ae1e9c17fda4ef368b96a83e8f2e6950e66
 size 4728

 version https://git-lfs.github.com/spec/v1
+oid sha256:e3179a2a3fe68b86253b0ba9c42f796efa0b7ead1164a0e56535abe8e14039e7
 size 4728