ninagroot/Llama-360Mtest

Files changed (4)

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 8.3245
 ## Model description
@@ -33,14 +33,14 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 16
 - eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 8
-- total_train_batch_size: 128
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 300
 - num_epochs: 4
 - mixed_precision_training: Native AMP
@@ -48,10 +48,10 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 0.89  | 2    | 8.5737          |
-| No log        | 1.78  | 4    | 8.5252          |
-| No log        | 2.67  | 6    | 8.4412          |
-| No log        | 3.56  | 8    | 8.3245          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.2830
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
+- train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
+- lr_scheduler_warmup_steps: 100
 - num_epochs: 4
 - mixed_precision_training: Native AMP
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 6.0238        | 1.0   | 69   | 5.9202          |
+| 5.2632        | 2.0   | 138  | 4.8940          |
+| 4.012         | 3.0   | 207  | 4.3873          |
+| 3.7681        | 4.0   | 276  | 4.2830          |
 ### Framework versions
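
The `total_train_batch_size` values on both sides of this diff follow from the other two batch settings. A minimal sketch of that arithmetic, assuming a single training device (the helper name and the `num_devices` parameter are illustrative and do not appear in the model card):

```python
def total_train_batch_size(train_batch_size, gradient_accumulation_steps, num_devices=1):
    """Samples consumed per optimizer step.

    num_devices=1 is an assumption; the model card does not state the device count.
    """
    return train_batch_size * gradient_accumulation_steps * num_devices


# Values from the removed (-) side of the diff.
old_total = total_train_batch_size(train_batch_size=16, gradient_accumulation_steps=8)

# Values from the added (+) side of the diff.
new_total = total_train_batch_size(train_batch_size=2, gradient_accumulation_steps=2)

print(old_total, new_total)
```

Under that assumption the results match the card (128 before, 4 after). The smaller effective batch is also consistent with the step counts in the results table rising from 2 per evaluation to 69 per epoch.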

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c4e7e2fdbdb2c9d26bec5fe70b1e741eb30888fa00a93c82d8fd5fdbeb7c94a1
 size 1344172280

 version https://git-lfs.github.com/spec/v1
+oid sha256:17e02de6cf127178a8946722392615d442e0df7968a1e9ab13a0d4d88d6e43af
 size 1344172280

runs/Mar22_11-50-54_gcn28.local.snellius.surf.nl/events.out.tfevents.1711104666.gcn28.local.snellius.surf.nl.3128590.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:1d9334804460e1e285b9578397974be082570a28417e61320192816c23547e35
+size 7774

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e3179a2a3fe68b86253b0ba9c42f796efa0b7ead1164a0e56535abe8e14039e7
 size 4728

 version https://git-lfs.github.com/spec/v1
+oid sha256:73705c12fe404124335ca72adf269875cc3fb90a1d3dea5008ffa660c0f4f512
 size 4728
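
The binary files in this commit appear only as Git LFS pointers (a `version` line, an `oid` with its hash algorithm, and a `size` in bytes), so an unchanged `size` with a new `oid` means the contents changed while the byte length stayed the same. The sketch below shows one way a downloaded file could be checked against such a pointer; the function names are hypothetical and this is not part of the repository.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and hash the pointer records."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# The added (+) pointer for model.safetensors, copied from the diff above.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:17e02de6cf127178a8946722392615d442e0df7968a1e9ab13a0d4d88d6e43af
size 1344172280
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

Calling `matches_pointer("model.safetensors", pointer)` on a local download (the path is an assumption) would then confirm that the file corresponds to this commit rather than the previous one.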