ninagroot/Llama-360Mtest

Files changed (4)

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.3046
 ## Model description
@@ -33,33 +33,27 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 2
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 2
-- total_train_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 100
-- num_epochs: 12
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 6.0662        | 1.0   | 69   | 5.8781          |
-| 5.1419        | 2.0   | 138  | 4.9025          |
-| 4.1271        | 3.0   | 207  | 4.4935          |
-| 3.8908        | 4.0   | 276  | 4.3523          |
-| 3.5293        | 5.0   | 345  | 4.2722          |
-| 3.322         | 6.0   | 414  | 4.2443          |
-| 2.8975        | 7.0   | 483  | 4.2451          |
-| 2.6264        | 8.0   | 552  | 4.2609          |
-| 2.346         | 9.0   | 621  | 4.2915          |
-| 1.9401        | 10.0  | 690  | 4.2793          |
-| 1.7366        | 11.0  | 759  | 4.3004          |
-| 1.676         | 12.0  | 828  | 4.3046          |
 ### Framework versions

 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 4.1886
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
+- train_batch_size: 1
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 2
+- total_train_batch_size: 2
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 100
+- num_epochs: 6
 - mixed_precision_training: Native AMP
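Review note: the hyperparameter change above halves `train_batch_size` and `num_epochs` while leaving `gradient_accumulation_steps` at 2. The arithmetic can be sketched as below. The dataset size is not stated anywhere in this diff; `ASSUMED_NUM_EXAMPLES = 276` is a hypothetical value chosen only because it is consistent with the step counts in both results tables (69 and 138 steps per epoch). A single device and a simple ceiling division are also assumptions, not a statement of how the training library counts steps.

```python
import math

# Hypothetical: inferred from the step counts in the tables, not stated in the diff.
ASSUMED_NUM_EXAMPLES = 276


def total_train_batch_size(train_batch_size, gradient_accumulation_steps, num_devices=1):
    """Examples consumed per optimizer step (single device assumed by default)."""
    return train_batch_size * gradient_accumulation_steps * num_devices


def steps_per_epoch(num_examples, total_batch_size):
    """Optimizer steps per epoch, assuming a final partial batch still counts as a step."""
    return math.ceil(num_examples / total_batch_size)


old_total = total_train_batch_size(train_batch_size=2, gradient_accumulation_steps=2)
new_total = total_train_batch_size(train_batch_size=1, gradient_accumulation_steps=2)

old_steps = steps_per_epoch(ASSUMED_NUM_EXAMPLES, old_total) * 12  # old num_epochs
new_steps = steps_per_epoch(ASSUMED_NUM_EXAMPLES, new_total) * 6   # new num_epochs

print(old_total, new_total)  # 4 2
print(old_steps, new_steps)  # 828 828
```

Under these assumptions both runs perform the same number of optimizer steps (828, matching the final `Step` value in each table); the new run takes them with half as many examples per step and sees each example half as often.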
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
+| 5.9345        | 1.0   | 138  | 5.6878          |
+| 4.7674        | 2.0   | 276  | 4.7003          |
+| 3.6914        | 3.0   | 414  | 4.3374          |
+| 3.6076        | 4.0   | 552  | 4.2433          |
+| 3.3436        | 5.0   | 690  | 4.1851          |
+| 2.939         | 6.0   | 828  | 4.1886          |
 ### Framework versions
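Review note: in both results tables the reported final loss is the last epoch's validation loss, not the lowest one. A minimal sketch for locating the best epoch follows; the numbers are copied from the two tables in this diff, and the helper name `best_epoch` is hypothetical.

```python
# Validation loss per epoch, copied from the removed (old) and added (new) tables.
old_run = {
    1: 5.8781, 2: 4.9025, 3: 4.4935, 4: 4.3523, 5: 4.2722, 6: 4.2443,
    7: 4.2451, 8: 4.2609, 9: 4.2915, 10: 4.2793, 11: 4.3004, 12: 4.3046,
}
new_run = {1: 5.6878, 2: 4.7003, 3: 4.3374, 4: 4.2433, 5: 4.1851, 6: 4.1886}


def best_epoch(val_losses):
    """Return (epoch, loss) for the lowest validation loss in the mapping."""
    epoch = min(val_losses, key=val_losses.get)
    return epoch, val_losses[epoch]


print(best_epoch(old_run))  # (6, 4.2443)
print(best_epoch(new_run))  # (5, 4.1851)
```

In the old run the validation loss bottoms out at epoch 6 and drifts upward through epoch 12 while the training loss keeps falling, which is consistent with overfitting and may be why the epoch count was cut to 6. In the new run the minimum falls at epoch 5, slightly below the reported 4.1886.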

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4486339afec1cf614855ff7c7c981cd6604026e99f59f47a8c5557e39d874a23
 size 1344172280

 version https://git-lfs.github.com/spec/v1
+oid sha256:7b4bcbb5356dde8a15329ab11495572f2d66bb3dabc8f7a980ddf651d1e4cd90
 size 1344172280
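Review note: the diff above is between two Git LFS pointer files, not the weights themselves. The `oid` is the SHA-256 of the file contents, so a downloaded `model.safetensors` can be checked against the new pointer. A minimal sketch, assuming the pointer text is available as a string without the diff's leading `+`/`-` markers; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any library.

```python
import hashlib

# New-side pointer from this diff, with the diff markers stripped.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:7b4bcbb5356dde8a15329ab11495572f2d66bb3dabc8f7a980ddf651d1e4cd90
size 1344172280
"""


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of an LFS pointer into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Stream the file at `path` and compare its size and SHA-256 to the pointer."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])  # sha256 1344172280
```

Usage would look like `matches_pointer("model.safetensors", pointer)` on a local copy. The old and new pointers share the same size (1344172280 bytes) but differ in `oid`, which is consistent with retrained weights of unchanged shape.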

runs/Mar25_10-22-22_gcn31.local.snellius.surf.nl/events.out.tfevents.1711358551.gcn31.local.snellius.surf.nl.1845946.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:1d132b8483ea35d0f6081ace981d6bff4b3719e3ac003ae641a3396eb80bbce2
+size 12717

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:92d7d7ba48657e43574e3d1deedfdd10bdc5b34e7b372b3aec112b1d0ce75abd
 size 4728

 version https://git-lfs.github.com/spec/v1
+oid sha256:95beafee183dd8c0f8967da1e54d6cb38ab5fe23f886d744fcaba66038c60105
 size 4728