ninagroot/Llama-360Mtest

Files changed (6)

README.md CHANGED

@@ -13,7 +13,7 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.1886
+- Loss: 4.1441
 ## Model description
@@ -41,24 +41,23 @@ The following hyperparameters were used during training:
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 100
-- num_epochs: 6
+- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 5.9345        | 1.0   | 138  | 5.6878          |
-| 4.7674        | 2.0   | 276  | 4.7003          |
-| 3.6914        | 3.0   | 414  | 4.3374          |
-| 3.6076        | 4.0   | 552  | 4.2433          |
-| 3.3436        | 5.0   | 690  | 4.1851          |
-| 2.939         | 6.0   | 828  | 4.1886          |
+| 5.7623        | 1.0   | 145  | 5.6148          |
+| 4.6318        | 2.0   | 290  | 4.6321          |
+| 3.8186        | 3.0   | 435  | 4.2714          |
+| 3.447         | 4.0   | 580  | 4.1596          |
+| 3.2664        | 5.0   | 725  | 4.1441          |
 ### Framework versions
-- Transformers 4.37.2
+- Transformers 4.39.1
 - Pytorch 2.1.2+cu121
 - Datasets 2.16.1
 - Tokenizers 0.15.0
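
To put the change in evaluation loss in more familiar terms, the sketch below converts the two reported losses to perplexity. It assumes the reported loss is a mean per-token cross-entropy in nats, which the model card does not state explicitly; the function name `perplexity` is only illustrative.

```python
import math


def perplexity(mean_loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_loss_nats)


# Evaluation losses taken from the README diff above.
old_loss = 4.1886  # previous run, 6 epochs
new_loss = 4.1441  # this run, 5 epochs

print(f"old perplexity: {perplexity(old_loss):.2f}")
print(f"new perplexity: {perplexity(new_loss):.2f}")
```

Because the vocabulary size also changes in this commit, the two perplexities are measured over different tokenizations and are not strictly comparable.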

config.json CHANGED

@@ -22,7 +22,7 @@
   "rope_theta": 10000.0,
   "tie_word_embeddings": false,
   "torch_dtype": "float32",
-  "transformers_version": "4.37.2",
+  "transformers_version": "4.39.1",
   "use_cache": true,
-  "vocab_size": 4312
+  "vocab_size": 4425
 }

generation_config.json CHANGED

@@ -3,5 +3,5 @@
   "bos_token_id": 1,
   "eos_token_id": 2,
   "pad_token_id": 0,
-  "transformers_version": "4.37.2"
+  "transformers_version": "4.39.1"
 }

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7b4bcbb5356dde8a15329ab11495572f2d66bb3dabc8f7a980ddf651d1e4cd90
-size 1344172280
+oid sha256:9b0b6f38c404c54d4e5c7875184e8efb93d17e41df03afd29e8128cbb901267c
+size 1345097976
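
As a sanity check, the growth in `model.safetensors` can be compared against the `vocab_size` change in `config.json`. The sketch below is a rough consistency check, not something taken from the repository: it assumes the only tensors that changed shape are the input embedding and the output projection (two separate matrices, since `tie_word_embeddings` is false), that weights are stored as 4-byte float32, and that the safetensors header length did not change. The hidden size it prints is inferred under those assumptions and does not appear in the diff.

```python
# Values taken from the config.json and model.safetensors diffs.
OLD_SIZE_BYTES = 1344172280
NEW_SIZE_BYTES = 1345097976
OLD_VOCAB = 4312
NEW_VOCAB = 4425

BYTES_PER_PARAM = 4   # assumption: float32 weights ("torch_dtype": "float32")
VOCAB_MATRICES = 2    # assumption: untied embedding + output projection


def implied_hidden_size() -> int:
    """Infer the hidden size from the file-size delta per added vocab row."""
    delta_bytes = NEW_SIZE_BYTES - OLD_SIZE_BYTES
    delta_rows = NEW_VOCAB - OLD_VOCAB
    bytes_per_row, remainder = divmod(delta_bytes, delta_rows)
    if remainder != 0:
        raise ValueError("size delta is not a whole number of vocab rows")
    hidden, remainder = divmod(bytes_per_row, BYTES_PER_PARAM * VOCAB_MATRICES)
    if remainder != 0:
        raise ValueError("row size does not match the assumed dtype/matrix count")
    return hidden


print(implied_hidden_size())
```

If the division comes out exact, the size increase is fully accounted for by the 113 extra vocabulary rows, which suggests no other weights changed shape in this commit.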

runs/Apr02_11-45-25_gcn21.local.snellius.surf.nl/events.out.tfevents.1712051134.gcn21.local.snellius.surf.nl.4193788.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:667deadef273bc1072c516ddc78e9711e561dbcd34fa9c98bf64726ef75d9252
+size 13842

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:95beafee183dd8c0f8967da1e54d6cb38ab5fe23f886d744fcaba66038c60105
-size 4728
+oid sha256:cdc9b49c944ad1d600b4c700b9096b7cb8e099f01d17b84e8484cdb0a927b2fc
+size 4984