rbelanec
/

train_piqa_1754507485

+---
+library_name: peft
+license: llama3
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
+tags:
+- llama-factory
+- generated_from_trainer
+model-index:
+- name: train_piqa_1754507485
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# train_piqa_1754507485
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1620
+- Num Input Tokens Seen: 22103448
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 123
+- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 10.0
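The scheduler settings above can be sketched as a small function. This is a minimal illustration, assuming the common shape of linear warmup followed by a half-cosine decay to zero, and assuming 36260 total optimizer steps (the last step logged in the training results). The helper name `lr_at` and the rounding of the warmup length are hypothetical choices for this sketch, not taken from the training code.

```python
import math

BASE_LR = 5e-5        # learning_rate from the hyperparameter list
TOTAL_STEPS = 36260   # assumption: final step logged in the training results
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the hyperparameter list


def lr_at(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_ratio=WARMUP_RATIO):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    # Assumption: warmup length is ratio * total steps, rounded to an integer.
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 1813, 3626, 19943, 36260):
        print(step, lr_at(step))
```

Under these assumptions the rate peaks at 5e-05 around step 3626 (epoch 1.0), falls to about half of that by step 19943 (epoch 5.5), and reaches zero at the final step.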
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
+|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
+| 0.2639        | 0.5   | 1813  | 0.1444          | 1118368           |
+| 0.0831        | 1.0   | 3626  | 0.1175          | 2216600           |
+| 0.1061        | 1.5   | 5439  | 0.1138          | 3320792           |
+| 0.0529        | 2.0   | 7252  | 0.1122          | 4419000           |
+| 0.1256        | 2.5   | 9065  | 0.1134          | 5525176           |
+| 0.111         | 3.0   | 10878 | 0.1149          | 6628280           |
+| 0.0669        | 3.5   | 12691 | 0.1314          | 7736376           |
+| 0.0898        | 4.0   | 14504 | 0.1165          | 8844408           |
+| 0.0717        | 4.5   | 16317 | 0.1297          | 9951832           |
+| 0.0139        | 5.0   | 18130 | 0.1254          | 11048200          |
+| 0.042         | 5.5   | 19943 | 0.1346          | 12157032          |
+| 0.0054        | 6.0   | 21756 | 0.1426          | 13257624          |
+| 0.0593        | 6.5   | 23569 | 0.1500          | 14360952          |
+| 0.1229        | 7.0   | 25382 | 0.1525          | 15468632          |
+| 0.0529        | 7.5   | 27195 | 0.1517          | 16574840          |
+| 0.0551        | 8.0   | 29008 | 0.1577          | 17678024          |
+| 0.05          | 8.5   | 30821 | 0.1586          | 18780040          |
+| 0.0687        | 9.0   | 32634 | 0.1605          | 19894712          |
+| 0.0618        | 9.5   | 34447 | 0.1612          | 21014840          |
+| 0.0012        | 10.0  | 36260 | 0.1620          | 22103448          |
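The table above can be scanned for the evaluation point with the lowest validation loss. The sketch below copies the epoch, step, and validation-loss columns verbatim; the helper name `best_checkpoint` is hypothetical, and whether a checkpoint was actually saved at each logged step is not stated in the card.

```python
# (epoch, step, validation_loss) copied from the training results table.
ROWS = [
    (0.5, 1813, 0.1444), (1.0, 3626, 0.1175), (1.5, 5439, 0.1138),
    (2.0, 7252, 0.1122), (2.5, 9065, 0.1134), (3.0, 10878, 0.1149),
    (3.5, 12691, 0.1314), (4.0, 14504, 0.1165), (4.5, 16317, 0.1297),
    (5.0, 18130, 0.1254), (5.5, 19943, 0.1346), (6.0, 21756, 0.1426),
    (6.5, 23569, 0.1500), (7.0, 25382, 0.1525), (7.5, 27195, 0.1517),
    (8.0, 29008, 0.1577), (8.5, 30821, 0.1586), (9.0, 32634, 0.1605),
    (9.5, 34447, 0.1612), (10.0, 36260, 0.1620),
]


def best_checkpoint(rows):
    """Return the (epoch, step, validation_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


best = best_checkpoint(ROWS)
final = ROWS[-1]

if __name__ == "__main__":
    print(f"best:  epoch {best[0]}, step {best[1]}, validation loss {best[2]:.4f}")
    print(f"final: epoch {final[0]}, step {final[1]}, validation loss {final[2]:.4f}")
```

Validation loss is lowest at epoch 2.0 (0.1122) and rises to 0.1620 by epoch 10.0, a pattern that would be consistent with overfitting in the later epochs. The reported evaluation loss of 0.1620 matches the final row, not the minimum.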
+### Framework versions
+- PEFT 0.15.2
+- Transformers 4.51.3
+- Pytorch 2.8.0+cu128
+- Datasets 3.6.0
+- Tokenizers 0.21.1

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:26a17ccd4cb1a3bab173a66a27d97f23ca42f43c24efb17e95eae445fc3dfef2
+oid sha256:19a1f95d9f1ab58bc1d2e8c7e2947eb1fc947e11c6afc870e1210badddd0b1dc
 size 1074144
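The Git LFS pointer records only the SHA-256 digest and byte size of the adapter weights, so a downloaded copy can be checked against it. The following is a minimal sketch using only the standard library; the function names `parse_lfs_pointer` and `matches_pointer` and the local file path in the usage comment are hypothetical.

```python
import hashlib
from pathlib import Path

# Pointer contents for the new revision, as shown in the diff.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:19a1f95d9f1ab58bc1d2e8c7e2947eb1fc947e11c6afc870e1210badddd0b1dc
size 1074144
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into (sha256_hex, size_in_bytes)."""
    lines = [line for line in text.splitlines() if line.strip()]
    fields = dict(line.strip().split(" ", 1) for line in lines)
    algo, _, digest = fields["oid"].partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algo}")
    return digest, int(fields["size"])


def matches_pointer(path, pointer_text, chunk_size=1 << 20):
    """Return True if the file has the size and SHA-256 named by the pointer."""
    digest, size = parse_lfs_pointer(pointer_text)
    path = Path(path)
    if path.stat().st_size != size:
        return False  # cheap check before hashing
    hasher = hashlib.sha256()
    with path.open("rb") as handle:
        # Hash in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == digest


# Usage (hypothetical local path):
# matches_pointer("adapter_model.safetensors", NEW_POINTER)
```

Because the size is unchanged between the two revisions (1074144 bytes), only the digest distinguishes the old adapter file from the new one.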