rbelanec/train_qqp_1754652135

+---
+library_name: peft
+license: llama3
+base_model: meta-llama/Meta-Llama-3-8B-Instruct
+tags:
+- llama-factory
+- generated_from_trainer
+model-index:
+- name: train_qqp_1754652135
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# train_qqp_1754652135
+This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.1873
+- Num Input Tokens Seen: 250787112
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-05
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 123
+- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 10.0
+### Training results
+| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
+|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
+| 0.2031        | 0.5   | 40933  | 0.2528          | 12552544          |
+| 0.2786        | 1.0   | 81866  | 0.2403          | 25087944          |
+| 0.2791        | 1.5   | 122799 | 0.2292          | 37621672          |
+| 0.2403        | 2.0   | 163732 | 0.2193          | 50164864          |
+| 0.2049        | 2.5   | 204665 | 0.2120          | 62700096          |
+| 0.1847        | 3.0   | 245598 | 0.2067          | 75242048          |
+| 0.1810        | 3.5   | 286531 | 0.2104          | 87769248          |
+| 0.2662        | 4.0   | 327464 | 0.1999          | 100320328         |
+| 0.2374        | 4.5   | 368397 | 0.2003          | 112855464         |
+| 0.2511        | 5.0   | 409330 | 0.1927          | 125387608         |
+| 0.1266        | 5.5   | 450263 | 0.1914          | 137931704         |
+| 0.1893        | 6.0   | 491196 | 0.1936          | 150463800         |
+| 0.2127        | 6.5   | 532129 | 0.1902          | 163003576         |
+| 0.2026        | 7.0   | 573062 | 0.1895          | 175543400         |
+| 0.1628        | 7.5   | 613995 | 0.1886          | 188096200         |
+| 0.2036        | 8.0   | 654928 | 0.1866          | 200622552         |
+| 0.1424        | 8.5   | 695861 | 0.1876          | 213151000         |
+| 0.1818        | 9.0   | 736794 | 0.1872          | 225701792         |
+| 0.1845        | 9.5   | 777727 | 0.1875          | 238243968         |
+| 0.2161        | 10.0  | 818660 | 0.1873          | 250787112         |
+### Framework versions
+- PEFT 0.15.2
+- Transformers 4.51.3
+- Pytorch 2.8.0+cu128
+- Datasets 3.6.0
+- Tokenizers 0.21.1
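
Note on the hyperparameters above: the card lists `lr_scheduler_type: cosine`, `lr_scheduler_warmup_ratio: 0.1`, and a peak `learning_rate` of 5e-05, and the results table ends at step 818660. The sketch below shows roughly what learning rate that implies at a given step. It assumes the common shape of linear warmup from zero followed by a half-cosine decay to zero, and it assumes the warmup length is the ceiling of `total_steps * warmup_ratio`; the card itself does not confirm either detail, and `cosine_lr` is a hypothetical helper, not a library function.

```python
import math


def cosine_lr(step, total_steps, peak_lr, warmup_ratio):
    """Approximate learning rate at `step` (assumed schedule shape, see note)."""
    # Assumption: warmup length is the ceiling of total_steps * warmup_ratio.
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Half-cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Values taken from the model card; the final step comes from the results table.
TOTAL_STEPS = 818660
PEAK_LR = 5e-05
WARMUP_RATIO = 0.1

if __name__ == "__main__":
    for step in (0, 40933, 81866, 409330, 818660):
        print(step, cosine_lr(step, TOTAL_STEPS, PEAK_LR, WARMUP_RATIO))
```

Under these assumptions the warmup covers about the first epoch (81866 of 818660 steps), so the first two evaluation rows fall inside or at the end of warmup.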

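Note on the results table above: the headline loss in the card (0.1873) is the value at the final evaluation, which is not necessarily the lowest one logged. A minimal sketch for checking this, with the epoch, step, and validation-loss columns copied by hand from the table; `best_checkpoint` is a hypothetical helper name, and whether a checkpoint was actually saved at each evaluation step is not stated in the card.

```python
# (epoch, step, validation_loss) copied from the "Training results" table.
EVAL_LOG = [
    (0.5, 40933, 0.2528), (1.0, 81866, 0.2403),
    (1.5, 122799, 0.2292), (2.0, 163732, 0.2193),
    (2.5, 204665, 0.2120), (3.0, 245598, 0.2067),
    (3.5, 286531, 0.2104), (4.0, 327464, 0.1999),
    (4.5, 368397, 0.2003), (5.0, 409330, 0.1927),
    (5.5, 450263, 0.1914), (6.0, 491196, 0.1936),
    (6.5, 532129, 0.1902), (7.0, 573062, 0.1895),
    (7.5, 613995, 0.1886), (8.0, 654928, 0.1866),
    (8.5, 695861, 0.1876), (9.0, 736794, 0.1872),
    (9.5, 777727, 0.1875), (10.0, 818660, 0.1873),
]


def best_checkpoint(log):
    """Return the (epoch, step, loss) row with the lowest validation loss."""
    return min(log, key=lambda row: row[2])


best = best_checkpoint(EVAL_LOG)
final = EVAL_LOG[-1]

if __name__ == "__main__":
    print("best :", best)
    print("final:", final)
    print("gap  :", round(final[2] - best[2], 4))
```

In this log the lowest validation loss is at epoch 8.0, and the last four evaluations differ from it by about 0.001 or less, so the run looks to have plateaued by then.
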
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4dae968bd0ba3f1f77c38f1bb8b7baa2c0181fff4f52991ba47a28ad6baee439
+oid sha256:85dab3ad75713169c7f941fd2e461b8893d267ece85847b26b5c70b9edf7ae5d
 size 26214528
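
Note on the pointer change above: only the `oid` differs, so the adapter weights changed while the file size stayed at 26214528 bytes. A downloaded `adapter_model.safetensors` can be checked against the new pointer by comparing its SHA-256 digest and byte size. The sketch below uses only the standard library; `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the local file path is whatever location the file was downloaded to.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest, size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and SHA-256 the pointer names."""
    if pointer["algo"] != "sha256":
        raise ValueError("unsupported hash algorithm: " + pointer["algo"])
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


# The new pointer from this commit, copied from the diff.
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:85dab3ad75713169c7f941fd2e461b8893d267ece85847b26b5c70b9edf7ae5d
size 26214528
"""

pointer = parse_lfs_pointer(NEW_POINTER)
```

Calling `matches_pointer("adapter_model.safetensors", pointer)` on a local copy would then report whether that copy corresponds to this commit rather than the previous one.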