---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - base_model:adapter:meta-llama/Meta-Llama-3-8B-Instruct
  - llama-factory
  - transformers
pipeline_tag: text-generation
model-index:
  - name: test
    results: []
---

# test

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the wsc (Winograd Schema Challenge) dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

- Loss: 0.3497
- Num input tokens seen: 49376
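
Since this is a PEFT adapter rather than a full model, it is loaded on top of the base model. A minimal sketch, assuming the adapter is published under the hypothetical repo id `rbelanec/test` (substitute the actual id):

```python
# Minimal sketch: load this PEFT adapter on top of the base model and run one
# WSC-style prompt. "rbelanec/test" is an assumed repo id; substitute the real one.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/test"  # assumption: replace with the actual adapter repo id
model = AutoPeftModelForCausalLM.from_pretrained(adapter_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# The exact prompt format depends on the LLaMA-Factory template used for training;
# the base model's chat template is a reasonable default.
messages = [{"role": "user", "content": "The trophy doesn't fit in the suitcase because it is too big. What is too big?"}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```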

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 0.03
- train_batch_size: 4
- eval_batch_size: 4
- seed: 123
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
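
The run itself was driven by LLaMA-Factory, whose YAML config maps onto these fields; as a hedged illustration only, the settings above correspond roughly to the following Hugging Face `TrainingArguments`:

```python
# Hypothetical reconstruction of the listed hyperparameters as TrainingArguments.
# The output_dir value is assumed; everything else mirrors the list above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="test",                 # assumption, matching the model name
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=1,
)
```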

### Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 3.6273        | 0.056 | 7    | 6.9320          | 2880              |
| 6.3813        | 0.112 | 14   | 1.6228          | 5920              |
| 1.4507        | 0.168 | 21   | 0.4040          | 8416              |
| 1.7771        | 0.224 | 28   | 3.6187          | 11264             |
| 0.7848        | 0.28  | 35   | 0.3667          | 13824             |
| 0.4314        | 0.336 | 42   | 0.3662          | 16672             |
| 0.4096        | 0.392 | 49   | 0.5265          | 19296             |
| 0.5554        | 0.448 | 56   | 0.3925          | 22432             |
| 0.4968        | 0.504 | 63   | 2.6525          | 25504             |
| 0.3298        | 0.56  | 70   | 0.3776          | 28064             |
| 0.3663        | 0.616 | 77   | 0.3627          | 30720             |
| 0.3654        | 0.672 | 84   | 0.3526          | 33504             |
| 0.3495        | 0.728 | 91   | 0.3546          | 36128             |
| 0.412         | 0.784 | 98   | 0.3497          | 38592             |
| 0.349         | 0.84  | 105  | 0.3538          | 41280             |
| 0.3482        | 0.896 | 112  | 0.3566          | 44160             |
| 0.3258        | 0.952 | 119  | 0.3585          | 46944             |

### Framework versions

- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4
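
As a small convenience sketch, the versions listed above can be compared against the installed environment (package names assumed to match their PyPI distributions; mismatches don't necessarily break loading):

```python
# Hedged environment check: compare installed versions against those used for training.
from importlib.metadata import version

expected = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}
for package, trained_with in expected.items():
    installed = version(package)
    status = "OK" if installed == trained_with else "differs"
    print(f"{package}: installed {installed}, trained with {trained_with} ({status})")
```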