---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - prefix-tuning
  - generated_from_trainer
model-index:
  - name: train_cb_1757340166
    results: []
---

# train_cb_1757340166

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the cb dataset. It achieves the following results on the evaluation set:

- Loss: 0.3968
- Num input tokens seen: 621640
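Because this is a PEFT prefix-tuning adapter rather than a full checkpoint, it is loaded on top of the base model. Below is a minimal sketch; the adapter repo id `rbelanec/train_cb_1757340166` is inferred from this page and may differ, and the prompt is only a placeholder for a CB-style entailment query:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_cb_1757340166"  # assumed repo id, may differ

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the prefix-tuning weights to the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "Premise: It was raining. Hypothesis: The ground was wet. Entailment, contradiction, or neutral?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```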

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for a rough `TrainingArguments` equivalent):

- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
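The run itself was produced with LLaMA-Factory, whose config format differs, but the listed values map onto Hugging Face `TrainingArguments` roughly as sketched below. Note that the prefix-tuning settings (e.g. `num_virtual_tokens`) are not reported on this card, so the value used here is purely illustrative:

```python
from peft import PrefixTuningConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

# Prefix-tuning setup; num_virtual_tokens is NOT reported on this card,
# the value below is an illustrative placeholder.
peft_config = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=10)

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
model = get_peft_model(base, peft_config)

# The hyperparameters listed above, expressed as TrainingArguments
# (a reconstruction; the actual run used LLaMA-Factory).
args = TrainingArguments(
    output_dir="train_cb_1757340166",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```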

### Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.187         | 1.0   | 113  | 0.5789          | 31088             |
| 0.4145        | 2.0   | 226  | 0.2959          | 61872             |
| 0.1423        | 3.0   | 339  | 0.2717          | 93016             |
| 0.3879        | 4.0   | 452  | 0.2208          | 124056            |
| 0.3181        | 5.0   | 565  | 0.1915          | 155240            |
| 0.3964        | 6.0   | 678  | 0.2505          | 185984            |
| 0.0106        | 7.0   | 791  | 0.3052          | 217192            |
| 0.1278        | 8.0   | 904  | 0.2324          | 248456            |
| 0.1891        | 9.0   | 1017 | 0.6020          | 279744            |
| 0.0002        | 10.0  | 1130 | 0.3493          | 310888            |
| 0.0001        | 11.0  | 1243 | 0.3753          | 341832            |
| 0.0001        | 12.0  | 1356 | 0.3776          | 372952            |
| 0.0001        | 13.0  | 1469 | 0.3861          | 403768            |
| 0.0           | 14.0  | 1582 | 0.3914          | 434704            |
| 0.0           | 15.0  | 1695 | 0.3901          | 466016            |
| 0.0001        | 16.0  | 1808 | 0.3942          | 497200            |
| 0.0           | 17.0  | 1921 | 0.3934          | 528320            |
| 0.0           | 18.0  | 2034 | 0.3991          | 559408            |
| 0.0           | 19.0  | 2147 | 0.3979          | 590544            |
| 0.0           | 20.0  | 2260 | 0.3968          | 621640            |

### Framework versions

- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1