train_cb_42_1757595249

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1635
  • Num Input Tokens Seen: 621640

Model description

More information needed

Intended uses & limitations

More information needed
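
Since this checkpoint is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct, one likely usage pattern is to load the base model and attach the adapter with `peft`. The sketch below assumes the adapter repo id `rbelanec/train_cb_42_1757595249` (taken from the model tree) and standard `transformers`/`peft` APIs; it is illustrative, not an official usage recipe.

```python
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER = "rbelanec/train_cb_42_1757595249"


def load_model(adapter_id: str = ADAPTER, base_id: str = BASE_MODEL):
    """Load the base model and attach the fine-tuned PEFT adapter."""
    # Imports are local so the sketch can be read without transformers/peft installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
    model = PeftModel.from_pretrained(base, adapter_id)
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_model()
```

Note that access to the gated Meta-Llama-3 base weights on the Hugging Face Hub is required before loading.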

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
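
The learning-rate schedule implied by these settings (cosine decay after a 10% warmup, base LR 5e-05) can be sketched in plain Python. This mirrors the behavior of the Trainer's cosine scheduler under the stated hyperparameters; the total step count of 2260 is taken from the training results table below.

```python
import math


def lr_at_step(step: int,
               base_lr: float = 5e-05,
               warmup_ratio: float = 0.1,
               total_steps: int = 2260) -> float:
    """Cosine schedule with linear warmup, as configured for this run."""
    warmup_steps = int(total_steps * warmup_ratio)  # 226 steps of warmup
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, the LR is 0 at step 0, peaks at 5e-05 when warmup ends (step 226), and decays back to 0 by step 2260.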

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.233         | 1.0   | 113  | 0.4873          | 31088             |
| 0.642         | 2.0   | 226  | 0.4480          | 61872             |
| 0.0744        | 3.0   | 339  | 0.3388          | 93016             |
| 0.3811        | 4.0   | 452  | 0.3027          | 124056            |
| 0.2857        | 5.0   | 565  | 0.2438          | 155240            |
| 0.319         | 6.0   | 678  | 0.2082          | 185984            |
| 0.0015        | 7.0   | 791  | 0.3063          | 217192            |
| 0.0786        | 8.0   | 904  | 0.1092          | 248456            |
| 0.1621        | 9.0   | 1017 | 0.2128          | 279744            |
| 0.0004        | 10.0  | 1130 | 0.2852          | 310888            |
| 0.0001        | 11.0  | 1243 | 0.1720          | 341832            |
| 0.0001        | 12.0  | 1356 | 0.1684          | 372952            |
| 0.0001        | 13.0  | 1469 | 0.1634          | 403768            |
| 0.0           | 14.0  | 1582 | 0.1620          | 434704            |
| 0.0           | 15.0  | 1695 | 0.1667          | 466016            |
| 0.0           | 16.0  | 1808 | 0.1622          | 497200            |
| 0.0001        | 17.0  | 1921 | 0.1631          | 528320            |
| 0.0           | 18.0  | 2034 | 0.1550          | 559408            |
| 0.0           | 19.0  | 2147 | 0.1626          | 590544            |
| 0.0           | 20.0  | 2260 | 0.1635          | 621640            |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1