train_cb_456_1760637754

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0238
  • Num Input Tokens Seen: 721856
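Since this is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct, it is loaded by attaching the adapter to the base model. A minimal sketch (assuming the adapter is published under this repo id and that you have access to the gated base checkpoint):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"

# Load the base model and tokenizer, then attach the fine-tuned adapter.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, "rbelanec/train_cb_456_1760637754")
model.eval()
```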

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
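The learning-rate schedule above (cosine decay with a 10% linear warmup, peak 0.03) can be sketched in plain Python. `lr_at_step` is an illustrative helper, not part of the training code; the total of 1,140 steps comes from the training-results table (20 epochs × 57 steps):

```python
import math

def lr_at_step(step, total_steps=1140, base_lr=0.03, warmup_ratio=0.1):
    """Cosine schedule with linear warmup, mirroring the hyperparameters above."""
    warmup_steps = int(total_steps * warmup_ratio)  # 114 warmup steps
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return base_lr * step / warmup_steps
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The rate ramps to 0.03 at step 114, then decays to 0 by step 1,140.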

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.3667        | 1.0   | 57   | 0.4741          | 36072             |
| 0.4233        | 2.0   | 114  | 0.3161          | 72896             |
| 0.2183        | 3.0   | 171  | 0.2243          | 109080            |
| 0.2406        | 4.0   | 228  | 0.1794          | 145936            |
| 0.2286        | 5.0   | 285  | 0.3171          | 182024            |
| 0.1596        | 6.0   | 342  | 0.1815          | 218672            |
| 0.2326        | 7.0   | 399  | 0.1947          | 254232            |
| 0.2095        | 8.0   | 456  | 0.1713          | 290912            |
| 0.1626        | 9.0   | 513  | 0.1180          | 326432            |
| 0.1593        | 10.0  | 570  | 0.0871          | 362240            |
| 0.1403        | 11.0  | 627  | 0.0951          | 397880            |
| 0.2325        | 12.0  | 684  | 0.1101          | 433352            |
| 0.0834        | 13.0  | 741  | 0.0626          | 469568            |
| 0.0666        | 14.0  | 798  | 0.0534          | 505048            |
| 0.0492        | 15.0  | 855  | 0.0238          | 541088            |
| 0.0203        | 16.0  | 912  | 0.0278          | 577512            |
| 0.0059        | 17.0  | 969  | 0.0289          | 614128            |
| 0.0118        | 18.0  | 1026 | 0.0288          | 649608            |
| 0.0049        | 19.0  | 1083 | 0.0278          | 685200            |
| 0.0068        | 20.0  | 1140 | 0.0271          | 721856            |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4