train_cb_42_1760637524

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1769
  • Num Input Tokens Seen: 725992

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
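The cosine schedule with 10% warmup can be sketched numerically. A minimal illustration, assuming linear warmup followed by cosine decay to zero, with the base learning rate (0.001) and total step count (1140, i.e. 57 steps/epoch × 20 epochs from the results below) taken from this card; the function name `lr_at` is hypothetical:

```python
import math

BASE_LR = 0.001          # learning_rate above
TOTAL_STEPS = 1140       # 57 steps/epoch * 20 epochs
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # warmup_ratio 0.1 -> 114 steps

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))     # 0.0 at the start of warmup
print(lr_at(114))   # peak of 0.001 at the end of warmup
print(lr_at(1140))  # decayed to ~0.0 at the final step
```

This mirrors the shape of a standard warmup-plus-cosine scheduler; the exact step-level values produced by the training framework may differ slightly.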

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.3463 | 1.0 | 57 | 0.6231 | 36480 |
| 0.341 | 2.0 | 114 | 0.3803 | 72112 |
| 0.3295 | 3.0 | 171 | 0.3839 | 108712 |
| 0.1486 | 4.0 | 228 | 0.2464 | 145296 |
| 0.273 | 5.0 | 285 | 0.2429 | 181408 |
| 0.1861 | 6.0 | 342 | 0.2956 | 217760 |
| 0.133 | 7.0 | 399 | 0.2893 | 254568 |
| 0.2981 | 8.0 | 456 | 0.2437 | 291000 |
| 0.1107 | 9.0 | 513 | 0.2237 | 327792 |
| 0.1137 | 10.0 | 570 | 0.2338 | 363864 |
| 0.1901 | 11.0 | 627 | 0.2368 | 400104 |
| 0.2099 | 12.0 | 684 | 0.1995 | 436440 |
| 0.0337 | 13.0 | 741 | 0.1697 | 471944 |
| 0.0331 | 14.0 | 798 | 0.2350 | 508424 |
| 0.0566 | 15.0 | 855 | 0.2077 | 545352 |
| 0.0141 | 16.0 | 912 | 0.1869 | 581368 |
| 0.0086 | 17.0 | 969 | 0.2149 | 616776 |
| 0.0092 | 18.0 | 1026 | 0.2464 | 653152 |
| 0.0052 | 19.0 | 1083 | 0.2407 | 689856 |
| 0.0044 | 20.0 | 1140 | 0.2406 | 725992 |
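Validation loss reaches its minimum at epoch 13 (0.1697) rather than at the final epoch, suggesting some overfitting in the later epochs. A minimal sketch of picking the best checkpoint from the table above (numbers copied verbatim from the results):

```python
# (epoch, step, validation_loss) triples from the training results table above.
results = [
    (1, 57, 0.6231), (2, 114, 0.3803), (3, 171, 0.3839), (4, 228, 0.2464),
    (5, 285, 0.2429), (6, 342, 0.2956), (7, 399, 0.2893), (8, 456, 0.2437),
    (9, 513, 0.2237), (10, 570, 0.2338), (11, 627, 0.2368), (12, 684, 0.1995),
    (13, 741, 0.1697), (14, 798, 0.2350), (15, 855, 0.2077), (16, 912, 0.1869),
    (17, 969, 0.2149), (18, 1026, 0.2464), (19, 1083, 0.2407), (20, 1140, 0.2406),
]

# Select the checkpoint with the lowest validation loss.
best_epoch, best_step, best_loss = min(results, key=lambda r: r[2])
print(best_epoch, best_step, best_loss)  # -> 13 741 0.1697
```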

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4