train_cb_42_1767875486

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1156
  • Num Input Tokens Seen: 313000

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
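The schedule above (cosine with a 10% warmup ratio) can be sketched in plain Python. This is an illustrative approximation of the `cosine` scheduler in transformers, not the library's exact implementation; the total step count of ~1130 is inferred from the results table (57 steps per 0.5044 epoch ≈ 113 steps/epoch over 10 epochs):

```python
import math

def cosine_lr(step, total_steps=1130, base_lr=5e-5, warmup_ratio=0.1):
    """Learning rate at `step` under linear warmup + cosine decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 10% of steps.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The rate peaks at 5e-05 once warmup ends (step 113) and decays to zero by the final step.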

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|---------------|--------|------|-----------------|-------------------|
| 0.4795        | 0.5044 | 57   | 0.2893          | 15552             |
| 0.3258        | 1.0088 | 114  | 0.1449          | 31616             |
| 0.0192        | 1.5133 | 171  | 0.1440          | 46416             |
| 0.4173        | 2.0177 | 228  | 0.1411          | 62960             |
| 0.1101        | 2.5221 | 285  | 0.1376          | 78816             |
| 1.0026        | 3.0265 | 342  | 0.1310          | 94776             |
| 0.7717        | 3.5310 | 399  | 0.1266          | 110632            |
| 0.4419        | 4.0354 | 456  | 0.1324          | 125984            |
| 0.0126        | 4.5398 | 513  | 0.1156          | 140528            |
| 0.0181        | 5.0442 | 570  | 0.1495          | 157360            |
| 0.4873        | 5.5487 | 627  | 0.1534          | 173696            |
| 0.0234        | 6.0531 | 684  | 0.1282          | 188944            |
| 0.0110        | 6.5575 | 741  | 0.1217          | 205552            |
| 0.0517        | 7.0619 | 798  | 0.1336          | 220952            |
| 0.0033        | 7.5664 | 855  | 0.1375          | 236872            |
| 0.3956        | 8.0708 | 912  | 0.1351          | 252408            |
| 0.0014        | 8.5752 | 969  | 0.1361          | 268296            |
| 0.0006        | 9.0796 | 1026 | 0.1273          | 284224            |
| 0.0706        | 9.5841 | 1083 | 0.1289          | 300048            |
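A quick check of the table confirms that the reported evaluation loss (0.1156) is the minimum validation loss over training, reached at step 513 (epoch ~4.54), after which validation loss drifts upward while training loss keeps falling:

```python
# (step, validation_loss) pairs copied from the training results table.
eval_log = [
    (57, 0.2893), (114, 0.1449), (171, 0.1440), (228, 0.1411),
    (285, 0.1376), (342, 0.1310), (399, 0.1266), (456, 0.1324),
    (513, 0.1156), (570, 0.1495), (627, 0.1534), (684, 0.1282),
    (741, 0.1217), (798, 0.1336), (855, 0.1375), (912, 0.1351),
    (969, 0.1361), (1026, 0.1273), (1083, 0.1289),
]

# Find the checkpoint with the lowest validation loss.
best_step, best_loss = min(eval_log, key=lambda row: row[1])
print(best_step, best_loss)  # 513 0.1156
```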

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
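Since this is a PEFT adapter rather than a full model, it must be loaded on top of its base model. A minimal sketch with the versions listed above, assuming the adapter repo id `rbelanec/train_cb_42_1767875486` and access to the gated meta-llama base weights (not run here):

```python
# Sketch: load the PEFT adapter on top of the Llama-3-8B-Instruct base model.
# Assumes you have been granted access to the gated base checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_cb_42_1767875486"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()
```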