train_cb_1757340217

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the cb (CommitmentBank) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3687
  • Num Input Tokens Seen: 359688

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
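With lr_scheduler_type cosine and a warmup ratio of 0.1, the learning rate rises linearly over the first 10% of optimizer steps and then decays along a half-cosine. A minimal sketch of that schedule in plain Python; the total of 570 steps is an assumption inferred from the training log (roughly 57 steps per epoch over 10 epochs), not a value stated in this card:

```python
import math

LEARNING_RATE = 5e-5   # peak learning rate from the hyperparameters above
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio
TOTAL_STEPS = 570      # assumed: ~57 optimizer steps/epoch x 10 epochs

def lr_at(step, total_steps=TOTAL_STEPS, peak=LEARNING_RATE,
          warmup_ratio=WARMUP_RATIO):
    """Linear warmup followed by cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to the peak rate.
        return peak * step / max(1, warmup_steps)
    # Decay phase: half-cosine from the peak rate down to 0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at(0)` is 0, `lr_at(57)` is the peak 5e-5, and `lr_at(570)` has decayed back to 0.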

Training results

| Training Loss | Epoch  | Step | Validation Loss | Input Tokens Seen |
|--------------:|-------:|-----:|----------------:|------------------:|
| 0.4646        | 0.5088 | 29   | 0.2152          | 18048             |
| 0.4224        | 1.0175 | 58   | 0.2754          | 36328             |
| 0.1384        | 1.5263 | 87   | 0.1717          | 56168             |
| 0.9708        | 2.0351 | 116  | 0.1764          | 73792             |
| 0.187         | 2.5439 | 145  | 0.1948          | 92768             |
| 0.1273        | 3.0526 | 174  | 0.1582          | 110064            |
| 0.0853        | 3.5614 | 203  | 0.1560          | 129808            |
| 0.5225        | 4.0702 | 232  | 0.2043          | 147240            |
| 0.0862        | 4.5789 | 261  | 0.2014          | 164744            |
| 0.0088        | 5.0877 | 290  | 0.1787          | 184440            |
| 0.0051        | 5.5965 | 319  | 0.1657          | 202456            |
| 0.001         | 6.1053 | 348  | 0.1849          | 220288            |
| 0.1069        | 6.6140 | 377  | 0.1918          | 239200            |
| 0.0181        | 7.1228 | 406  | 0.1944          | 256296            |
| 0.0004        | 7.6316 | 435  | 0.1970          | 275688            |
| 0.0886        | 8.1404 | 464  | 0.1968          | 294608            |
| 0.1007        | 8.6491 | 493  | 0.2060          | 312144            |
| 0.0083        | 9.1579 | 522  | 0.2111          | 330152            |
| 0.0012        | 9.6667 | 551  | 0.2125          | 347976            |
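Validation loss bottoms out around epoch 3.5 and drifts upward afterward, so the final checkpoint is not the best one. A small sketch that picks the best checkpoint from the (step, validation loss) pairs logged above:

```python
# (step, validation_loss) pairs copied from the training log above
eval_log = [
    (29, 0.2152), (58, 0.2754), (87, 0.1717), (116, 0.1764),
    (145, 0.1948), (174, 0.1582), (203, 0.1560), (232, 0.2043),
    (261, 0.2014), (290, 0.1787), (319, 0.1657), (348, 0.1849),
    (377, 0.1918), (406, 0.1944), (435, 0.1970), (464, 0.1968),
    (493, 0.2060), (522, 0.2111), (551, 0.2125),
]

# Best checkpoint = the step with the minimum logged validation loss
best_step, best_loss = min(eval_log, key=lambda pair: pair[1])
print(best_step, best_loss)  # step 203 (epoch ~3.56), loss 0.1560
```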

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
Model tree for rbelanec/train_cb_1757340217

  • Adapter of meta-llama/Meta-Llama-3-8B-Instruct