train_cb_101112_1760637984

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.9753
  • Num Input Tokens Seen: 723584
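
This repository contains a PEFT adapter rather than full model weights, so it is loaded on top of the gated base model. A minimal loading sketch, assuming a standard PEFT adapter layout and an authenticated Hugging Face session with access to meta-llama/Meta-Llama-3-8B-Instruct:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# The adapter config records the base model, so this single call
# downloads meta-llama/Meta-Llama-3-8B-Instruct and applies the adapter.
model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_cb_101112_1760637984",
    torch_dtype=torch.bfloat16,  # assumed; the card does not state the dtype used
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```

Note that the card does not document how cb examples were serialized into prompts during training, so the expected inference-time prompt format is unknown.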

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto transformers.TrainingArguments follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
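
A hedged sketch of how these settings map onto transformers.TrainingArguments, assuming a standard Trainer-based run (the actual training script is not included in this card, and output_dir is hypothetical):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cb_101112_1760637984",  # hypothetical
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```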

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 1.2005        | 1.0   | 57   | 1.1137          | 36112             |
| 0.9465        | 2.0   | 114  | 1.1054          | 71552             |
| 1.1672        | 3.0   | 171  | 1.0676          | 108088            |
| 1.0103        | 4.0   | 228  | 1.0539          | 144720            |
| 0.9606        | 5.0   | 285  | 1.0277          | 181120            |
| 0.8504        | 6.0   | 342  | 1.0145          | 217128            |
| 0.956         | 7.0   | 399  | 1.0001          | 253536            |
| 0.9733        | 8.0   | 456  | 1.0024          | 290112            |
| 0.9202        | 9.0   | 513  | 0.9897          | 325872            |
| 0.9889        | 10.0  | 570  | 0.9847          | 361920            |
| 0.9169        | 11.0  | 627  | 0.9753          | 398432            |
| 0.8499        | 12.0  | 684  | 0.9816          | 435536            |
| 0.9133        | 13.0  | 741  | 0.9839          | 471520            |
| 0.9967        | 14.0  | 798  | 0.9842          | 507256            |
| 1.1597        | 15.0  | 855  | 0.9895          | 543064            |
| 1.075         | 16.0  | 912  | 0.9756          | 579704            |
| 1.0447        | 17.0  | 969  | 0.9865          | 615960            |
| 0.7836        | 18.0  | 1026 | 0.9890          | 652368            |
| 0.7811        | 19.0  | 1083 | 0.9975          | 687976            |
| 0.9389        | 20.0  | 1140 | 0.9975          | 723584            |
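
The evaluation loss reported at the top of this card (0.9753) matches the epoch-11 minimum in the table, which suggests the best checkpoint, rather than the final epoch-20 one, was retained.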

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4