train_stsb_456_1760637814

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the stsb dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4446
  • Num Input Tokens Seen: 8714656

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
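The learning-rate schedule above (cosine with a 0.1 warmup ratio) can be sketched as a small stand-alone function. This is a hypothetical re-implementation for illustration, not the trainer's internal code; the step counts are derived from the results table (1294 steps/epoch x 20 epochs = 25880 total optimizer steps).

```python
import math

BASE_LR = 5e-5
TOTAL_STEPS = 25880                      # 20 epochs x 1294 steps per epoch
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # warmup_ratio 0.1 -> 2588 steps

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step: linear warmup, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The rate peaks at 5e-05 exactly when warmup ends (step 2588) and decays smoothly to zero by the final step.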

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 1.0243        | 1.0   | 1294  | 1.0522          | 435104            |
| 0.5072        | 2.0   | 2588  | 0.6118          | 870112            |
| 0.4585        | 3.0   | 3882  | 0.5377          | 1305024           |
| 0.4735        | 4.0   | 5176  | 0.5074          | 1742048           |
| 0.4916        | 5.0   | 6470  | 0.4902          | 2176672           |
| 0.4197        | 6.0   | 7764  | 0.4768          | 2613648           |
| 0.5125        | 7.0   | 9058  | 0.4705          | 3049776           |
| 0.5041        | 8.0   | 10352 | 0.4627          | 3486928           |
| 0.3721        | 9.0   | 11646 | 0.4601          | 3924192           |
| 0.3501        | 10.0  | 12940 | 0.4545          | 4360736           |
| 0.603         | 11.0  | 14234 | 0.4522          | 4793520           |
| 0.3599        | 12.0  | 15528 | 0.4499          | 5230528           |
| 0.5075        | 13.0  | 16822 | 0.4493          | 5664848           |
| 0.561         | 14.0  | 18116 | 0.4477          | 6100288           |
| 0.3745        | 15.0  | 19410 | 0.4466          | 6534240           |
| 0.5419        | 16.0  | 20704 | 0.4456          | 6969936           |
| 0.4025        | 17.0  | 21998 | 0.4446          | 7405056           |
| 0.3401        | 18.0  | 23292 | 0.4447          | 7842624           |
| 0.4637        | 19.0  | 24586 | 0.4462          | 8279952           |
| 0.5089        | 20.0  | 25880 | 0.4453          | 8714656           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
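Since this is a PEFT adapter rather than a full checkpoint, inference requires loading the base model first and applying the adapter on top. A minimal sketch with `peft` and `transformers`, assuming the adapter is published as `rbelanec/train_stsb_456_1760637814` and that you have access to the gated meta-llama base weights:

```python
def load_model(adapter_id: str = "rbelanec/train_stsb_456_1760637814"):
    """Load the base Llama-3 model and apply this LoRA adapter on top.

    Assumes access to the gated meta-llama weights; ``adapter_id`` is the
    repository this card describes.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(
        "meta-llama/Meta-Llama-3-8B-Instruct", torch_dtype="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
    return model, tokenizer
```

For faster standalone inference, the adapter weights can also be folded into the base model with `model.merge_and_unload()` after loading.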