train_stsb_456_1760637811

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the stsb dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3619
  • Num Input Tokens Seen: 8714656
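The framework versions below list PEFT, so this checkpoint is presumably a PEFT adapter on top of the base model rather than full fine-tuned weights. A minimal, untested sketch of loading it (the repo id is taken from the model tree at the bottom of this card; adjust if it differs):

```python
def load_adapter(repo_id="rbelanec/train_stsb_456_1760637811",
                 base_tokenizer="meta-llama/Meta-Llama-3-8B-Instruct"):
    """Load the PEFT adapter with its base model, plus the base tokenizer.

    Imports are inside the function so the sketch can be defined even
    where `peft`/`transformers` are not installed. Downloading the base
    model requires accepting the Meta Llama 3 license on the Hub.
    """
    from peft import AutoPeftModelForCausalLM
    from transformers import AutoTokenizer

    model = AutoPeftModelForCausalLM.from_pretrained(repo_id)
    tokenizer = AutoTokenizer.from_pretrained(base_tokenizer)
    return model, tokenizer
```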

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
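The schedule above (cosine with a 0.1 warmup ratio) can be sketched as a plain function, mirroring what `transformers`' cosine-with-warmup scheduler computes. The total step count of 25880 is taken from the training table (20 epochs × 1294 steps); the function itself is an illustration, not the training code:

```python
import math

def cosine_lr(step, total_steps=25880, warmup_ratio=0.1, base_lr=1e-3):
    """Learning rate at `step` for cosine decay with linear warmup."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 10% of steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

So the learning rate peaks at 0.001 at step 2588 (end of warmup) and decays to zero by step 25880.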

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.4481        | 1.0   | 1294  | 0.5405          | 435104            |
| 0.4266        | 2.0   | 2588  | 0.5010          | 870112            |
| 0.4134        | 3.0   | 3882  | 0.4435          | 1305024           |
| 0.4529        | 4.0   | 5176  | 0.4423          | 1742048           |
| 0.4069        | 5.0   | 6470  | 0.4360          | 2176672           |
| 0.4466        | 6.0   | 7764  | 0.4350          | 2613648           |
| 0.428         | 7.0   | 9058  | 0.4368          | 3049776           |
| 0.3933        | 8.0   | 10352 | 0.4303          | 3486928           |
| 0.2913        | 9.0   | 11646 | 0.4325          | 3924192           |
| 0.2852        | 10.0  | 12940 | 0.4348          | 4360736           |
| 0.3983        | 11.0  | 14234 | 0.4353          | 4793520           |
| 0.2878        | 12.0  | 15528 | 0.4442          | 5230528           |
| 0.3392        | 13.0  | 16822 | 0.4607          | 5664848           |
| 0.3391        | 14.0  | 18116 | 0.4706          | 6100288           |
| 0.2599        | 15.0  | 19410 | 0.4924          | 6534240           |
| 0.3084        | 16.0  | 20704 | 0.5064          | 6969936           |
| 0.2677        | 17.0  | 21998 | 0.5281          | 7405056           |
| 0.2314        | 18.0  | 23292 | 0.5406          | 7842624           |
| 0.2938        | 19.0  | 24586 | 0.5536          | 8279952           |
| 0.2855        | 20.0  | 25880 | 0.5557          | 8714656           |
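The validation loss in the table bottoms out around epoch 8 and rises steadily afterwards, which suggests the later epochs overfit. A quick check over the validation-loss column (values copied from the table):

```python
# Per-epoch validation losses, copied from the training results table.
val_loss = [0.5405, 0.5010, 0.4435, 0.4423, 0.4360, 0.4350, 0.4368,
            0.4303, 0.4325, 0.4348, 0.4353, 0.4442, 0.4607, 0.4706,
            0.4924, 0.5064, 0.5281, 0.5406, 0.5536, 0.5557]

# Epochs are 1-indexed; find the epoch with the lowest validation loss.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(best_epoch, val_loss[best_epoch - 1])  # → 8 0.4303
```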

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Model tree for rbelanec/train_stsb_456_1760637811

Adapter
(2104)
this model

Evaluation results