train_qnli_42_1760637632

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5953
  • Num Input Tokens Seen: 207226464

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
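
The cosine schedule with a 0.1 warmup ratio can be sketched as follows. This is a minimal illustration, not the exact Transformers implementation; the total step count (471340) is taken from the final row of the results table below, and the warmup therefore spans the first 47134 steps.

```python
import math

def lr_at_step(step, total_steps=471340, warmup_ratio=0.1, peak_lr=1e-3):
    """Linear warmup to peak_lr over the first warmup_ratio of training,
    then cosine decay to zero (sketch of lr_scheduler_type=cosine)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at_step(0))       # start of warmup: 0.0
print(lr_at_step(47134))   # end of warmup: peak learning rate 0.001
print(lr_at_step(471340))  # end of training: decayed to 0.0
```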

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.1185        | 1.0   | 23567  | 0.0459          | 10362048          |
| 0.0601        | 2.0   | 47134  | 0.0417          | 20723232          |
| 0.0585        | 3.0   | 70701  | 0.0388          | 31087712          |
| 0.0254        | 4.0   | 94268  | 0.0391          | 41448512          |
| 0.0693        | 5.0   | 117835 | 0.0396          | 51808576          |
| 0.086         | 6.0   | 141402 | 0.0377          | 62164384          |
| 0.0488        | 7.0   | 164969 | 0.0382          | 72528320          |
| 0.0234        | 8.0   | 188536 | 0.0382          | 82892896          |
| 0.0076        | 9.0   | 212103 | 0.0406          | 93260448          |
| 0.0057        | 10.0  | 235670 | 0.0401          | 103622208         |
| 0.0103        | 11.0  | 259237 | 0.0426          | 113983328         |
| 0.0284        | 12.0  | 282804 | 0.0462          | 124345088         |
| 0.0095        | 13.0  | 306371 | 0.0479          | 134702688         |
| 0.0028        | 14.0  | 329938 | 0.0514          | 145065792         |
| 0.0081        | 15.0  | 353505 | 0.0545          | 155429344         |
| 0.0849        | 16.0  | 377072 | 0.0630          | 165793952         |
| 0.0121        | 17.0  | 400639 | 0.0662          | 176154208         |
| 0.0033        | 18.0  | 424206 | 0.0705          | 186506176         |
| 0.0017        | 19.0  | 447773 | 0.0726          | 196865248         |
| 0.0012        | 20.0  | 471340 | 0.0735          | 207226464         |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
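
Since this repository is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, loading it for inference would follow the standard PEFT/Transformers pattern. The sketch below is an assumption based on that pattern, not an official recipe from this card; the base model is gated and must be accessible to you.

```python
# Repo ids taken from this card; the loading code itself is a hedged sketch.
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER = "rbelanec/train_qnli_42_1760637632"

def load_adapter():
    # Imports are kept inside the function so the sketch can be read (and the
    # constants reused) without transformers/peft installed; calling this
    # function downloads the gated base model plus the adapter weights.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL)
    model = PeftModel.from_pretrained(base, ADAPTER)
    return tokenizer, model
```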

Model tree for rbelanec/train_qnli_42_1760637632

This model is an adapter of meta-llama/Meta-Llama-3-8B-Instruct.