train_qnli_123_1760637747

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the qnli dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0373
  • Num Input Tokens Seen: 207208704

Model description

More information needed

Intended uses & limitations

More information needed
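
The card does not document an inference recipe. Below is a minimal loading sketch, assuming the adapter weights are published as rbelanec/train_qnli_123_1760637747 and can be attached to the base model with PEFT; the dtype and device placement are illustrative choices, not documented settings.

```python
# Minimal sketch: attach this PEFT adapter to its base model.
# torch_dtype and device_map are assumptions, not documented in the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_qnli_123_1760637747"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```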

Training and evaluation data

More information needed
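
Preprocessing details are not documented. If qnli refers to the GLUE QNLI subset on the Hugging Face Hub (an assumption), the splits can be loaded as follows:

```python
# Assumption: "qnli" is the QNLI subset of GLUE on the Hugging Face Hub.
from datasets import load_dataset

qnli = load_dataset("glue", "qnli")
print(qnli)                    # train / validation / test splits
print(qnli["validation"][0])   # fields: question, sentence, label, idx
```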

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
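
As a hedged reconstruction, the settings above map onto transformers TrainingArguments roughly as follows; output_dir is a placeholder, and anything not listed in the card (logging, saving, precision) is left at its default.

```python
# Sketch only: the hyperparameters above expressed as TrainingArguments.
# output_dir is a placeholder; undocumented options are left at defaults.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_qnli_123_1760637747",  # placeholder run name
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```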

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|---------------|-------|--------|-----------------|-------------------|
| 0.0291        | 1.0   | 23567  | 0.0420          | 10365216          |
| 0.1374        | 2.0   | 47134  | 0.0393          | 20725024          |
| 0.0229        | 3.0   | 70701  | 0.0442          | 31080960          |
| 0.0314        | 4.0   | 94268  | 0.0377          | 41439424          |
| 0.0242        | 5.0   | 117835 | 0.0379          | 51801184          |
| 0.0929        | 6.0   | 141402 | 0.0405          | 62164704          |
| 0.0242        | 7.0   | 164969 | 0.0374          | 72529184          |
| 0.0381        | 8.0   | 188536 | 0.0373          | 82884480          |
| 0.0195        | 9.0   | 212103 | 0.0374          | 93243840          |
| 0.0154        | 10.0  | 235670 | 0.0376          | 103607072         |
| 0.0237        | 11.0  | 259237 | 0.0379          | 113965760         |
| 0.021         | 12.0  | 282804 | 0.0394          | 124331968         |
| 0.041         | 13.0  | 306371 | 0.0392          | 134696864         |
| 0.0374        | 14.0  | 329938 | 0.0397          | 145056992         |
| 0.0052        | 15.0  | 353505 | 0.0396          | 155415232         |
| 0.0184        | 16.0  | 377072 | 0.0398          | 165776960         |
| 0.0206        | 17.0  | 400639 | 0.0396          | 176136864         |
| 0.0135        | 18.0  | 424206 | 0.0396          | 186488608         |
| 0.0109        | 19.0  | 447773 | 0.0396          | 196849632         |
| 0.0416        | 20.0  | 471340 | 0.0396          | 207208704         |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
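
A quick way to confirm a local environment matches these pins:

```python
# Print installed versions to compare against the list above.
import datasets, peft, tokenizers, torch, transformers

for mod in (peft, transformers, torch, datasets, tokenizers):
    print(mod.__name__, mod.__version__)
```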

Model tree for rbelanec/train_qnli_123_1760637747

This repository is a PEFT adapter trained on top of meta-llama/Meta-Llama-3-8B-Instruct.