train_qnli_1755619685

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the QNLI dataset, trained as a PEFT adapter (a loading sketch follows the results below). It achieves the following results on the evaluation set:

  • Loss: 0.1737
  • Num Input Tokens Seen: 94,426,336
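
As a quick illustration, the adapter can be loaded on top of the base model with PEFT. This is a minimal sketch, not documented in the card itself: it assumes the adapter repository id rbelanec/train_qnli_1755619685 and gated access to the Llama 3 base weights.

```python
# Minimal loading sketch (assumes access to the gated Llama 3 base weights
# and the adapter repo id rbelanec/train_qnli_1755619685).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base, "rbelanec/train_qnli_1755619685")
model.eval()
```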

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
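
The card does not document the data preparation. If the data is the standard QNLI task from GLUE (an assumption, not confirmed here), it can be loaded as follows:

```python
# Hedged sketch: load QNLI from GLUE via the datasets library.
# Whether this run used the stock splits or custom preprocessing is unknown.
from datasets import load_dataset

qnli = load_dataset("glue", "qnli")  # splits: train / validation / test
print(qnli["train"][0])              # fields: question, sentence, label, idx
```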

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • distributed_type: multi-GPU
  • num_devices: 2
  • total_train_batch_size: 4
  • total_eval_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
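
For reference, these settings roughly correspond to the following transformers TrainingArguments; the output directory name is an assumption, and any options not listed above are left at their defaults. The total batch size of 4 is the per-device batch size of 2 multiplied by the 2 GPUs.

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
# output_dir is assumed; all other values are taken from the list above.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_qnli_1755619685",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=2,       # 2 per device x 2 GPUs = total 4
    per_device_eval_batch_size=2,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```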

Training results

| Training Loss | Epoch  | Step   | Validation Loss | Input Tokens Seen |
|---------------|--------|--------|-----------------|-------------------|
| 0.1443        | 0.5000 | 11784  | 0.1060          | 4726400           |
| 0.0196        | 1.0000 | 23568  | 0.0572          | 9444272           |
| 0.1075        | 1.5001 | 35352  | 0.0433          | 14172240          |
| 0.011         | 2.0001 | 47136  | 0.0372          | 18886368          |
| 0.0451        | 2.5001 | 58920  | 0.0397          | 23595456          |
| 0.0107        | 3.0001 | 70704  | 0.0555          | 28323552          |
| 0.0558        | 3.5001 | 82488  | 0.0385          | 33045520          |
| 0.0713        | 4.0002 | 94272  | 0.0360          | 37766912          |
| 0.0149        | 4.5002 | 106056 | 0.0393          | 42486256          |
| 0.0617        | 5.0002 | 117840 | 0.0394          | 47210640          |
| 0.0039        | 5.5002 | 129624 | 0.0463          | 51929904          |
| 0.0071        | 6.0003 | 141408 | 0.0468          | 56656208          |
| 0.0586        | 6.5003 | 153192 | 0.0671          | 61382064          |
| 0.0014        | 7.0003 | 164976 | 0.0653          | 66103104          |
| 0.0001        | 7.5003 | 176760 | 0.0901          | 70824800          |
| 0.0003        | 8.0003 | 188544 | 0.0864          | 75545952          |
| 0.0           | 8.5004 | 200328 | 0.1408          | 80268160          |
| 0.0           | 9.0004 | 212112 | 0.1439          | 84990112          |
| 0.0           | 9.5004 | 223896 | 0.1708          | 89707216          |
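
Reading the table: 0.5 epoch corresponds to 11,784 optimizer steps, so one epoch is 23,568 steps, or roughly 94,000 training examples at the total batch size of 4. Validation loss bottoms out around epoch 4 (0.0360) and then rises steadily as the training loss collapses toward zero, a typical overfitting pattern consistent with the final evaluation loss of 0.1737.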

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1