train_boolq_42_1760756333

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1630
  • Num Input Tokens Seen: 42773120

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
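As a worked illustration of the schedule above, here is a minimal sketch of linear warmup followed by cosine decay, assuming the step counts implied by the training-results table (2121 optimizer steps per epoch × 20 epochs) and a warmup ratio of 0.1. This approximates the usual `cosine` scheduler; it is not the exact library implementation.

```python
import math

# Values from the hyperparameter list above; step counts inferred from
# the training-results table (2121 steps per epoch, 20 epochs).
BASE_LR = 1e-3
TOTAL_STEPS = 2121 * 20                 # 42420
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # lr_scheduler_warmup_ratio = 0.1

def cosine_lr(step: int) -> float:
    """Linear warmup to BASE_LR, then cosine decay toward zero."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(round(cosine_lr(WARMUP_STEPS), 6))  # peak learning rate after warmup
```

The learning rate peaks at 0.001 once warmup ends (step 4242) and decays smoothly to zero by the final step.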

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.2805        | 1.0   | 2121  | 0.1216          | 2135488           |
| 0.1344        | 2.0   | 4242  | 0.1526          | 4271424           |
| 0.0893        | 3.0   | 6363  | 0.1049          | 6407520           |
| 0.3395        | 4.0   | 8484  | 0.1088          | 8553728           |
| 0.0391        | 5.0   | 10605 | 0.1078          | 10692704          |
| 0.0591        | 6.0   | 12726 | 0.1136          | 12829472          |
| 0.0594        | 7.0   | 14847 | 0.1130          | 14967104          |
| 0.1401        | 8.0   | 16968 | 0.1154          | 17105760          |
| 0.0458        | 9.0   | 19089 | 0.1282          | 19246048          |
| 0.0556        | 10.0  | 21210 | 0.1424          | 21382880          |
| 0.0288        | 11.0  | 23331 | 0.1400          | 23522528          |
| 0.0099        | 12.0  | 25452 | 0.1645          | 25662176          |
| 0.0396        | 13.0  | 27573 | 0.1945          | 27797760          |
| 0.0069        | 14.0  | 29694 | 0.2178          | 29933184          |
| 0.0016        | 15.0  | 31815 | 0.2434          | 32075552          |
| 0.0139        | 16.0  | 33936 | 0.2448          | 34216384          |
| 0.0047        | 17.0  | 36057 | 0.2532          | 36358080          |
| 0.0008        | 18.0  | 38178 | 0.2880          | 38498720          |
| 0.0752        | 19.0  | 40299 | 0.2973          | 40635680          |
| 0.0016        | 20.0  | 42420 | 0.2996          | 42773120          |
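The validation loss in the table bottoms out early (0.1049 at epoch 3) and climbs steadily afterward, a typical sign of overfitting under continued training. A short snippet to locate the best epoch, with the losses transcribed from the table:

```python
# Validation losses per epoch, transcribed from the training-results table.
val_loss = [0.1216, 0.1526, 0.1049, 0.1088, 0.1078, 0.1136, 0.1130,
            0.1154, 0.1282, 0.1424, 0.1400, 0.1645, 0.1945, 0.2178,
            0.2434, 0.2448, 0.2532, 0.2880, 0.2973, 0.2996]

# Index of the minimum loss, converted to a 1-based epoch number.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(best_epoch, val_loss[best_epoch - 1])  # → 3 0.1049
```

If checkpoints per epoch are available, the epoch-3 checkpoint would be the natural choice for deployment.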

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
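Since this repository is a PEFT adapter rather than full model weights, loading it means applying the adapter on top of the base model. A minimal sketch with the library versions listed above, assuming access to the gated meta-llama base weights (this cannot run without them):

```python
# Sketch: load the PEFT adapter on top of the base model.
# Assumes `peft` and `transformers` are installed and the gated
# meta-llama/Meta-Llama-3-8B-Instruct weights are accessible.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
base = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base, "rbelanec/train_boolq_42_1760756333")
tokenizer = AutoTokenizer.from_pretrained(base_id)
```

For inference-only use, `model.merge_and_unload()` folds the adapter weights into the base model so no PEFT wrapper is needed at serving time.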
