train_boolq_42_1776331558

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1885
  • Num Input Tokens Seen: 12333600

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-06
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
0.2277 0.2507 266 0.2505 618432
0.2193 0.5014 532 0.3166 1225408
0.2554 0.7521 798 0.2179 1851072
0.3676 1.0028 1064 0.1885 2475808
0.165 1.2535 1330 0.4608 3091552
0.2207 1.5042 1596 0.3545 3699104
0.1138 1.7549 1862 0.3500 4324256
0.0762 2.0057 2128 0.3345 4940992
0.0898 2.2564 2394 0.4647 5558144
0.0692 2.5071 2660 0.4098 6183872
0.227 2.7578 2926 0.4303 6806208
0.0004 3.0085 3192 0.3937 7421856
0.0 3.2592 3458 0.5191 8043744
0.0002 3.5099 3724 0.4636 8660768
0.0 3.7606 3990 0.5201 9286304
0.0001 4.0113 4256 0.5146 9894624
0.0782 4.2620 4522 0.5548 10512416
0.0 4.5127 4788 0.5418 11115040
0.0 4.7634 5054 0.5422 11736672

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.10.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Downloads last month
3
Safetensors
Model size
1B params
Tensor type
F32
·
Inference Providers NEW
Input a message to start chatting with rbelanec/train_boolq_42_1776331558.

Model tree for rbelanec/train_boolq_42_1776331558

Finetuned
(1747)
this model