train_siqa_789_1760637943

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the siqa (Social IQa) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5495
  • Num Input Tokens Seen: 60282336
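
Since this checkpoint is a PEFT adapter rather than a full model (see the framework versions below), it is loaded on top of the base model. A minimal loading sketch, assuming the adapter repo id rbelanec/train_siqa_789_1760637943 and access to the gated Meta-Llama-3 weights:

```python
# Minimal sketch: attach the PEFT adapter to the base Llama-3 model.
# Assumes access to meta-llama/Meta-Llama-3-8B-Instruct (gated) and that
# the adapter is published as rbelanec/train_siqa_789_1760637943.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_siqa_789_1760637943"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()  # generation then proceeds via the usual model.generate(...)
```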

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
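
These settings map onto a Hugging Face TrainingArguments configuration roughly as sketched below; the exact training script, dataset preprocessing, and PEFT settings are not recorded in this card, so treat the snippet as a reconstruction under assumptions, not the original code.

```python
# Approximate reconstruction of the hyperparameters listed above.
# output_dir is a placeholder; the PEFT config and data pipeline are unknown.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_siqa_789_1760637943",  # hypothetical
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=789,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```

A learning rate of 0.03 would be unusually high for full fine-tuning but is common for prompt-tuning-style PEFT methods, which is consistent with this being an adapter checkpoint.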

Training results

| Training Loss | Epoch | Step   | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:------:|:---------------:|:-----------------:|
| 0.5418        | 1.0   | 7518   | 0.5501          | 3013096           |
| 0.5492        | 2.0   | 15036  | 0.5515          | 6027776           |
| 0.5499        | 3.0   | 22554  | 0.5495          | 9041456           |
| 0.5554        | 4.0   | 30072  | 0.5500          | 12057104          |
| 0.5545        | 5.0   | 37590  | 0.5499          | 15069560          |
| 0.5543        | 6.0   | 45108  | 0.5510          | 18083408          |
| 0.5609        | 7.0   | 52626  | 0.5501          | 21096352          |
| 0.5463        | 8.0   | 60144  | 0.5508          | 24110848          |
| 0.5496        | 9.0   | 67662  | 0.5497          | 27125432          |
| 0.5499        | 10.0  | 75180  | 0.5496          | 30138560          |
| 0.5500        | 11.0  | 82698  | 0.5499          | 33152696          |
| 0.5514        | 12.0  | 90216  | 0.5496          | 36168144          |
| 0.5551        | 13.0  | 97734  | 0.5498          | 39182840          |
| 0.5562        | 14.0  | 105252 | 0.5495          | 42195440          |
| 0.5482        | 15.0  | 112770 | 0.5497          | 45209672          |
| 0.5476        | 16.0  | 120288 | 0.5498          | 48221640          |
| 0.5509        | 17.0  | 127806 | 0.5496          | 51235560          |
| 0.5542        | 18.0  | 135324 | 0.5499          | 54251480          |
| 0.5479        | 19.0  | 142842 | 0.5496          | 57267064          |
| 0.5519        | 20.0  | 150360 | 0.5500          | 60282336          |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
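
To check that a local environment matches these pins, a quick sketch (assuming the standard PyPI package names):

```python
# Print the installed versions of the libraries listed above.
import importlib.metadata as md

for pkg in ("peft", "transformers", "torch", "datasets", "tokenizers"):
    print(f"{pkg}=={md.version(pkg)}")
```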