train_multirc_1754502823

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the multirc dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2072
  • Num Input Tokens Seen: 132272272
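
This repository contains a PEFT adapter rather than full model weights, so it must be loaded on top of the base model. Below is a minimal loading sketch, assuming the standard PEFT adapter workflow; the repo id is taken from this model card, and access to the gated meta-llama base checkpoint is required. The dtype/device settings are illustrative defaults, not details from the original run.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the frozen base model (gated; requires accepted access on the Hub).
# device_map="auto" requires the accelerate package.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Apply the fine-tuned adapter weights from this repository on top of the base model.
model = PeftModel.from_pretrained(base, "rbelanec/train_multirc_1754502823")
model.eval()
```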

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
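
As a sketch of how these settings map onto transformers.TrainingArguments, the snippet below mirrors the list above; the output directory and the surrounding trainer wiring (model, datasets, data collator) are assumptions for illustration, not details from the original run. Instantiating TrainingArguments requires the accelerate package alongside transformers.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_multirc_1754502823",  # hypothetical path, not from the original run
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```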

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.3572        | 0.5   | 3065  | 0.2742          | 6639424           |
| 0.4476        | 1.0   | 6130  | 0.2313          | 13255424          |
| 0.1739        | 1.5   | 9195  | 0.2072          | 19871232          |
| 0.3827        | 2.0   | 12260 | 0.2300          | 26471216          |
| 0.1174        | 2.5   | 15325 | 0.2256          | 33075856          |
| 0.1551        | 3.0   | 18390 | 0.2537          | 39694112          |
| 0.1436        | 3.5   | 21455 | 0.2342          | 46313216          |
| 0.0008        | 4.0   | 24520 | 0.2358          | 52929744          |
| 0.5876        | 4.5   | 27585 | 0.2123          | 59549072          |
| 0.1874        | 5.0   | 30650 | 0.2234          | 66152480          |
| 0.3621        | 5.5   | 33715 | 0.2219          | 72765696          |
| 0.0772        | 6.0   | 36780 | 0.2299          | 79389648          |
| 0.2705        | 6.5   | 39845 | 0.2456          | 86008784          |
| 0.2328        | 7.0   | 42910 | 0.2416          | 92621824          |
| 0.2648        | 7.5   | 45975 | 0.2336          | 99237152          |
| 0.0007        | 8.0   | 49040 | 0.2341          | 105830544         |
| 0.1746        | 8.5   | 52105 | 0.2351          | 112458064         |
| 0.0891        | 9.0   | 55170 | 0.2340          | 119047920         |
| 0.486         | 9.5   | 58235 | 0.2340          | 125686064         |
| 0.269         | 10.0  | 61300 | 0.2361          | 132272272         |

The evaluation loss reported above (0.2072) matches the lowest validation loss in this table (epoch 1.5, step 9195), which suggests the best checkpoint was retained rather than the final one.

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
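
For a reproducible environment, these versions can be pinned directly; this is a sketch, and the original run used the CUDA 12.8 build of PyTorch, which is platform-specific.

```
peft==0.15.2
transformers==4.51.3
torch==2.8.0  # original run: 2.8.0+cu128 (CUDA 12.8 build)
datasets==3.6.0
tokenizers==0.21.1
```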