train_copa_123_1760637648

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the COPA dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3668
  • Num Input Tokens Seen: 563328
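
Since this is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct, it can be loaded with the `peft` library. The sketch below is illustrative rather than the author's documented usage: the repo id is taken from this card, the COPA-style prompt format is an assumption (the training template is not documented here), and the gated base model requires approved access.

```python
# Minimal loading sketch (assumptions: repo id from this card, illustrative
# prompt format; the gated Llama-3 base model requires approved access).
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Loads meta-llama/Meta-Llama-3-8B-Instruct and applies this adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(
    "rbelanec/train_copa_123_1760637648",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# COPA-style cause/effect query; the exact prompt used in training is unknown.
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```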

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `TrainingArguments` follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
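
As a rough guide, the hyperparameters above map onto `transformers.TrainingArguments` as sketched below. This is a hypothetical reconstruction: the actual training script is not published on this card, and all unlisted arguments are left at their defaults.

```python
# Hypothetical mapping of the listed hyperparameters onto TrainingArguments;
# the output_dir and every unlisted setting are assumptions.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_copa_123_1760637648",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",          # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```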

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.77          | 1.0   | 90   | 0.6519          | 28096             |
| 0.5543        | 2.0   | 180  | 0.6052          | 56128             |
| 0.4842        | 3.0   | 270  | 0.5422          | 84352             |
| 0.316         | 4.0   | 360  | 0.4641          | 112576            |
| 0.3636        | 5.0   | 450  | 0.4378          | 140832            |
| 0.2814        | 6.0   | 540  | 0.4045          | 169056            |
| 0.4273        | 7.0   | 630  | 0.3931          | 197344            |
| 0.3093        | 8.0   | 720  | 0.3958          | 225536            |
| 0.3219        | 9.0   | 810  | 0.3754          | 253696            |
| 0.318         | 10.0  | 900  | 0.3762          | 281856            |
| 0.5277        | 11.0  | 990  | 0.3770          | 310080            |
| 0.4695        | 12.0  | 1080 | 0.3827          | 338144            |
| 0.5872        | 13.0  | 1170 | 0.3835          | 366336            |
| 0.3013        | 14.0  | 1260 | 0.3836          | 394464            |
| 0.3108        | 15.0  | 1350 | 0.3699          | 422592            |
| 0.2603        | 16.0  | 1440 | 0.3820          | 450624            |
| 0.3758        | 17.0  | 1530 | 0.3668          | 478720            |
| 0.267         | 18.0  | 1620 | 0.3806          | 507008            |
| 0.2433        | 19.0  | 1710 | 0.3904          | 535136            |
| 0.3311        | 20.0  | 1800 | 0.3904          | 563328            |
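
The reported evaluation loss of 0.3668 corresponds to the epoch-17 checkpoint (step 1530), the best result in the run; validation loss drifts slightly upward over the final three epochs, which suggests mild overfitting beyond that point.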

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4