train_copa_1757340277

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the COPA dataset. It achieves the following results on the evaluation set (a loading sketch follows these results):

  • Loss: 0.0768
  • Num Input Tokens Seen: 281312
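Since this repository holds a PEFT adapter rather than full model weights, it is loaded on top of the base model. The following is a minimal sketch, not part of the original card; it assumes the adapter repo id rbelanec/train_copa_1757340277, access to the gated base model, and sufficient GPU memory.

```python
# Minimal sketch of loading this PEFT adapter onto the base model.
# Assumes access to meta-llama/Meta-Llama-3-8B-Instruct (gated) and
# enough GPU memory; not taken from the original card.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_1757340277"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()
```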

Model description

This repository contains a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the COPA dataset. No further details about the adapter configuration have been provided.

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned and evaluated on COPA (Choice of Plausible Alternatives), a causal commonsense reasoning benchmark in which a system must pick which of two alternatives is the more plausible cause or effect of a given premise (an illustrative item follows below). Details of the exact split and preprocessing have not been provided.
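For illustration only, a COPA item in the SuperGLUE format looks like the following; this is the canonical example from the benchmark, not from this model's training data, and the prompt template actually used for fine-tuning is not documented here.

```python
# Illustrative COPA example in the SuperGLUE format; this specific item is
# the canonical example from the COPA benchmark, not from this card.
copa_item = {
    "premise": "The man broke his toe.",
    "choice1": "He got a hole in his sock.",
    "choice2": "He dropped a hammer on his foot.",
    "question": "cause",  # ask for the more plausible CAUSE of the premise
    "label": 1,           # index of the correct choice (0 -> choice1, 1 -> choice2)
}
```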

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
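As a hedged illustration (not the original training script), the list above maps onto transformers.TrainingArguments roughly as follows; output_dir and any logging or evaluation cadence are assumptions.

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above; only the
# values shown in the hyperparameter list are taken from the card.
training_args = TrainingArguments(
    output_dir="train_copa_1757340277",  # assumed; not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```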

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.081         | 0.5   | 45   | 0.1215          | 14144             |
| 0.3339        | 1.0   | 90   | 0.1037          | 28192             |
| 0.0463        | 1.5   | 135  | 0.0964          | 42208             |
| 0.082         | 2.0   | 180  | 0.0777          | 56256             |
| 0.2678        | 2.5   | 225  | 0.0822          | 70368             |
| 0.0878        | 3.0   | 270  | 0.0920          | 84320             |
| 0.1495        | 3.5   | 315  | 0.1532          | 98400             |
| 0.0075        | 4.0   | 360  | 0.0768          | 112416            |
| 0.429         | 4.5   | 405  | 0.1562          | 126496            |
| 0.0002        | 5.0   | 450  | 0.1207          | 140544            |
| 0.0092        | 5.5   | 495  | 0.1345          | 154592            |
| 0.002         | 6.0   | 540  | 0.1524          | 168768            |
| 0.0064        | 6.5   | 585  | 0.1678          | 182848            |
| 0.0449        | 7.0   | 630  | 0.1447          | 196896            |
| 0.1323        | 7.5   | 675  | 0.1635          | 210912            |
| 0.0001        | 8.0   | 720  | 0.2237          | 225024            |
| 0.0211        | 8.5   | 765  | 0.2088          | 239200            |
| 0.0121        | 9.0   | 810  | 0.2073          | 253152            |
| 0.0034        | 9.5   | 855  | 0.2088          | 267040            |
| 0.1445        | 10.0  | 900  | 0.2092          | 281312            |

The best validation loss (0.0768, the evaluation result reported above) was reached at epoch 4.0 (step 360); validation loss trends upward at later checkpoints, suggesting the model begins to overfit beyond that point.

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1