train_copa_123_1760637645

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2329
  • Num Input Tokens Seen: 563328
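
Since this repository contains a PEFT adapter (see Framework versions below), a minimal loading sketch follows, assuming the adapter is published on the Hub as rbelanec/train_copa_123_1760637645 and that you have access to the gated base model. The COPA-style prompt format is illustrative only; the template used during training is not documented in this card.

```python
# Minimal sketch: load the base model with this adapter applied via PEFT.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "rbelanec/train_copa_123_1760637645"
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id, torch_dtype=torch.bfloat16, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# COPA-style query (illustrative prompt format, not confirmed by the card).
prompt = (
    "The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```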

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
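
While the card does not document the data pipeline, COPA (Choice of Plausible Alternatives) is most commonly loaded from the SuperGLUE benchmark. A minimal sketch, assuming the Hugging Face datasets library and the super_glue/copa configuration (an assumption, since the exact source is unspecified):

```python
# Hedged sketch: load COPA from SuperGLUE. Whether this matches the data
# actually used to train this model is an assumption, not confirmed by the card.
from datasets import load_dataset

copa = load_dataset("super_glue", "copa")  # splits: train / validation / test
print(copa["train"][0])
# e.g. {'premise': ..., 'choice1': ..., 'choice2': ..., 'question': 'cause'/'effect', 'label': 0 or 1}
```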

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a hedged TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
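
For reference, these settings map roughly onto transformers' TrainingArguments as sketched below. Only the values listed above are grounded in the card; output_dir is an illustrative assumption.

```python
# Hedged reconstruction of the hyperparameters above as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_copa_123_1760637645",  # assumption: not named in the card
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```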

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.2863        | 1.0   | 90   | 0.2360          | 28096             |
| 0.2354        | 2.0   | 180  | 0.2333          | 56128             |
| 0.2629        | 3.0   | 270  | 0.2331          | 84352             |
| 0.2408        | 4.0   | 360  | 0.2304          | 112576            |
| 0.2297        | 5.0   | 450  | 0.2377          | 140832            |
| 0.2320        | 6.0   | 540  | 0.2334          | 169056            |
| 0.2277        | 7.0   | 630  | 0.2333          | 197344            |
| 0.2353        | 8.0   | 720  | 0.2349          | 225536            |
| 0.2399        | 9.0   | 810  | 0.2331          | 253696            |
| 0.2439        | 10.0  | 900  | 0.2344          | 281856            |
| 0.2264        | 11.0  | 990  | 0.2322          | 310080            |
| 0.2338        | 12.0  | 1080 | 0.2341          | 338144            |
| 0.2345        | 13.0  | 1170 | 0.2311          | 366336            |
| 0.2287        | 14.0  | 1260 | 0.2354          | 394464            |
| 0.2284        | 15.0  | 1350 | 0.2347          | 422592            |
| 0.2262        | 16.0  | 1440 | 0.2310          | 450624            |
| 0.2316        | 17.0  | 1530 | 0.2371          | 478720            |
| 0.2338        | 18.0  | 1620 | 0.2334          | 507008            |
| 0.2265        | 19.0  | 1710 | 0.2336          | 535136            |
| 0.2275        | 20.0  | 1800 | 0.2341          | 563328            |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4