train_cola_456_1760637821

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1462 (the lowest validation loss reached during training, at epoch 18; see the table below)
  • Num Input Tokens Seen: 7334376
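The PEFT version listed under Framework versions suggests this checkpoint is a PEFT adapter rather than a full fine-tune. Below is a minimal sketch of loading it for inference, assuming that is the case and using this card's repo id (rbelanec/train_cola_456_1760637821); the prompt shown is an illustrative placeholder, since the exact prompt format used in training is not documented here.

```python
# Minimal sketch: load the adapter on top of the base model (assumptions noted above).
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_cola_456_1760637821"  # this card's repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter
model.eval()

# Placeholder prompt; the training prompt format is not specified in this card.
inputs = tokenizer(
    "Is this sentence grammatically acceptable? 'The boy quickly ran.'",
    return_tensors="pt",
).to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Loading the adapter this way leaves the base weights untouched; `merge_and_unload()` can be called afterwards if a single merged model is preferred.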

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
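The training script itself is not included in this card, but as a rough sketch the settings above correspond to the following transformers TrainingArguments; output_dir is a placeholder, and any PEFT/LoRA configuration is omitted because it is not documented here.

```python
from transformers import TrainingArguments

# Rough reconstruction of the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="train_cola_456_1760637821",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,   # 10% of total steps used for warmup
    num_train_epochs=20,
)
```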

Training results

Training Loss   Epoch   Step    Validation Loss   Input Tokens Seen
0.3430          1.0     1924    0.2076            366712
0.1676          2.0     3848    0.1700            734016
0.1305          3.0     5772    0.1598            1100824
0.1038          4.0     7696    0.1545            1467248
0.1421          5.0     9620    0.1540            1834568
0.1661          6.0     11544   0.1500            2201464
0.1069          7.0     13468   0.1486            2568040
0.1344          8.0     15392   0.1488            2934360
0.1965          9.0     17316   0.1477            3301448
0.2095          10.0    19240   0.1473            3668312
0.1826          11.0    21164   0.1466            4034856
0.0752          12.0    23088   0.1464            4401344
0.1447          13.0    25012   0.1476            4767736
0.1490          14.0    26936   0.1475            5134344
0.1410          15.0    28860   0.1468            5501408
0.1746          16.0    30784   0.1468            5867920
0.1196          17.0    32708   0.1467            6234920
0.1090          18.0    34632   0.1462            6601944
0.0517          19.0    36556   0.1470            6968096
0.1629          20.0    38480   0.1465            7334376

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4