train_cola_789_1760637935

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1481
  • Num Input Tokens Seen: 7327648

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
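
The schedule implied by these hyperparameters (cosine decay with a 10% linear warmup) can be sketched in plain Python. This is an illustrative sketch, not the trainer's exact implementation; the total step count of 38480 (20 epochs × 1924 steps per epoch) is taken from the results table below.

```python
import math

def cosine_lr(step, total_steps=38480, warmup_ratio=0.1, base_lr=5e-5):
    """Cosine learning-rate schedule with linear warmup, mirroring the
    hyperparameters above (lr_scheduler_type=cosine, warmup_ratio=0.1)."""
    warmup_steps = int(total_steps * warmup_ratio)  # 3848 steps of warmup
    if step < warmup_steps:
        # Linear ramp from 0 up to the base learning rate
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# The rate peaks at 5e-05 when warmup ends, then decays toward 0:
peak = cosine_lr(3848)
final = cosine_lr(38480)
```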

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.3136        | 1.0   | 1924  | 0.1945          | 365728            |
| 0.1831        | 2.0   | 3848  | 0.1684          | 731984            |
| 0.3412        | 3.0   | 5772  | 0.1675          | 1098920           |
| 0.1039        | 4.0   | 7696  | 0.1565          | 1465464           |
| 0.0719        | 5.0   | 9620  | 0.1539          | 1831920           |
| 0.1877        | 6.0   | 11544 | 0.1524          | 2198176           |
| 0.1522        | 7.0   | 13468 | 0.1510          | 2564952           |
| 0.19          | 8.0   | 15392 | 0.1506          | 2931096           |
| 0.0875        | 9.0   | 17316 | 0.1526          | 3296808           |
| 0.1382        | 10.0  | 19240 | 0.1508          | 3663512           |
| 0.1887        | 11.0  | 21164 | 0.1481          | 4029608           |
| 0.1321        | 12.0  | 23088 | 0.1523          | 4395616           |
| 0.1217        | 13.0  | 25012 | 0.1491          | 4762456           |
| 0.0592        | 14.0  | 26936 | 0.1506          | 5128712           |
| 0.16          | 15.0  | 28860 | 0.1494          | 5495008           |
| 0.091         | 16.0  | 30784 | 0.1516          | 5861104           |
| 0.1858        | 17.0  | 32708 | 0.1507          | 6228320           |
| 0.1925        | 18.0  | 34632 | 0.1496          | 6595032           |
| 0.0826        | 19.0  | 36556 | 0.1494          | 6961416           |
| 0.0545        | 20.0  | 38480 | 0.1491          | 7327648           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4