train_cola_789_1760637932

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2571
  • Num Input Tokens Seen: 7327648
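
CoLA is a binary acceptability-judgment task, so the adapter is presumably used by prompting the instruct model with a sentence and reading back a label. A minimal inference sketch, assuming a chat-style prompt (the prompt wording and label set here are illustrative assumptions, not taken from the training setup):

```python
def build_cola_prompt(sentence: str) -> str:
    # Hypothetical prompt template; the actual template used during
    # fine-tuning is not documented in this card.
    return (
        "Is the following sentence grammatically acceptable? "
        "Answer 'acceptable' or 'unacceptable'.\n"
        f"Sentence: {sentence}"
    )

if __name__ == "__main__":
    # Heavy imports kept inside the guard; requires transformers and peft,
    # plus access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
    adapter_id = "rbelanec/train_cola_789_1760637932"

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    model = AutoModelForCausalLM.from_pretrained(
        base_id, torch_dtype=torch.bfloat16, device_map="auto"
    )
    model = PeftModel.from_pretrained(model, adapter_id)  # attach the adapter
    model.eval()

    messages = [{"role": "user",
                 "content": build_cola_prompt("The boys was playing outside.")}]
    inputs = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    with torch.no_grad():
        out = model.generate(inputs, max_new_tokens=8, do_sample=False)
    print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```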

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
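
With warmup_ratio 0.1 over the 38,480 total optimizer steps (20 epochs × 1,924 steps per epoch), the schedule warms up linearly for the first 3,848 steps and then follows a cosine decay to zero. A self-contained sketch of that schedule (a plain re-implementation for illustration, not the Transformers scheduler itself):

```python
import math

def lr_at_step(step, total_steps=38480, warmup_ratio=0.1, peak_lr=1e-3):
    """Linear warmup followed by cosine decay to zero, mirroring a
    cosine scheduler with warmup_ratio * total_steps warmup steps."""
    warmup_steps = int(total_steps * warmup_ratio)  # 3848 here
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The learning rate peaks at 0.001 exactly when warmup ends (step 3,848) and reaches zero at the final step.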

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.2779        | 1.0   | 1924  | 0.2667          | 365728            |
| 0.3874        | 2.0   | 3848  | 0.2572          | 731984            |
| 0.364         | 3.0   | 5772  | 0.2676          | 1098920           |
| 0.1556        | 4.0   | 7696  | 0.2792          | 1465464           |
| 0.2329        | 5.0   | 9620  | 0.2564          | 1831920           |
| 0.2149        | 6.0   | 11544 | 0.2562          | 2198176           |
| 0.2522        | 7.0   | 13468 | 0.2556          | 2564952           |
| 0.3015        | 8.0   | 15392 | 0.2574          | 2931096           |
| 0.231         | 9.0   | 17316 | 0.2585          | 3296808           |
| 0.2284        | 10.0  | 19240 | 0.2553          | 3663512           |
| 0.2682        | 11.0  | 21164 | 0.2575          | 4029608           |
| 0.2689        | 12.0  | 23088 | 0.2563          | 4395616           |
| 0.2326        | 13.0  | 25012 | 0.2546          | 4762456           |
| 0.2378        | 14.0  | 26936 | 0.2547          | 5128712           |
| 0.2312        | 15.0  | 28860 | 0.2517          | 5495008           |
| 0.2601        | 16.0  | 30784 | 0.2483          | 5861104           |
| 0.2167        | 17.0  | 32708 | 0.2445          | 6228320           |
| 0.25          | 18.0  | 34632 | 0.2414          | 6595032           |
| 0.1732        | 19.0  | 36556 | 0.2407          | 6961416           |
| 0.2559        | 20.0  | 38480 | 0.2405          | 7327648           |
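
The step counts above imply 1,924 optimizer steps per epoch (batch size 4 over the CoLA training split), and the token counter averages roughly 190 input tokens per optimizer step. A quick sanity check on the table's arithmetic:

```python
# Figures taken from the training-results table.
total_steps = 38480
epochs = 20
total_tokens = 7327648

steps_per_epoch = total_steps // epochs        # 1924 steps per epoch
tokens_per_step = total_tokens / total_steps   # ~190 input tokens per step
tokens_per_example = tokens_per_step / 4       # batch size 4 -> ~48 per example
print(steps_per_epoch, round(tokens_per_step, 1), round(tokens_per_example, 1))
```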

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
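
To reproduce this environment, pinning the versions above is the safest route. A sketch, assuming the CUDA 12.8 PyTorch wheel from the PyTorch index (adjust for your platform):

```shell
pip install peft==0.17.1 transformers==4.51.3 datasets==4.0.0 tokenizers==0.21.4
pip install torch==2.9.0 --index-url https://download.pytorch.org/whl/cu128
```
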
Model tree for rbelanec/train_cola_789_1760637932

This model is an adapter of meta-llama/Meta-Llama-3-8B-Instruct.