train_cola_789_1760637930

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola (CoLA) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2552
  • Num Input Tokens Seen: 7327648
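If the reported loss is a mean per-token cross-entropy (an assumption; the card does not state the loss definition), the corresponding perplexity is simply its exponential:

```python
import math

# Assuming the evaluation loss is a mean per-token cross-entropy,
# perplexity = exp(loss).
eval_loss = 0.2552
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.4f}")  # ~1.2907
```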

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
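The cosine schedule with a 0.1 warmup ratio implied by these settings can be sketched as follows. This mirrors the formula behind transformers' get_cosine_schedule_with_warmup; the total of 38,480 steps (20 epochs × 1,924 steps/epoch) is taken from the results table below.

```python
import math

# Sketch of the LR schedule implied by the hyperparameters above:
# linear warmup over the first 10% of steps, then cosine decay to 0.
BASE_LR = 0.03
TOTAL_STEPS = 38480                    # 20 epochs x 1924 steps/epoch
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # 3848

def lr_at(step: int) -> float:
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))             # 0.0 (start of warmup)
print(lr_at(WARMUP_STEPS))  # 0.03 (peak, at the end of warmup)
print(lr_at(TOTAL_STEPS))   # 0.0 (fully decayed)
```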

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.2775        | 1.0   | 1924  | 0.2689          | 365728            |
| 0.3636        | 2.0   | 3848  | 0.2573          | 731984            |
| 0.3798        | 3.0   | 5772  | 0.2567          | 1098920           |
| 0.1577        | 4.0   | 7696  | 0.2727          | 1465464           |
| 0.2357        | 5.0   | 9620  | 0.2564          | 1831920           |
| 0.2165        | 6.0   | 11544 | 0.2563          | 2198176           |
| 0.2541        | 7.0   | 13468 | 0.2560          | 2564952           |
| 0.3109        | 8.0   | 15392 | 0.2568          | 2931096           |
| 0.2238        | 9.0   | 17316 | 0.2581          | 3296808           |
| 0.2143        | 10.0  | 19240 | 0.2566          | 3663512           |
| 0.2707        | 11.0  | 21164 | 0.2574          | 4029608           |
| 0.2783        | 12.0  | 23088 | 0.2564          | 4395616           |
| 0.2444        | 13.0  | 25012 | 0.2564          | 4762456           |
| 0.222         | 14.0  | 26936 | 0.2553          | 5128712           |
| 0.258         | 15.0  | 28860 | 0.2558          | 5495008           |
| 0.2634        | 16.0  | 30784 | 0.2564          | 5861104           |
| 0.2836        | 17.0  | 32708 | 0.2553          | 6228320           |
| 0.2879        | 18.0  | 34632 | 0.2557          | 6595032           |
| 0.2208        | 19.0  | 36556 | 0.2555          | 6961416           |
| 0.2898        | 20.0  | 38480 | 0.2552          | 7327648           |
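A quick sanity check on the numbers in the table, assuming one optimizer step per batch (the card does not mention gradient accumulation):

```python
# Consistency checks derived from the training results above
# (assumes one optimizer step per batch of 4, i.e. no gradient accumulation).
total_steps = 38480
epochs = 20
batch_size = 4
tokens_seen = 7327648

steps_per_epoch = total_steps // epochs            # 1924, matching the per-epoch step deltas
examples_per_epoch = steps_per_epoch * batch_size  # 7696 training examples per epoch
tokens_per_step = tokens_seen / total_steps        # ~190 input tokens per batch on average

print(steps_per_epoch, examples_per_epoch, round(tokens_per_step, 1))
```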

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4