train_cola_456_1760637816

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset (CoLA, the Corpus of Linguistic Acceptability). It achieves the following results on the evaluation set (a minimal loading sketch follows the list):

  • Loss: 0.2583
  • Number of input tokens seen: 7,334,376
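
This repository contains a PEFT adapter for the base model rather than standalone weights. The snippet below is a minimal loading sketch, assuming the adapter is published as rbelanec/train_cola_456_1760637816 and that recent peft and transformers releases are installed; the dtype and device settings are illustrative, not taken from the original run.

```python
# Minimal sketch: load the PEFT adapter on top of the Llama-3 base model.
# Assumes access to meta-llama/Meta-Llama-3-8B-Instruct (gated on the Hub).
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_cola_456_1760637816"

model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,  # dtype/device choices are illustrative
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```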

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after the list):

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
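
As a rough illustration, these settings map onto transformers TrainingArguments as sketched below. This is not the original training script: the output path is a placeholder, and any PEFT-specific configuration (adapter type, rank, target modules) is not documented in this card.

```python
from transformers import TrainingArguments

# Sketch only: the listed hyperparameters expressed as TrainingArguments.
# output_dir is a placeholder; the actual script is not published here.
args = TrainingArguments(
    output_dir="train_cola_456_1760637816",
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```

Note that a warmup ratio of 0.1 over 20 epochs (38,480 steps total) means the schedule warms up for the first 3,848 steps, i.e. roughly the first two epochs, before the cosine decay begins.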

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.2719        | 1.0   | 1924  | 0.2611          | 366712            |
| 0.2655        | 2.0   | 3848  | 0.2600          | 734016            |
| 0.2503        | 3.0   | 5772  | 0.2597          | 1100824           |
| 0.199         | 4.0   | 7696  | 0.2613          | 1467248           |
| 0.2499        | 5.0   | 9620  | 0.2612          | 1834568           |
| 0.2738        | 6.0   | 11544 | 0.2595          | 2201464           |
| 0.2477        | 7.0   | 13468 | 0.2589          | 2568040           |
| 0.1865        | 8.0   | 15392 | 0.2600          | 2934360           |
| 0.3004        | 9.0   | 17316 | 0.2593          | 3301448           |
| 0.2937        | 10.0  | 19240 | 0.2602          | 3668312           |
| 0.265         | 11.0  | 21164 | 0.2593          | 4034856           |
| 0.2789        | 12.0  | 23088 | 0.2589          | 4401344           |
| 0.2578        | 13.0  | 25012 | 0.2591          | 4767736           |
| 0.3012        | 14.0  | 26936 | 0.2584          | 5134344           |
| 0.2325        | 15.0  | 28860 | 0.2585          | 5501408           |
| 0.2727        | 16.0  | 30784 | 0.2583          | 5867920           |
| 0.2254        | 17.0  | 32708 | 0.2586          | 6234920           |
| 0.2709        | 18.0  | 34632 | 0.2587          | 6601944           |
| 0.2282        | 19.0  | 36556 | 0.2587          | 6968096           |
| 0.2455        | 20.0  | 38480 | 0.2586          | 7334376           |
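
The reported evaluation loss of 0.2583 corresponds to the epoch-16 checkpoint in the table above. CoLA is a binary grammatical-acceptability task; since the prompt template used during fine-tuning is not documented in this card, the query below is only a hedged sketch. It reuses the `model` and `tokenizer` from the loading example, and both the instruction wording and the example sentence are hypothetical.

```python
# Hypothetical prompt format -- the actual training template is unknown.
prompt = (
    "Is the following sentence grammatically acceptable? Answer yes or no.\n"
    "Sentence: The book was written by John."
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=5, do_sample=False)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```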

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4