train_cola_456_1760637817

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the CoLA (Corpus of Linguistic Acceptability) dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1791
  • Num Input Tokens Seen: 7334376
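A minimal sketch of loading the adapter for inference, assuming the adapter is hosted under the repo id rbelanec/train_cola_456_1760637817 (as listed in this card) and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct base model. The prompt format used during fine-tuning is not documented here, so the prompt below is illustrative only:

```python
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_cola_456_1760637817"  # PEFT adapter repo from this card

# AutoPeftModelForCausalLM reads the adapter config to locate and load the base
# model, then attaches the adapter weights on top of it.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# CoLA is a grammatical-acceptability task; this prompt is an illustrative guess,
# not the format used during training.
prompt = "Is the following sentence grammatically acceptable? 'The boy quickly ran.'"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```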

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
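
As a point of reference, here is a minimal sketch of how these values map onto transformers.TrainingArguments. The actual training script is not included in this card, and only fields with a direct equivalent are shown; output_dir is assumed:

```python
from transformers import TrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = TrainingArguments(
    output_dir="train_cola_456_1760637817",  # assumed, not stated in the card
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",          # betas=(0.9, 0.999) and epsilon=1e-08 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```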

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.2715        | 1.0   | 1924  | 0.1828          | 366712            |
| 0.2248        | 2.0   | 3848  | 0.1504          | 734016            |
| 0.06          | 3.0   | 5772  | 0.1474          | 1100824           |
| 0.0987        | 4.0   | 7696  | 0.1370          | 1467248           |
| 0.1496        | 5.0   | 9620  | 0.1377          | 1834568           |
| 0.1048        | 6.0   | 11544 | 0.1368          | 2201464           |
| 0.0998        | 7.0   | 13468 | 0.1327          | 2568040           |
| 0.0692        | 8.0   | 15392 | 0.1429          | 2934360           |
| 0.1352        | 9.0   | 17316 | 0.1388          | 3301448           |
| 0.1646        | 10.0  | 19240 | 0.1534          | 3668312           |
| 0.0392        | 11.0  | 21164 | 0.1525          | 4034856           |
| 0.0098        | 12.0  | 23088 | 0.1952          | 4401344           |
| 0.0435        | 13.0  | 25012 | 0.2015          | 4767736           |
| 0.0112        | 14.0  | 26936 | 0.2036          | 5134344           |
| 0.0063        | 15.0  | 28860 | 0.2176          | 5501408           |
| 0.0082        | 16.0  | 30784 | 0.2322          | 5867920           |
| 0.0041        | 17.0  | 32708 | 0.2507          | 6234920           |
| 0.0038        | 18.0  | 34632 | 0.2496          | 6601944           |
| 0.0029        | 19.0  | 36556 | 0.2597          | 6968096           |
| 0.0039        | 20.0  | 38480 | 0.2629          | 7334376           |
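
The step counts above also pin down the shape of the learning-rate schedule: 1924 optimizer steps per epoch for 20 epochs gives 38480 total steps, and a warmup ratio of 0.1 means the first 3848 steps are linear warmup. A minimal sketch using the standard transformers helper (the card does not show how the scheduler was actually constructed):

```python
import torch
from torch.optim import AdamW
from transformers import get_cosine_schedule_with_warmup

# Placeholder parameter so the optimizer can be built in isolation.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = AdamW(params, lr=1e-3, betas=(0.9, 0.999), eps=1e-8)

total_steps = 38480                    # 1924 steps/epoch x 20 epochs (from the table)
warmup_steps = int(0.1 * total_steps)  # 3848 warmup steps
scheduler = get_cosine_schedule_with_warmup(optimizer, warmup_steps, total_steps)
```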

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4