train_cola_1755694493

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset (CoLA, the Corpus of Linguistic Acceptability). It achieves the following results on the evaluation set:

  • Loss: 0.3498
  • Num Input Tokens Seen: 3465288
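Since this repository holds a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, it can be loaded with the peft library. Below is a minimal loading sketch; the prompt template is an assumption, as the card does not document how CoLA examples were formatted during training.

```python
# Minimal loading sketch, assuming this repo holds a PEFT adapter for
# meta-llama/Meta-Llama-3-8B-Instruct. The prompt below is hypothetical --
# the card does not document the template used for CoLA examples.
import torch
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

adapter_id = "rbelanec/train_cola_1755694493"
base_id = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Hypothetical acceptability-judgment prompt.
prompt = "Is this sentence grammatically acceptable? Sentence: The book was read by her."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```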

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
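
For reference, these values map onto Hugging Face TrainingArguments roughly as sketched below. This is an assumption that the standard Trainer was used; output_dir is a placeholder, and model/dataset setup is omitted.

```python
# Hypothetical reconstruction of the run configuration from the values above,
# assuming the standard Hugging Face Trainer. output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_cola_1755694493",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```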

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.2119        | 0.5   | 1924  | 0.2508          | 173872            |
| 0.1252        | 1.0   | 3848  | 0.2795          | 346872            |
| 0.2905        | 1.5   | 5772  | 0.2591          | 520296            |
| 0.31          | 2.0   | 7696  | 0.2402          | 693752            |
| 0.243         | 2.5   | 9620  | 0.2488          | 867416            |
| 0.2176        | 3.0   | 11544 | 0.2401          | 1040128           |
| 0.2172        | 3.5   | 13468 | 0.2428          | 1212976           |
| 0.2667        | 4.0   | 15392 | 0.2426          | 1386696           |
| 0.2669        | 4.5   | 17316 | 0.2381          | 1559896           |
| 0.2104        | 5.0   | 19240 | 0.2482          | 1733072           |
| 0.2037        | 5.5   | 21164 | 0.2389          | 1906160           |
| 0.1723        | 6.0   | 23088 | 0.2377          | 2079640           |
| 0.147         | 6.5   | 25012 | 0.2382          | 2253000           |
| 0.3044        | 7.0   | 26936 | 0.2424          | 2425920           |
| 0.3173        | 7.5   | 28860 | 0.2561          | 2598960           |
| 0.2224        | 8.0   | 30784 | 0.2512          | 2772144           |
| 0.1814        | 8.5   | 32708 | 0.3283          | 2944864           |
| 0.2271        | 9.0   | 34632 | 0.3048          | 3118472           |
| 0.1103        | 9.5   | 36556 | 0.3464          | 3291720           |
| 0.2212        | 10.0  | 38480 | 0.3498          | 3465288           |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1