train_cola_789_1768397605

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2024
  • Num Input Tokens Seen: 3459744

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
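The cosine schedule with 10% warmup listed above can be sketched in plain Python. This is a minimal sketch, not the exact implementation used in training: the total step count of 38480 is taken from the results table below, and the library's actual scheduler may differ in edge-case handling.

```python
import math

# Values from the hyperparameter list; TOTAL_STEPS comes from the
# training-results table (10 epochs x 3848 steps per epoch).
PEAK_LR = 5e-5
TOTAL_STEPS = 38480
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # lr_scheduler_warmup_ratio: 0.1 -> 3848 steps

def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay toward 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(WARMUP_STEPS))  # peak learning rate: 5e-05
```

The learning rate rises linearly for the first 3848 steps, peaks at 5e-05, and then follows a half-cosine down to 0 by the final step.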

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.0563        | 0.5   | 1924  | 0.2546          | 172704            |
| 0.3442        | 1.0   | 3848  | 0.2024          | 345704            |
| 0.3025        | 1.5   | 5772  | 0.2383          | 518760            |
| 0.0646        | 2.0   | 7696  | 0.2359          | 691408            |
| 0.0418        | 2.5   | 9620  | 0.2663          | 865168            |
| 0.6154        | 3.0   | 11544 | 0.2419          | 1037864           |
| 0.0014        | 3.5   | 13468 | 0.2601          | 1210568           |
| 0.0008        | 4.0   | 15392 | 0.2400          | 1383872           |
| 0.0015        | 4.5   | 17316 | 0.2624          | 1557648           |
| 0.0007        | 5.0   | 19240 | 0.2528          | 1729688           |
| 0.1866        | 5.5   | 21164 | 0.2502          | 1902632           |
| 0.0013        | 6.0   | 23088 | 0.2731          | 2075456           |
| 0.1663        | 6.5   | 25012 | 0.2795          | 2248320           |
| 0.0026        | 7.0   | 26936 | 0.2680          | 2421448           |
| 0.167         | 7.5   | 28860 | 0.2922          | 2594888           |
| 0.1602        | 8.0   | 30784 | 0.3015          | 2767560           |
| 0.4744        | 8.5   | 32708 | 0.2975          | 2940312           |
| 0.0008        | 9.0   | 34632 | 0.3114          | 3113600           |
| 0.0021        | 9.5   | 36556 | 0.3069          | 3286752           |
| 0.5644        | 10.0  | 38480 | 0.3069          | 3459744           |
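A quick sanity check on the table above: the reported evaluation loss of 0.2024 matches the epoch-1.0 checkpoint, which is the minimum validation loss over the run (validation loss drifts upward afterward), presumably reflecting best-checkpoint selection. The epoch-to-loss pairs below are copied directly from the table.

```python
# Validation losses from the training-results table: epoch -> eval loss.
val_loss = {
    0.5: 0.2546, 1.0: 0.2024, 1.5: 0.2383, 2.0: 0.2359, 2.5: 0.2663,
    3.0: 0.2419, 3.5: 0.2601, 4.0: 0.2400, 4.5: 0.2624, 5.0: 0.2528,
    5.5: 0.2502, 6.0: 0.2731, 6.5: 0.2795, 7.0: 0.2680, 7.5: 0.2922,
    8.0: 0.3015, 8.5: 0.2975, 9.0: 0.3114, 9.5: 0.3069, 10.0: 0.3069,
}

# Find the checkpoint with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # 1.0 0.2024
```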

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Model tree for rbelanec/train_cola_789_1768397605

  • Adapter of meta-llama/Meta-Llama-3-8B-Instruct (this model)