train_cola_42_1763998305

This model is a fine-tuned version of meta-llama/Llama-3.2-1B-Instruct on the cola dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2561
  • Num Input Tokens Seen: 3463336

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10
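
The schedule above (cosine decay with a 10% linear warmup) can be sketched in plain Python. This is an illustrative reimplementation, not the library code; `lr_at_step` is a hypothetical helper, and the step counts come from the results table below (3848 optimizer steps per epoch, 10 epochs):

```python
import math

def lr_at_step(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: learning rate rises linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: cosine curve from base_lr down to 0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 38480  # 10 epochs * 3848 steps per epoch
print(lr_at_step(0, total))      # start of warmup: 0.0
print(lr_at_step(3848, total))   # end of warmup: peak learning rate 5e-05
print(lr_at_step(38480, total))  # end of training: decayed to ~0
```

With warmup_ratio 0.1, the peak learning rate of 5e-05 is reached at step 3848 (the end of epoch 1), after which it decays along the cosine curve for the remaining nine epochs.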

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.474         | 0.5   | 1924  | 0.4129          | 173168            |
| 0.1377        | 1.0   | 3848  | 0.2851          | 346040            |
| 0.3359        | 1.5   | 5772  | 0.2766          | 518984            |
| 0.0066        | 2.0   | 7696  | 0.3507          | 692368            |
| 0.0329        | 2.5   | 9620  | 0.2888          | 866112            |
| 0.1656        | 3.0   | 11544 | 0.2561          | 1039080           |
| 0.0726        | 3.5   | 13468 | 0.2908          | 1212120           |
| 0.3968        | 4.0   | 15392 | 0.2934          | 1385192           |
| 0.5147        | 4.5   | 17316 | 0.3050          | 1558248           |
| 0.223         | 5.0   | 19240 | 0.2613          | 1731824           |
| 0.2536        | 5.5   | 21164 | 0.3119          | 1904960           |
| 0.5499        | 6.0   | 23088 | 0.3149          | 2078408           |
| 0.3069        | 6.5   | 25012 | 0.2976          | 2251848           |
| 0.3753        | 7.0   | 26936 | 0.2983          | 2424592           |
| 0.2871        | 7.5   | 28860 | 0.3098          | 2597104           |
| 0.3502        | 8.0   | 30784 | 0.3049          | 2770768           |
| 0.4523        | 8.5   | 32708 | 0.3036          | 2944224           |
| 0.2483        | 9.0   | 34632 | 0.3079          | 3117120           |
| 0.0411        | 9.5   | 36556 | 0.3086          | 3290224           |
| 0.0015        | 10.0  | 38480 | 0.3065          | 3463336           |
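
Note that the headline evaluation loss of 0.2561 is the best validation loss reached during training (at epoch 3.0), not the final one; validation loss drifts upward afterwards, ending at 0.3065. A quick check over the table's values confirms this:

```python
# (epoch, validation loss) pairs copied from the results table above
eval_history = [
    (0.5, 0.4129), (1.0, 0.2851), (1.5, 0.2766), (2.0, 0.3507),
    (2.5, 0.2888), (3.0, 0.2561), (3.5, 0.2908), (4.0, 0.2934),
    (4.5, 0.3050), (5.0, 0.2613), (5.5, 0.3119), (6.0, 0.3149),
    (6.5, 0.2976), (7.0, 0.2983), (7.5, 0.3098), (8.0, 0.3049),
    (8.5, 0.3036), (9.0, 0.3079), (9.5, 0.3086), (10.0, 0.3065),
]

# Pick the checkpoint with the lowest validation loss.
best_epoch, best_loss = min(eval_history, key=lambda pair: pair[1])
print(best_epoch, best_loss)  # 3.0 0.2561
```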

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.1+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4