train_conala_101112_1760638006

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5924 (the best validation loss, reached at epoch 6 of 20; see the training results below)
  • Num Input Tokens Seen: 3060208
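Since PEFT appears in the framework versions below, the published weights are an adapter on top of the base model rather than a full checkpoint. The following is a minimal loading-and-generation sketch, assuming the adapter is hosted at rbelanec/train_conala_101112_1760638006 (the repository this card belongs to) and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct weights:

```python
# Minimal sketch: load the base model, attach the PEFT adapter, and generate.
# The adapter repo id below is taken from this card; the prompt is illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_101112_1760638006"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

# CoNaLa pairs natural-language intents with Python snippets, so a short
# "how do I ..." instruction is a representative prompt.
messages = [{"role": "user", "content": "How do I reverse a list in Python?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```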

Model description

This is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the conala dataset; no further architectural details were provided.

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned and evaluated on the conala dataset (CoNaLa: natural-language programming intents from Stack Overflow paired with short Python snippets). Split details were not provided.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
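These values map onto transformers TrainingArguments roughly as sketched below. This is a reconstruction from the list above, not the author's actual training script, and the output_dir is an assumption:

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_101112_1760638006",  # assumed
    learning_rate=0.03,             # far above typical weight fine-tuning rates;
                                    # more characteristic of prompt-style PEFT methods
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```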

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.6042        | 1.0   | 536   | 0.6518          | 153344            |
| 1.0162        | 2.0   | 1072  | 0.6180          | 306640            |
| 0.7428        | 3.0   | 1608  | 0.6011          | 459376            |
| 0.5338        | 4.0   | 2144  | 0.5957          | 612008            |
| 0.4754        | 5.0   | 2680  | 0.6104          | 764936            |
| 0.4787        | 6.0   | 3216  | 0.5924          | 917624            |
| 0.3919        | 7.0   | 3752  | 0.6076          | 1070488           |
| 0.3262        | 8.0   | 4288  | 0.6097          | 1223384           |
| 0.3219        | 9.0   | 4824  | 0.6546          | 1376240           |
| 0.1384        | 10.0  | 5360  | 0.6881          | 1529640           |
| 0.2843        | 11.0  | 5896  | 0.7035          | 1682336           |
| 0.1502        | 12.0  | 6432  | 0.7622          | 1835928           |
| 0.1751        | 13.0  | 6968  | 0.8449          | 1989136           |
| 0.1135        | 14.0  | 7504  | 0.8852          | 2142632           |
| 0.0843        | 15.0  | 8040  | 0.9656          | 2295280           |
| 0.0230        | 16.0  | 8576  | 1.0137          | 2447904           |
| 0.0671        | 17.0  | 9112  | 1.0530          | 2600776           |
| 0.0236        | 18.0  | 9648  | 1.0637          | 2753536           |
| 0.0458        | 19.0  | 10184 | 1.0654          | 2906984           |
| 0.0345        | 20.0  | 10720 | 1.0637          | 3060208           |
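Validation loss bottoms out at 0.5924 in epoch 6 and rises over the remaining 14 epochs, a classic overfitting curve; the headline loss at the top of this card matches that epoch-6 value. The card does not say how that checkpoint was selected, but a standard setup for keeping it would look like the following (an assumption, not confirmed by the card):

```python
# Hypothetical best-checkpoint selection; the card does not document this step.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_101112_1760638006",  # assumed
    eval_strategy="epoch",           # evaluate once per epoch, as in the table above
    save_strategy="epoch",           # must align with eval_strategy for best-model tracking
    load_best_model_at_end=True,     # reload the lowest-eval-loss (epoch 6) weights at the end
    metric_for_best_model="eval_loss",
    greater_is_better=False,         # lower validation loss is better
)
```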

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4