train_conala_123_1760637667

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7170
  • Num Input Tokens Seen: 3047552
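
Since this is a PEFT adapter rather than a full checkpoint, it is loaded on top of the base model. A minimal sketch, assuming the adapter is published under rbelanec/train_conala_123_1760637667 (the repo id shown on this card) and that you have access to the gated meta-llama base weights:

```python
# Sketch: attach this PEFT adapter to the Meta-Llama-3-8B-Instruct base model.
# Repo ids are taken from this card; access to the gated base model is assumed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_123_1760637667"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # load the adapter weights

prompt = "Write Python code to reverse a list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```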

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
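
Expressed as transformers TrainingArguments, these settings correspond roughly to the following. This is a sketch only: gradient accumulation, precision, and the PEFT/LoRA configuration are not recorded on this card and are left at their defaults.

```python
# Hedged sketch of TrainingArguments matching the hyperparameters listed above.
# Settings not recorded on this card (precision, gradient accumulation,
# LoRA config) are intentionally omitted.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_123_1760637667",
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",          # AdamW with betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```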

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 3.2803        | 1.0   | 536   | 2.7423          | 152672            |
| 2.8697        | 2.0   | 1072  | 2.7333          | 305288            |
| 2.9412        | 3.0   | 1608  | 2.7242          | 457952            |
| 2.5292        | 4.0   | 2144  | 2.7203          | 610944            |
| 3.0791        | 5.0   | 2680  | 2.7200          | 762440            |
| 2.6249        | 6.0   | 3216  | 2.7178          | 914920            |
| 3.2274        | 7.0   | 3752  | 2.7170          | 1067520           |
| 3.1557        | 8.0   | 4288  | 2.7171          | 1220200           |
| 3.0671        | 9.0   | 4824  | 2.7177          | 1372560           |
| 3.5507        | 10.0  | 5360  | 2.7179          | 1524216           |
| 3.2952        | 11.0  | 5896  | 2.7186          | 1675880           |
| 2.7750        | 12.0  | 6432  | 2.7180          | 1828344           |
| 3.2292        | 13.0  | 6968  | 2.7170          | 1980376           |
| 2.4894        | 14.0  | 7504  | 2.7172          | 2132544           |
| 2.4565        | 15.0  | 8040  | 2.7187          | 2284440           |
| 2.7554        | 16.0  | 8576  | 2.7193          | 2436520           |
| 2.4938        | 17.0  | 9112  | 2.7197          | 2589096           |
| 3.0238        | 18.0  | 9648  | 2.7175          | 2741936           |
| 2.9629        | 19.0  | 10184 | 2.7172          | 2894976           |
| 3.0866        | 20.0  | 10720 | 2.7171          | 3047552           |
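
Assuming the reported loss is a mean token-level cross-entropy in nats (the usual causal-LM setup; the card does not state the reduction), the best validation loss of 2.7170 corresponds to a perplexity of roughly 15.1:

```python
# Convert the best validation loss to perplexity, assuming the loss is a
# mean token-level cross-entropy in nats (an assumption, not stated above).
import math

best_val_loss = 2.7170  # reached at epochs 7 and 13 in the table above
print(math.exp(best_val_loss))  # ≈ 15.1
```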

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4