train_conala_123_1760637663

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the CoNaLa dataset. It achieves the following results on the evaluation set (a usage sketch follows the results):

  • Loss: 0.5831
  • Num Input Tokens Seen: 3047552
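
This is a PEFT adapter rather than a full checkpoint (see Framework versions below). A minimal loading sketch, assuming the adapter is published as rbelanec/train_conala_123_1760637663 and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct base weights:

```python
# A minimal loading sketch, assuming the adapter repo id
# rbelanec/train_conala_123_1760637663 and access to the gated
# meta-llama/Meta-Llama-3-8B-Instruct base weights.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_123_1760637663"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")

# Attach the fine-tuned adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base, adapter_id)

prompt = "How do I reverse a list in Python?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```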

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the equivalent TrainingArguments follows the list):

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
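
For reference, a hedged sketch of how these values map onto transformers.TrainingArguments; the output_dir is hypothetical, and the dataset wiring of the original training script is not reproduced here:

```python
# A hedged sketch of the hyperparameters above as TrainingArguments;
# output_dir is a hypothetical path, not taken from the original run.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_123_1760637663",  # hypothetical path
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```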

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.8392        | 1.0   | 536   | 0.6565          | 152672            |
| 0.6713        | 2.0   | 1072  | 0.5939          | 305288            |
| 0.4895        | 3.0   | 1608  | 0.5831          | 457952            |
| 0.3873        | 4.0   | 2144  | 0.5936          | 610944            |
| 0.8018        | 5.0   | 2680  | 0.6043          | 762440            |
| 0.4785        | 6.0   | 3216  | 0.6082          | 914920            |
| 0.3683        | 7.0   | 3752  | 0.6279          | 1067520           |
| 0.4296        | 8.0   | 4288  | 0.6276          | 1220200           |
| 0.5198        | 9.0   | 4824  | 0.6657          | 1372560           |
| 0.2031        | 10.0  | 5360  | 0.6830          | 1524216           |
| 0.2693        | 11.0  | 5896  | 0.7332          | 1675880           |
| 0.2437        | 12.0  | 6432  | 0.7569          | 1828344           |
| 0.2029        | 13.0  | 6968  | 0.8369          | 1980376           |
| 0.0736        | 14.0  | 7504  | 0.9274          | 2132544           |
| 0.0333        | 15.0  | 8040  | 1.0068          | 2284440           |
| 0.0444        | 16.0  | 8576  | 1.0613          | 2436520           |
| 0.0281        | 17.0  | 9112  | 1.0871          | 2589096           |
| 0.0283        | 18.0  | 9648  | 1.0918          | 2741936           |
| 0.0485        | 19.0  | 10184 | 1.0929          | 2894976           |
| 0.0444        | 20.0  | 10720 | 1.0931          | 3047552           |
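
Note that the validation loss reaches its minimum of 0.5831 at epoch 3 and climbs steadily afterward while the training loss keeps falling, a typical overfitting pattern; the evaluation loss reported at the top of the card corresponds to that epoch-3 checkpoint. A minimal sketch of checkpoint-selection settings consistent with this behavior, assuming the standard transformers Trainer (the patience value is illustrative, not taken from the original run):

```python
# A hedged sketch, not the original training configuration: with
# load_best_model_at_end the Trainer restores the best (epoch-3) checkpoint,
# and EarlyStoppingCallback would halt once eval_loss stops improving.
from transformers import EarlyStoppingCallback, TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_123_1760637663",  # hypothetical path
    eval_strategy="epoch",              # evaluate once per epoch, as in the table
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

# Would be passed as Trainer(..., callbacks=[stopper]) alongside the usual
# model and dataset arguments; patience of 3 epochs is illustrative.
stopper = EarlyStoppingCallback(early_stopping_patience=3)
```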

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
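
To reproduce this environment, pinning the versions above should suffice; a sketch of a requirements file (the CUDA-specific torch build typically comes from the PyTorch cu128 wheel index rather than PyPI):

```text
peft==0.17.1
transformers==4.51.3
datasets==4.0.0
tokenizers==0.21.4
torch==2.9.0  # the +cu128 build comes from the PyTorch cu128 wheel index
```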