train_conala_123_1760637668

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6767
  • Num Input Tokens Seen: 3047552
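
Since the card lists PEFT among the framework versions and the hosting page labels this repository as an adapter, loading it would typically mean loading the base model first and then attaching the adapter weights. The following is a minimal sketch under those assumptions, not a confirmed recipe from the author: the helper name `load_model_with_adapter` is hypothetical, the adapter repository id is the one shown in the page's model tree entry, and the adapter type (LoRA or otherwise) is not stated in this card.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
# Assumption: repository id as it appears in the hosting page's model tree entry.
ADAPTER_ID = "rbelanec/train_conala_123_1760637668"


def load_model_with_adapter(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the fine-tuned adapter on top of it.

    Hypothetical helper for illustration; it is not part of the original card.
    """
    # Imported lazily so that defining the helper does not require the heavy
    # dependencies (or a network connection) until it is actually called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto")
    # PeftModel.from_pretrained reads the adapter config and weights from the repo.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling `tokenizer, model = load_model_with_adapter()` downloads both repositories. The base model is gated, so this presumably requires an authenticated Hugging Face account with access to the Llama 3 weights.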

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
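
To make the scheduler settings concrete, the sketch below computes the learning rate at a given optimizer step for a cosine schedule with linear warmup. It assumes the common formulation (linear ramp from 0 to the peak rate, then a half-cosine decay to 0) and takes the total of 10720 steps from the last row of the training results; the function name `lr_at` is hypothetical, and the exact schedule used during training may differ in detail.

```python
import math

LEARNING_RATE = 5e-05   # learning_rate from the list above
WARMUP_RATIO = 0.1      # lr_scheduler_warmup_ratio from the list above
TOTAL_STEPS = 10720     # final value of the Step column in the training results


def lr_at(step: int,
          total_steps: int = TOTAL_STEPS,
          base_lr: float = LEARNING_RATE,
          warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step`, assuming linear warmup then half-cosine decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 536, 1072, 5896, 10720):
        print(step, lr_at(step))
```

Under these assumptions the warmup covers the first 1072 steps, which corresponds to the first two epochs at 536 steps per epoch.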

Training results

Training Loss   Epoch   Step    Validation Loss   Input Tokens Seen
2.7358          1.0     536     2.1967            152672
1.261           2.0     1072    1.0879            305288
0.758           3.0     1608    0.9051            457952
0.5597          4.0     2144    0.8311            610944
1.1423          5.0     2680    0.7915            762440
0.7895          6.0     3216    0.7657            914920
0.5906          7.0     3752    0.7451            1067520
0.7893          8.0     4288    0.7291            1220200
0.919           9.0     4824    0.7161            1372560
0.4873          10.0    5360    0.7068            1524216
0.8047          11.0    5896    0.6995            1675880
0.7024          12.0    6432    0.6918            1828344
0.845           13.0    6968    0.6872            1980376
0.5402          14.0    7504    0.6830            2132544
0.5934          15.0    8040    0.6805            2284440
0.5969          16.0    8576    0.6789            2436520
0.4848          17.0    9112    0.6784            2589096
0.7806          18.0    9648    0.6770            2741936
0.6403          19.0    10184   0.6768            2894976
0.8095          20.0    10720   0.6767            3047552
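
The validation-loss column can be summarized programmatically. The sketch below copies the epoch, step, validation loss, and token counts from the table and reports the best epoch, whether the loss decreased at every epoch, how much was gained over the last five epochs, and the perplexity implied by the final loss. It assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state explicitly; the helper name `summarize` is hypothetical.

```python
import math

# (epoch, step, validation_loss, input_tokens_seen), copied from the table above.
RESULTS = [
    (1, 536, 2.1967, 152672), (2, 1072, 1.0879, 305288),
    (3, 1608, 0.9051, 457952), (4, 2144, 0.8311, 610944),
    (5, 2680, 0.7915, 762440), (6, 3216, 0.7657, 914920),
    (7, 3752, 0.7451, 1067520), (8, 4288, 0.7291, 1220200),
    (9, 4824, 0.7161, 1372560), (10, 5360, 0.7068, 1524216),
    (11, 5896, 0.6995, 1675880), (12, 6432, 0.6918, 1828344),
    (13, 6968, 0.6872, 1980376), (14, 7504, 0.6830, 2132544),
    (15, 8040, 0.6805, 2284440), (16, 8576, 0.6789, 2436520),
    (17, 9112, 0.6784, 2589096), (18, 9648, 0.6770, 2741936),
    (19, 10184, 0.6768, 2894976), (20, 10720, 0.6767, 3047552),
]


def summarize(results=RESULTS):
    """Return summary statistics for the validation-loss curve."""
    losses = [row[2] for row in results]
    best_epoch, _, best_loss, _ = min(results, key=lambda row: row[2])
    return {
        "best_epoch": best_epoch,
        "best_loss": best_loss,
        # True if every epoch improved on the previous one.
        "strictly_decreasing": all(a > b for a, b in zip(losses, losses[1:])),
        # Improvement between the 6th-from-last and the last evaluation.
        "gain_last_5_epochs": losses[-6] - losses[-1],
        # Assumption: loss is mean per-token cross-entropy in nats.
        "final_perplexity": math.exp(losses[-1]),
        "steps_per_epoch": results[0][1],
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

In this table the validation loss improves at every epoch, but the improvement over the final five epochs is small compared with the early epochs, so a shorter run would likely have reached a similar evaluation loss.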

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
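
When reproducing results it can help to check the local environment against the versions listed above. The following is a small sketch using only the standard library; it assumes the usual PyPI distribution names (`peft`, `transformers`, `torch`, `datasets`, `tokenizers`), and the helper name `version_mismatches` is hypothetical.

```python
from importlib import metadata

# Versions listed in this card, keyed by their assumed PyPI distribution names.
PINNED = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}


def version_mismatches(pinned=PINNED):
    """Map each package whose installed version differs to (expected, installed).

    A package that is not installed is reported with installed=None.
    """
    mismatches = {}
    for package, expected in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != expected:
            mismatches[package] = (expected, installed)
    return mismatches


if __name__ == "__main__":
    for package, (expected, installed) in version_mismatches().items():
        print(f"{package}: expected {expected}, found {installed}")
```

An empty result means the environment matches the card exactly. A mismatch does not necessarily prevent the adapter from loading, so treat the output as a hint rather than a hard requirement.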

Model tree for rbelanec/train_conala_123_1760637668: adapter (this model)