train_conala_123_1760637665

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 1.0658
  • Num Input Tokens Seen: 3047552

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
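The schedule implied by the hyperparameters above (lr_scheduler_type: cosine with warmup_ratio: 0.1) can be sketched as a plain function. This is a minimal illustration, not the Trainer's internal code: total_steps (10720 = 20 epochs × 536 steps/epoch) is read from the training-results table, and the shape assumes the usual linear-warmup-then-cosine-decay convention.

```python
import math

def lr_at_step(step, base_lr=1e-3, total_steps=10720, warmup_ratio=0.1):
    """Linear warmup followed by cosine decay to zero.

    Values mirror this run: base_lr=0.001, 10720 total steps,
    10% warmup (1072 steps).
    """
    warmup_steps = int(total_steps * warmup_ratio)  # 1072 steps here
    if step < warmup_steps:
        # Ramp linearly from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr at end of warmup to 0 at total_steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))

print(f"{lr_at_step(0):.6f}")      # start of warmup
print(f"{lr_at_step(1072):.6f}")   # peak learning rate
print(f"{lr_at_step(10720):.6f}")  # end of training
```

So the learning rate peaks at 0.001 after the first ~1072 steps (roughly epoch 2) and decays to zero by step 10720.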

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.8894        | 1.0   | 536   | 0.6439          | 152672            |
| 0.6937        | 2.0   | 1072  | 0.6180          | 305288            |
| 0.4875        | 3.0   | 1608  | 0.5986          | 457952            |
| 0.4151        | 4.0   | 2144  | 0.6029          | 610944            |
| 0.8716        | 5.0   | 2680  | 0.6069          | 762440            |
| 0.5753        | 6.0   | 3216  | 0.6013          | 914920            |
| 0.4714        | 7.0   | 3752  | 0.6204          | 1067520           |
| 0.549         | 8.0   | 4288  | 0.6045          | 1220200           |
| 0.6915        | 9.0   | 4824  | 0.6048          | 1372560           |
| 0.3567        | 10.0  | 5360  | 0.6144          | 1524216           |
| 0.5933        | 11.0  | 5896  | 0.6268          | 1675880           |
| 0.5259        | 12.0  | 6432  | 0.6402          | 1828344           |
| 0.489         | 13.0  | 6968  | 0.6632          | 1980376           |
| 0.2976        | 14.0  | 7504  | 0.6913          | 2132544           |
| 0.3224        | 15.0  | 8040  | 0.7080          | 2284440           |
| 0.2692        | 16.0  | 8576  | 0.7352          | 2436520           |
| 0.2695        | 17.0  | 9112  | 0.7561          | 2589096           |
| 0.3296        | 18.0  | 9648  | 0.7822          | 2741936           |
| 0.2217        | 19.0  | 10184 | 0.7954          | 2894976           |
| 0.2721        | 20.0  | 10720 | 0.8006          | 3047552           |
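Validation loss bottoms out at epoch 3 (0.5986) and climbs steadily afterward while training loss keeps falling, the usual overfitting signature. A quick sketch for picking the best checkpoint from the logged values (the list below is copied from the table above):

```python
# Per-epoch validation losses, copied from the training-results table.
val_losses = [0.6439, 0.6180, 0.5986, 0.6029, 0.6069, 0.6013, 0.6204,
              0.6045, 0.6048, 0.6144, 0.6268, 0.6402, 0.6632, 0.6913,
              0.7080, 0.7352, 0.7561, 0.7822, 0.7954, 0.8006]

# Epochs are 1-indexed in the table, so shift the argmin by one.
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__) + 1
print(best_epoch, val_losses[best_epoch - 1])  # → 3 0.5986
```

If early stopping or checkpoint selection is available, the epoch-3 checkpoint (step 1608) would be the one to keep by this criterion.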

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4