train_conala_123_1760637666

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset (CoNaLa pairs natural-language intents with Python code snippets). It achieves the following results on the evaluation set:

  • Loss: 0.5742
  • Num Input Tokens Seen: 3047552
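
The snippet below is a minimal loading sketch, assuming this repository hosts a PEFT adapter on top of the (gated) base checkpoint; the prompt is an illustrative placeholder, not from the training data:

```python
# Minimal sketch: load the base model, attach this repo's PEFT adapter, and generate.
# Assumes access to the gated meta-llama base checkpoint has been granted.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_123_1760637666"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# CoNaLa maps natural-language intents to short Python snippets,
# so a code-oriented prompt is the natural use case (example prompt is hypothetical).
messages = [{"role": "user", "content": "How do I reverse a list in Python?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```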

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
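
These settings map onto transformers.TrainingArguments roughly as sketched below; the original training script is not included, so the output directory is a placeholder and the Adam betas/epsilon shown are the library defaults:

```python
# Hedged reconstruction of the hyperparameters above as TrainingArguments;
# "output" is a placeholder, since the actual training script was not released.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output",              # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",              # betas=(0.9, 0.999), epsilon=1e-08 are the defaults
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```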

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.8413        | 1.0   | 536   | 0.6011          | 152672            |
| 0.6224        | 2.0   | 1072  | 0.5773          | 305288            |
| 0.4317        | 3.0   | 1608  | 0.5742          | 457952            |
| 0.2788        | 4.0   | 2144  | 0.5962          | 610944            |
| 0.4801        | 5.0   | 2680  | 0.6492          | 762440            |
| 0.2656        | 6.0   | 3216  | 0.7369          | 914920            |
| 0.1983        | 7.0   | 3752  | 0.8697          | 1067520           |
| 0.1768        | 8.0   | 4288  | 0.9656          | 1220200           |
| 0.11          | 9.0   | 4824  | 1.0390          | 1372560           |
| 0.0316        | 10.0  | 5360  | 1.0801          | 1524216           |
| 0.0699        | 11.0  | 5896  | 1.2046          | 1675880           |
| 0.0556        | 12.0  | 6432  | 1.1798          | 1828344           |
| 0.0768        | 13.0  | 6968  | 1.2252          | 1980376           |
| 0.0248        | 14.0  | 7504  | 1.2389          | 2132544           |
| 0.0014        | 15.0  | 8040  | 1.3620          | 2284440           |
| 0.018         | 16.0  | 8576  | 1.4280          | 2436520           |
| 0.0004        | 17.0  | 9112  | 1.4676          | 2589096           |
| 0.0019        | 18.0  | 9648  | 1.5184          | 2741936           |
| 0.0222        | 19.0  | 10184 | 1.5373          | 2894976           |
| 0.0093        | 20.0  | 10720 | 1.5436          | 3047552           |

Validation loss reaches its minimum of 0.5742 at epoch 3 (the evaluation loss reported above) and rises steadily in later epochs, which suggests the model begins to overfit after that point; the reported input-token count is the total over all 20 epochs.

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4