train_conala_101112_1760638007

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 1.3805
  • Num Input Tokens Seen: 3060208
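
Since this is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, it can be loaded on top of the base model with the peft library. The snippet below is a minimal loading sketch, not a recorded usage recipe: the dtype, device placement, and example prompt are assumptions, not settings documented in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_conala_101112_1760638007"

tokenizer = AutoTokenizer.from_pretrained(base_id)
# bfloat16 + device_map="auto" are assumptions for fitting an 8B model on GPU.
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the fine-tuned adapter weights to the frozen base model.
model = PeftModel.from_pretrained(model, adapter_id)

# Illustrative prompt; CoNaLa pairs natural-language intents with Python snippets.
messages = [{"role": "user", "content": "How do I reverse a list in Python?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(
    inputs, max_new_tokens=128, pad_token_id=tokenizer.eos_token_id
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```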

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch in code follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 101112
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
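
The values above map directly onto transformers.TrainingArguments. The sketch below reproduces only the listed settings; the PEFT configuration, dataset preprocessing, and output path are not recorded in this card, so those parts are assumptions.

```python
from transformers import TrainingArguments

# Reproduction sketch of the listed hyperparameters only. The PEFT method
# (e.g., LoRA rank and target modules) is not recorded in this card.
args = TrainingArguments(
    output_dir="train_conala_101112_1760638007",  # assumed output path
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=101112,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```

A learning rate of 1e-3 would be high for full fine-tuning of an 8B model, but it is a common choice for parameter-efficient adapters, consistent with the PEFT version listed under framework versions.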

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.6438        | 1.0   | 536   | 0.6843          | 153344            |
| 1.0496        | 2.0   | 1072  | 0.6309          | 306640            |
| 0.7643        | 3.0   | 1608  | 0.6312          | 459376            |
| 0.5342        | 4.0   | 2144  | 0.6151          | 612008            |
| 0.5146        | 5.0   | 2680  | 0.6183          | 764936            |
| 0.5258        | 6.0   | 3216  | 0.6066          | 917624            |
| 0.4711        | 7.0   | 3752  | 0.6022          | 1070488           |
| 0.3759        | 8.0   | 4288  | 0.6057          | 1223384           |
| 0.4567        | 9.0   | 4824  | 0.6206          | 1376240           |
| 0.2378        | 10.0  | 5360  | 0.6316          | 1529640           |
| 0.5057        | 11.0  | 5896  | 0.6305          | 1682336           |
| 0.2978        | 12.0  | 6432  | 0.6605          | 1835928           |
| 0.4438        | 13.0  | 6968  | 0.6747          | 1989136           |
| 0.3008        | 14.0  | 7504  | 0.7078          | 2142632           |
| 0.3878        | 15.0  | 8040  | 0.7390          | 2295280           |
| 0.2786        | 16.0  | 8576  | 0.7537          | 2447904           |
| 0.3253        | 17.0  | 9112  | 0.7935          | 2600776           |
| 0.2002        | 18.0  | 9648  | 0.8137          | 2753536           |
| 0.2772        | 19.0  | 10184 | 0.8314          | 2906984           |
| 0.2039        | 20.0  | 10720 | 0.8332          | 3060208           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4