train_conala_42_1760637548

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set (a loading sketch follows the metrics):

  • Loss: 0.6289
  • Num Input Tokens Seen: 3049984
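
To use the adapter, load it on top of the base model with PEFT. The following is a minimal inference sketch, assuming the adapter is published at rbelanec/train_conala_42_1760637548 (the repo this card describes) and that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct base model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_conala_42_1760637548"

tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
base = AutoModelForCausalLM.from_pretrained(
    BASE_ID, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER_ID)

# CoNaLa pairs natural-language intents with short Python snippets, so a
# code-oriented question is a natural probe. The prompt format used during
# fine-tuning is not documented on this card; the chat template of the
# Instruct base model is assumed here.
messages = [{"role": "user", "content": "How do I flatten a list of lists in Python?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=64)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```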

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list):

  • learning_rate: 0.03
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
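
The training script itself is not included on this card, so the mapping below is an assumption: a minimal sketch of how these values translate to the standard transformers TrainingArguments API.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_42_1760637548",  # assumed output directory name
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```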

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:--------------|:------|:------|:----------------|:------------------|
| 0.8156        | 1.0   | 536   | 0.6592          | 153352            |
| 0.6364        | 2.0   | 1072  | 0.6650          | 305496            |
| 0.3716        | 3.0   | 1608  | 0.6289          | 458160            |
| 0.4885        | 4.0   | 2144  | 0.6445          | 610584            |
| 0.4551        | 5.0   | 2680  | 0.6310          | 763216            |
| 0.324         | 6.0   | 3216  | 0.6467          | 915528            |
| 0.4477        | 7.0   | 3752  | 0.6731          | 1067904           |
| 0.303         | 8.0   | 4288  | 0.7115          | 1221016           |
| 0.2757        | 9.0   | 4824  | 0.7377          | 1373032           |
| 0.7593        | 10.0  | 5360  | 0.7847          | 1525104           |
| 0.2469        | 11.0  | 5896  | 0.8181          | 1677680           |
| 0.1783        | 12.0  | 6432  | 0.9106          | 1830200           |
| 0.1514        | 13.0  | 6968  | 1.0473          | 1982664           |
| 0.0633        | 14.0  | 7504  | 1.0969          | 2135168           |
| 0.0299        | 15.0  | 8040  | 1.1955          | 2287232           |
| 0.0265        | 16.0  | 8576  | 1.2349          | 2438992           |
| 0.02          | 17.0  | 9112  | 1.2547          | 2591432           |
| 0.0704        | 18.0  | 9648  | 1.2614          | 2744944           |
| 0.0371        | 19.0  | 10184 | 1.2645          | 2897552           |
| 0.0153        | 20.0  | 10720 | 1.2632          | 3049984           |
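
Validation loss reaches its minimum of 0.6289 at epoch 3 and rises steadily afterwards while training loss keeps falling, a pattern consistent with overfitting; the evaluation loss reported at the top of this card matches that epoch-3 value.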

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
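
As a convenience (not part of the original card), a small sketch that compares the installed environment against the versions above; the PyPI package names are assumed:

```python
# Compare installed package versions against the ones this card was trained with.
import importlib.metadata as md

EXPECTED = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}

for pkg, want in EXPECTED.items():
    have = md.version(pkg)
    status = "OK" if have == want else "MISMATCH"
    print(f"{pkg}: installed {have}, card lists {want} [{status}]")
```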