train_conala_42_1760637552

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7380
  • Num Input Tokens Seen: 3049984

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
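
The schedule implied by these settings can be sanity-checked with quick arithmetic: the results table below shows 536 optimizer steps per epoch, and with a warmup ratio of 0.1 the cosine schedule warms up for the first 10% of total steps. A minimal sketch (step counts taken from the results table):

```python
# Sanity-check the learning-rate schedule implied by the hyperparameters.
# steps_per_epoch comes from the training-results table (step 536 at epoch 1.0).
steps_per_epoch = 536
num_epochs = 20

total_steps = steps_per_epoch * num_epochs
# lr_scheduler_warmup_ratio = 0.1 -> warmup covers the first 10% of steps
warmup_steps = int(0.1 * total_steps)

print(total_steps)   # 10720, matching the final step in the results table
print(warmup_steps)  # 1072, i.e. the learning rate peaks at the end of epoch 2
```

So the learning rate ramps up to 5e-05 over the first two epochs, then decays along a cosine curve for the remaining eighteen.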

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 2.7043        | 1.0   | 536   | 2.3140          | 153352            |
| 1.0339        | 2.0   | 1072  | 1.1814          | 305496            |
| 0.6569        | 3.0   | 1608  | 0.9902          | 458160            |
| 0.8537        | 4.0   | 2144  | 0.9015          | 610584            |
| 0.782         | 5.0   | 2680  | 0.8551          | 763216            |
| 0.6579        | 6.0   | 3216  | 0.8257          | 915528            |
| 0.8493        | 7.0   | 3752  | 0.8033          | 1067904           |
| 0.6719        | 8.0   | 4288  | 0.7876          | 1221016           |
| 0.7793        | 9.0   | 4824  | 0.7757          | 1373032           |
| 1.5815        | 10.0  | 5360  | 0.7658          | 1525104           |
| 0.8994        | 11.0  | 5896  | 0.7571          | 1677680           |
| 0.7936        | 12.0  | 6432  | 0.7522          | 1830200           |
| 0.6393        | 13.0  | 6968  | 0.7468          | 1982664           |
| 0.4894        | 14.0  | 7504  | 0.7442          | 2135168           |
| 0.7575        | 15.0  | 8040  | 0.7411          | 2287232           |
| 0.5909        | 16.0  | 8576  | 0.7401          | 2438992           |
| 0.5283        | 17.0  | 9112  | 0.7387          | 2591432           |
| 0.8665        | 18.0  | 9648  | 0.7380          | 2744944           |
| 0.7856        | 19.0  | 10184 | 0.7385          | 2897552           |
| 0.597         | 20.0  | 10720 | 0.7382          | 3049984           |

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
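
Since this is a PEFT adapter rather than a full checkpoint, it loads on top of the base model. A minimal usage sketch (the repo id follows this card's title; `max_new_tokens` and the prompt are illustrative):

```python
# Sketch: load the LoRA adapter on top of the base model with PEFT.
# AutoPeftModelForCausalLM resolves the base model from the adapter config.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model_id = "rbelanec/train_conala_42_1760637552"
model = AutoPeftModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Illustrative prompt in the CoNaLa style (natural-language intent -> code snippet)
inputs = tokenizer("sort a list of dicts by the key 'name'", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that the base model is gated, so access to meta-llama/Meta-Llama-3-8B-Instruct must be granted on the Hub first.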