train_conala_42_1760637549

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4588
  • Num Input Tokens Seen: 3049984
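
Since the framework versions below list PEFT, this checkpoint is presumably a PEFT adapter on top of the base model rather than full weights. A minimal loading sketch, assuming access to the gated Meta-Llama-3 base weights and the accelerate package for `device_map`:

```python
# Minimal sketch: attach this PEFT adapter to the Meta-Llama-3-8B-Instruct base.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",  # gated: requires approved access
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the accelerate package
)
model = PeftModel.from_pretrained(base, "rbelanec/train_conala_42_1760637549")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# CoNaLa pairs natural-language intents with short Python snippets,
# so a code-generation style prompt is the natural use.
inputs = tokenizer("How do I reverse a list in Python?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```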

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
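
For reference, these settings map onto transformers `TrainingArguments` roughly as sketched below. The `output_dir` is a placeholder, and anything not listed above (e.g. gradient accumulation) is assumed to keep its default:

```python
# Sketch: the listed hyperparameters expressed as transformers TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_conala_42_1760637549",  # placeholder
    learning_rate=1e-3,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```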

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|--------------:|------:|------:|----------------:|------------------:|
| 0.8366        | 1.0   | 536   | 0.6933          | 153352            |
| 0.6292        | 2.0   | 1072  | 0.6899          | 305496            |
| 0.4186        | 3.0   | 1608  | 0.6457          | 458160            |
| 0.5617        | 4.0   | 2144  | 0.6498          | 610584            |
| 0.5602        | 5.0   | 2680  | 0.6488          | 763216            |
| 0.4184        | 6.0   | 3216  | 0.6422          | 915528            |
| 0.5468        | 7.0   | 3752  | 0.6684          | 1067904           |
| 0.429         | 8.0   | 4288  | 0.6670          | 1221016           |
| 0.4701        | 9.0   | 4824  | 0.6892          | 1373032           |
| 0.9467        | 10.0  | 5360  | 0.6917          | 1525104           |
| 0.5869        | 11.0  | 5896  | 0.7297          | 1677680           |
| 0.3809        | 12.0  | 6432  | 0.7218          | 1830200           |
| 0.301         | 13.0  | 6968  | 0.7636          | 1982664           |
| 0.1776        | 14.0  | 7504  | 0.7891          | 2135168           |
| 0.3594        | 15.0  | 8040  | 0.8015          | 2287232           |
| 0.2563        | 16.0  | 8576  | 0.8285          | 2438992           |
| 0.124         | 17.0  | 9112  | 0.8634          | 2591432           |
| 0.3369        | 18.0  | 9648  | 0.8991          | 2744944           |
| 0.3073        | 19.0  | 10184 | 0.9123          | 2897552           |
| 0.2325        | 20.0  | 10720 | 0.9179          | 3049984           |
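
The validation loss bottoms out at epoch 6 (0.6422) and climbs steadily afterwards, a typical overfitting pattern over 20 epochs. For intuition, and assuming the reported losses are mean token-level cross-entropy in nats (the usual convention for these trainers), they convert to perplexities via exp(loss):

```python
# Convert reported losses to perplexity, assuming mean cross-entropy in nats.
import math

losses = {
    "best validation (epoch 6)": 0.6422,
    "final validation (epoch 20)": 0.9179,
    "held-out evaluation set": 2.4588,
}
for name, loss in losses.items():
    print(f"{name}: loss={loss:.4f}, perplexity={math.exp(loss):.2f}")
```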

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • PyTorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4