train_conala_456_1760637783

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7393
  • Num Input Tokens Seen: 3043720

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20

Training results

Training Loss Epoch Step Validation Loss Input Tokens Seen
2.5981 1.0 536 2.2582 152088
0.9095 2.0 1072 1.1626 303784
0.8483 3.0 1608 0.9730 456552
0.8116 4.0 2144 0.8966 608704
0.7813 5.0 2680 0.8562 761184
1.0255 6.0 3216 0.8292 912912
0.7137 7.0 3752 0.8086 1065128
0.7207 8.0 4288 0.7931 1216496
0.9265 9.0 4824 0.7808 1368880
0.5225 10.0 5360 0.7696 1522016
0.7305 11.0 5896 0.7616 1674136
1.066 12.0 6432 0.7548 1826160
0.7107 13.0 6968 0.7501 1978984
0.5841 14.0 7504 0.7460 2130656
0.4122 15.0 8040 0.7440 2282720
0.8114 16.0 8576 0.7409 2434896
0.922 17.0 9112 0.7403 2586968
0.7926 18.0 9648 0.7401 2738448
0.4927 19.0 10184 0.7393 2891056
0.3725 20.0 10720 0.7396 3043720

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4
Downloads last month
2
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for rbelanec/train_conala_456_1760637783

Adapter
(2124)
this model

Evaluation results