train_conala_789_1760637896

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the conala dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7424
  • Num Input Tokens Seen: 3037136
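For a more interpretable view of the evaluation loss, it can be converted to perplexity via `exp(loss)` (this is standard for cross-entropy language-modeling losses, not a number reported by the training run itself):

```python
import math

eval_loss = 2.7424  # final validation loss reported on this card
perplexity = math.exp(eval_loss)
print(f"Perplexity: {perplexity:.2f}")  # ~15.52
```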

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 789
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
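The learning-rate schedule implied by these settings can be sketched as follows. This assumes the standard linear-warmup-then-cosine-decay formulation (as in `transformers.get_cosine_schedule_with_warmup`), with the total of 10720 optimizer steps taken from the results table below; it is an illustration, not the exact training code:

```python
import math

def lr_at(step, total_steps=10720, base_lr=5e-5, warmup_ratio=0.1):
    """Linear warmup for the first warmup_ratio of training, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)  # 1072 steps here
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

print(lr_at(0))      # 0.0 (start of warmup)
print(lr_at(1072))   # 5e-05 (peak, end of warmup)
print(lr_at(10720))  # ~0.0 (end of training)
```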

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 2.9899        | 1.0   | 536   | 2.7686          | 152296            |
| 3.1267        | 2.0   | 1072  | 2.7586          | 304440            |
| 3.6797        | 3.0   | 1608  | 2.7509          | 455928            |
| 2.7123        | 4.0   | 2144  | 2.7465          | 608072            |
| 2.7728        | 5.0   | 2680  | 2.7443          | 759296            |
| 2.5596        | 6.0   | 3216  | 2.7442          | 910984            |
| 3.1077        | 7.0   | 3752  | 2.7441          | 1062816           |
| 3.0818        | 8.0   | 4288  | 2.7452          | 1214520           |
| 2.7164        | 9.0   | 4824  | 2.7428          | 1366480           |
| 2.721         | 10.0  | 5360  | 2.7432          | 1518976           |
| 2.6167        | 11.0  | 5896  | 2.7453          | 1670320           |
| 2.4633        | 12.0  | 6432  | 2.7437          | 1822624           |
| 2.4093        | 13.0  | 6968  | 2.7427          | 1974336           |
| 2.4403        | 14.0  | 7504  | 2.7441          | 2126488           |
| 3.2799        | 15.0  | 8040  | 2.7443          | 2278280           |
| 3.0543        | 16.0  | 8576  | 2.7424          | 2430272           |
| 2.6215        | 17.0  | 9112  | 2.7448          | 2581848           |
| 2.8104        | 18.0  | 9648  | 2.7450          | 2733712           |
| 3.0447        | 19.0  | 10184 | 2.7437          | 2885208           |
| 2.625         | 20.0  | 10720 | 2.7437          | 3037136           |
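The validation loss plateaus quickly and reaches its minimum (2.7424, the figure reported at the top of this card) at epoch 16. A quick sketch that picks the best checkpoint from the (epoch, validation loss) pairs in the table above:

```python
# (epoch, validation_loss) pairs copied from the results table
val_losses = [
    (1, 2.7686), (2, 2.7586), (3, 2.7509), (4, 2.7465), (5, 2.7443),
    (6, 2.7442), (7, 2.7441), (8, 2.7452), (9, 2.7428), (10, 2.7432),
    (11, 2.7453), (12, 2.7437), (13, 2.7427), (14, 2.7441), (15, 2.7443),
    (16, 2.7424), (17, 2.7448), (18, 2.7450), (19, 2.7437), (20, 2.7437),
]
best_epoch, best_loss = min(val_losses, key=lambda p: p[1])
print(best_epoch, best_loss)  # 16 2.7424
```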

Framework versions

  • PEFT 0.17.1
  • Transformers 4.51.3
  • Pytorch 2.9.0+cu128
  • Datasets 4.0.0
  • Tokenizers 0.21.4