train_codealpacapy_42_1760705166

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset (a loading sketch follows the results list). It achieves the following results on the evaluation set:

  • Loss: 0.4436
  • Num Input Tokens Seen: 24887720
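Because the checkpoint is a PEFT adapter rather than a full model, it is loaded on top of the base checkpoint. A minimal inference sketch, assuming the adapter is hosted at rbelanec/train_codealpacapy_42_1760705166 and that you have access to the gated meta-llama base model; the prompt is purely illustrative:

```python
# Load the gated Llama 3 base model, then apply this adapter with PEFT.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_42_1760705166"  # assumed hub location

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

# Illustrative prompt; the adapter was tuned on code-instruction data.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```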

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
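For reference, a minimal sketch of an equivalent Hugging Face TrainingArguments configuration. Argument names follow transformers 4.51; output_dir and anything not in the list above are assumptions, not details recovered from the run:

```python
# Hedged reconstruction of the training configuration listed above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_codealpacapy_42_1760705166",  # assumed, not from the run
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```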

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.4698        | 1.0   | 1908  | 0.4576          | 1243048           |
| 0.3335        | 2.0   | 3816  | 0.4475          | 2489456           |
| 0.3801        | 3.0   | 5724  | 0.4436          | 3733736           |
| 0.4428        | 4.0   | 7632  | 0.4494          | 4976128           |
| 0.4908        | 5.0   | 9540  | 0.4700          | 6219592           |
| 0.3863        | 6.0   | 11448 | 0.5028          | 7467968           |
| 0.2981        | 7.0   | 13356 | 0.5452          | 8709168           |
| 0.2559        | 8.0   | 15264 | 0.6192          | 9958360           |
| 0.2353        | 9.0   | 17172 | 0.6835          | 11204000          |
| 0.1702        | 10.0  | 19080 | 0.7745          | 12446408          |
| 0.0753        | 11.0  | 20988 | 0.9030          | 13691904          |
| 0.0887        | 12.0  | 22896 | 1.0084          | 14937216          |
| 0.0529        | 13.0  | 24804 | 1.1341          | 16179624          |
| 0.0328        | 14.0  | 26712 | 1.2237          | 17425368          |
| 0.0106        | 15.0  | 28620 | 1.3184          | 18668536          |
| 0.0104        | 16.0  | 30528 | 1.4355          | 19916008          |
| 0.0065        | 17.0  | 32436 | 1.5085          | 21158120          |
| 0.0074        | 18.0  | 34344 | 1.5513          | 22400368          |
| 0.0034        | 19.0  | 36252 | 1.5742          | 23645440          |
| 0.0014        | 20.0  | 38160 | 1.5826          | 24887720          |
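Validation loss reaches its minimum of 0.4436 at epoch 3 (the value reported above) and rises steadily afterwards, a typical overfitting pattern. A sketch of how a rerun could retain the best checkpoint and stop once validation stops improving; the patience value, output_dir, and the model/dataset objects are assumptions, not details of the original run:

```python
# Sketch: keep the lowest-eval-loss checkpoint and stop early on plateau.
# Assumes `model`, `train_dataset`, and `eval_dataset` are built as in the
# original run; they are placeholders here.
from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="train_codealpacapy_42_1760705166",  # assumed
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,        # restore the lowest-eval-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    num_train_epochs=20,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    callbacks=[EarlyStoppingCallback(early_stopping_patience=3)],  # patience is an assumption
)
trainer.train()
```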

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1