train_codealpacapy_42_1760729930

This model is a PEFT adapter fine-tuned from meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4686
  • Num Input Tokens Seen: 24887720
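Because the checkpoint is a PEFT adapter rather than a full model, it must be loaded on top of the base model. Below is a minimal loading-and-generation sketch, assuming the adapter is hosted at rbelanec/train_codealpacapy_42_1760729930 and that you have access to the gated Llama 3 base weights; the prompt is a made-up example.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_42_1760729930"  # assumed adapter repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter

# Hypothetical usage: a coding instruction in the Llama 3 chat format.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```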

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch expressing them as TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
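For reproducibility, the settings above map onto transformers.TrainingArguments roughly as shown below. This is a hedged reconstruction, not the actual training script: the output_dir is an assumption, and the PEFT/adapter configuration is not documented on this card.

```python
from transformers import TrainingArguments

# Sketch of the listed hyperparameters as TrainingArguments
# (output_dir is assumed; the adapter config is not shown on the card).
training_args = TrainingArguments(
    output_dir="train_codealpacapy_42_1760729930",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```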

Training results

Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen
0.7231        | 1.0   | 1908  | 0.6796          | 1243048
0.4163        | 2.0   | 3816  | 0.5497          | 2489456
0.4516        | 3.0   | 5724  | 0.5161          | 3733736
0.5496        | 4.0   | 7632  | 0.5008          | 4976128
0.6014        | 5.0   | 9540  | 0.4919          | 6219592
0.6577        | 6.0   | 11448 | 0.4860          | 7467968
0.5150        | 7.0   | 13356 | 0.4815          | 8709168
0.5101        | 8.0   | 15264 | 0.4783          | 9958360
0.4895        | 9.0   | 17172 | 0.4756          | 11204000
0.5218        | 10.0  | 19080 | 0.4736          | 12446408
0.4234        | 11.0  | 20988 | 0.4722          | 13691904
0.4803        | 12.0  | 22896 | 0.4711          | 14937216
0.6942        | 13.0  | 24804 | 0.4704          | 16179624
0.4349        | 14.0  | 26712 | 0.4696          | 17425368
0.3951        | 15.0  | 28620 | 0.4692          | 18668536
0.5009        | 16.0  | 30528 | 0.4689          | 19916008
0.3044        | 17.0  | 32436 | 0.4687          | 21158120
0.5018        | 18.0  | 34344 | 0.4686          | 22400368
0.4636        | 19.0  | 36252 | 0.4686          | 23645440
0.3769        | 20.0  | 38160 | 0.4686          | 24887720

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
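To check that a local environment matches these versions, a quick sketch (assuming the packages are installed under these import names):

```python
import datasets, peft, tokenizers, torch, transformers

# Print installed versions for comparison with the list above.
for mod in (peft, transformers, torch, datasets, tokenizers):
    print(f"{mod.__name__}: {mod.__version__}")
```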