train_codealpacapy_42_1760719585

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4525
  • Num input tokens seen: 24,887,720

Model description

More information needed

Intended uses & limitations

More information needed
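
Pending fuller documentation, the snippet below is a minimal inference sketch, not an official usage example. It assumes the adapter weights live at rbelanec/train_codealpacapy_42_1760719585 (the repo id for this card), that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct base model, and that the Llama 3 chat template matches the formatting used during fine-tuning; the prompt is purely illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
# Adapter repo id taken from this card; adjust if you host the weights elsewhere.
adapter_id = "rbelanec/train_codealpacapy_42_1760719585"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
# Attach the PEFT adapter on top of the frozen base weights.
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# The chat template is an assumption; the card does not document the prompt format.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```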

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a Trainer configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
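
As a configuration sketch only, the block below restates the hyperparameters above using the Hugging Face Trainer API. Dataset loading, the PEFT/LoRA setup, and prompt formatting are omitted because the card does not specify them; the output_dir is assumed to match the model name.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_codealpacapy_42_1760719585",  # assumed; matches the model name
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```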

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.5784        | 1.0   | 1908  | 0.5529          | 1243048           |
| 0.3809        | 2.0   | 3816  | 0.4983          | 2489456           |
| 0.4314        | 3.0   | 5724  | 0.4791          | 3733736           |
| 0.5146        | 4.0   | 7632  | 0.4709          | 4976128           |
| 0.5703        | 5.0   | 9540  | 0.4652          | 6219592           |
| 0.616         | 6.0   | 11448 | 0.4614          | 7467968           |
| 0.487         | 7.0   | 13356 | 0.4585          | 8709168           |
| 0.4804        | 8.0   | 15264 | 0.4569          | 9958360           |
| 0.4684        | 9.0   | 17172 | 0.4552          | 11204000          |
| 0.4914        | 10.0  | 19080 | 0.4542          | 12446408          |
| 0.3796        | 11.0  | 20988 | 0.4539          | 13691904          |
| 0.4404        | 12.0  | 22896 | 0.4535          | 14937216          |
| 0.6154        | 13.0  | 24804 | 0.4534          | 16179624          |
| 0.4073        | 14.0  | 26712 | 0.4528          | 17425368          |
| 0.3595        | 15.0  | 28620 | 0.4529          | 18668536          |
| 0.4605        | 16.0  | 30528 | 0.4528          | 19916008          |
| 0.2781        | 17.0  | 32436 | 0.4528          | 21158120          |
| 0.4609        | 18.0  | 34344 | 0.4528          | 22400368          |
| 0.4204        | 19.0  | 36252 | 0.4525          | 23645440          |
| 0.3394        | 20.0  | 38160 | 0.4527          | 24887720          |
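
Validation loss improves steadily over the first ten epochs and then plateaus near 0.453, reaching its minimum of 0.4525 at epoch 19.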

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1