train_codealpacapy_1754507521

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4833
  • Num Input Tokens Seen: 12472912
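
For causal language models, the evaluation loss is the mean per-token cross-entropy, so it can be converted to perplexity by exponentiating it. A minimal sketch of that conversion for the final loss above:

```python
import math

eval_loss = 0.4833  # final validation loss reported above
perplexity = math.exp(eval_loss)  # perplexity = exp(mean cross-entropy)
print(f"{perplexity:.3f}")  # prints 1.621
```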

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
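
The learning-rate schedule implied by the settings above (cosine with warmup_ratio 0.1) can be sketched in plain Python. This is an illustrative reimplementation matching the shape of the scheduler in `transformers`, not the exact training code; the total step count of 19080 is taken from the results table below (10 epochs x 1908 steps per epoch).

```python
import math

PEAK_LR = 5e-5        # learning_rate above
TOTAL_STEPS = 19080   # 10 epochs x 1908 steps/epoch (from the results table)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)  # lr_scheduler_warmup_ratio 0.1 -> 1908 steps

def lr_at(step: int) -> float:
    """Linear warmup from 0 to PEAK_LR, then cosine decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under this schedule the learning rate peaks at 5e-05 at step 1908, passes half the peak midway through the decay, and reaches 0 at the final step.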

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.6575        | 0.5   | 954   | 0.6278          | 616992            |
| 0.6347        | 1.0   | 1908  | 0.5514          | 1248304           |
| 0.5265        | 1.5   | 2862  | 0.5263          | 1877040           |
| 0.4717        | 2.0   | 3816  | 0.5128          | 2497016           |
| 0.6225        | 2.5   | 4770  | 0.5046          | 3129368           |
| 0.4337        | 3.0   | 5724  | 0.4989          | 3742552           |
| 0.4868        | 3.5   | 6678  | 0.4947          | 4361944           |
| 0.4298        | 4.0   | 7632  | 0.4916          | 4985200           |
| 0.5116        | 4.5   | 8586  | 0.4893          | 5611760           |
| 0.494         | 5.0   | 9540  | 0.4879          | 6233920           |
| 0.4493        | 5.5   | 10494 | 0.4866          | 6849184           |
| 0.4835        | 6.0   | 11448 | 0.4856          | 7478504           |
| 0.3614        | 6.5   | 12402 | 0.4851          | 8083560           |
| 0.4523        | 7.0   | 13356 | 0.4842          | 8722744           |
| 0.4158        | 7.5   | 14310 | 0.4838          | 9345976           |
| 0.7811        | 8.0   | 15264 | 0.4836          | 9977520           |
| 0.2548        | 8.5   | 16218 | 0.4834          | 10604656          |
| 0.4824        | 9.0   | 17172 | 0.4833          | 11225416          |
| 0.3894        | 9.5   | 18126 | 0.4835          | 11845704          |
| 0.6659        | 10.0  | 19080 | 0.4833          | 12472912          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
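
Since this is a PEFT adapter on top of meta-llama/Meta-Llama-3-8B-Instruct, a minimal loading sketch looks like the following. The adapter repo id `rbelanec/train_codealpacapy_1754507521` is taken from this card; exact generation settings are not specified here and would need to be chosen by the user.

```python
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER = "rbelanec/train_codealpacapy_1754507521"

def load_model():
    # Lazy imports so the repo ids above can be inspected without the
    # heavy dependencies installed; requires transformers and peft.
    from transformers import AutoTokenizer
    from peft import AutoPeftModelForCausalLM

    # AutoPeftModelForCausalLM resolves the base model from the adapter
    # config and attaches the adapter weights on top of it.
    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoPeftModelForCausalLM.from_pretrained(ADAPTER)
    return tokenizer, model
```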
Model tree for rbelanec/train_codealpacapy_1754507521

Adapter
(2099)
this model

Evaluation results