train_codealpacapy_123_1762561812

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4750
  • Num Input Tokens Seen: 24,941,912

Model description

More information needed

Intended uses & limitations

More information needed
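In the absence of official usage notes, here is a minimal, hedged sketch of loading this LoRA adapter on top of the base model with transformers and PEFT. The adapter repo id is taken from this card's title; access to the gated Meta-Llama-3 weights and a Hugging Face token are assumed, and the `device_map="auto"` placement is an illustrative choice, not something this card specifies.

```python
# Sketch only: loading the fine-tuned adapter onto the base model.
# Assumes transformers + peft are installed and the gated base weights
# are accessible; downloading ~8B parameters requires substantial RAM/VRAM.
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_codealpacapy_123_1762561812"


def load_adapter(device_map="auto"):
    # Imports are local so the constants above can be inspected
    # without transformers/peft installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map=device_map)
    # Attach the fine-tuned LoRA weights to the frozen base model.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    return tokenizer, model
```

Calling `tokenizer.apply_chat_template` on the instruct model's chat format before generation would be the usual next step.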

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
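The learning-rate settings above (base rate 5e-05, cosine schedule, warmup ratio 0.1) can be sketched as a small pure-Python function. The total step count of 38,160 comes from the results table below; the function itself is an illustrative reimplementation of the standard warmup-then-cosine shape, not the exact code used in training.

```python
import math


def lr_at(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Cosine schedule with linear warmup, mirroring the settings above."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 10% of steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# With 38,160 total steps, warmup ends at step 3,816, where the
# learning rate peaks at 5e-05 before decaying to 0.
```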

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.6523        | 1.0   | 1908  | 0.5841          | 1248304           |
| 0.4842        | 2.0   | 3816  | 0.5255          | 2497016           |
| 0.4395        | 3.0   | 5724  | 0.5037          | 3742552           |
| 0.4343        | 4.0   | 7632  | 0.4937          | 4985200           |
| 0.4944        | 5.0   | 9540  | 0.4878          | 6233920           |
| 0.483         | 6.0   | 11448 | 0.4841          | 7478504           |
| 0.4485        | 7.0   | 13356 | 0.4816          | 8722744           |
| 0.7736        | 8.0   | 15264 | 0.4795          | 9977520           |
| 0.471         | 9.0   | 17172 | 0.4779          | 11225416          |
| 0.6523        | 10.0  | 19080 | 0.4767          | 12472912          |
| 0.5687        | 11.0  | 20988 | 0.4762          | 13721824          |
| 0.4424        | 12.0  | 22896 | 0.4755          | 14970528          |
| 0.5622        | 13.0  | 24804 | 0.4754          | 16220808          |
| 0.3655        | 14.0  | 26712 | 0.4759          | 17464792          |
| 0.3788        | 15.0  | 28620 | 0.4752          | 18706976          |
| 0.3818        | 16.0  | 30528 | 0.4753          | 19956544          |
| 0.4117        | 17.0  | 32436 | 0.4751          | 21204416          |
| 0.4239        | 18.0  | 34344 | 0.4751          | 22451928          |
| 0.3768        | 19.0  | 36252 | 0.4752          | 23696296          |
| 0.614         | 20.0  | 38160 | 0.4750          | 24941912          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1