---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - lntuning
  - generated_from_trainer
model-index:
  - name: train_codealpacapy_1754507521
    results: []
---

# train_codealpacapy_1754507521

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the codealpacapy dataset. It achieves the following results on the evaluation set:

- Loss: 0.4833
- Num Input Tokens Seen: 12472912
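
Because this is a PEFT adapter, it loads on top of the base model rather than replacing it. Below is a minimal usage sketch, assuming the adapter is published at the Hub repo `rbelanec/train_codealpacapy_1754507521` (a repo id inferred from the model name, not stated in this card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_1754507521"  # assumed Hub repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# The base model is instruction-tuned, so prompt through its chat template.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```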

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 123
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 10.0
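
For reference, these settings correspond roughly to the following `transformers` `TrainingArguments`. This is a sketch, not the exact llama-factory invocation; `output_dir` is a placeholder and the PEFT/dataset options are omitted:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",         # placeholder, not taken from this card
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",   # cosine decay after warmup
    warmup_ratio=0.1,             # first 10% of steps are warmup
    num_train_epochs=10.0,
)
```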

### Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.6575 | 0.5 | 954 | 0.6278 | 616992 |
| 0.6347 | 1.0 | 1908 | 0.5514 | 1248304 |
| 0.5265 | 1.5 | 2862 | 0.5263 | 1877040 |
| 0.4717 | 2.0 | 3816 | 0.5128 | 2497016 |
| 0.6225 | 2.5 | 4770 | 0.5046 | 3129368 |
| 0.4337 | 3.0 | 5724 | 0.4989 | 3742552 |
| 0.4868 | 3.5 | 6678 | 0.4947 | 4361944 |
| 0.4298 | 4.0 | 7632 | 0.4916 | 4985200 |
| 0.5116 | 4.5 | 8586 | 0.4893 | 5611760 |
| 0.494 | 5.0 | 9540 | 0.4879 | 6233920 |
| 0.4493 | 5.5 | 10494 | 0.4866 | 6849184 |
| 0.4835 | 6.0 | 11448 | 0.4856 | 7478504 |
| 0.3614 | 6.5 | 12402 | 0.4851 | 8083560 |
| 0.4523 | 7.0 | 13356 | 0.4842 | 8722744 |
| 0.4158 | 7.5 | 14310 | 0.4838 | 9345976 |
| 0.7811 | 8.0 | 15264 | 0.4836 | 9977520 |
| 0.2548 | 8.5 | 16218 | 0.4834 | 10604656 |
| 0.4824 | 9.0 | 17172 | 0.4833 | 11225416 |
| 0.3894 | 9.5 | 18126 | 0.4835 | 11845704 |
| 0.6659 | 10.0 | 19080 | 0.4833 | 12472912 |

### Framework versions

- PEFT 0.15.2
- Transformers 4.51.3
- PyTorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1
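
To recreate this environment, the versions above can be pinned directly (a requirements sketch; the `+cu128` PyTorch build additionally needs the matching CUDA wheel index):

```text
peft==0.15.2
transformers==4.51.3
torch==2.8.0
datasets==3.6.0
tokenizers==0.21.1
```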