metadata
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - lntuning
  - generated_from_trainer
model-index:
  - name: train_codealpacapy_123_1762561812
    results: []

train_codealpacapy_123_1762561812

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4750
  • Num Input Tokens Seen: 24941912
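
Since the released weights are a PEFT adapter rather than a full model, they are loaded on top of the base model. Below is a minimal loading sketch; the adapter repository id is assumed to match the model name above (rbelanec/train_codealpacapy_123_1762561812), and access to the gated Meta-Llama-3 base weights is required.

```python
# Minimal loading sketch. Assumptions: the adapter is published at
# rbelanec/train_codealpacapy_123_1762561812, and access to the gated
# meta-llama base model has been granted.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_123_1762561812"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# Build a chat-formatted prompt and generate a completion.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```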

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an equivalent configuration sketch in code follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
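
For reference, the hyperparameters above map roughly onto the transformers/PEFT configuration sketched below. This is illustrative only: the actual run was launched with LLaMA-Factory, and output_dir is a placeholder rather than a value taken from the run configuration.

```python
# Illustrative mapping of the hyperparameters above onto a standard
# transformers/PEFT setup (the actual run used LLaMA-Factory; output_dir
# is a placeholder, not taken from the run configuration).
from transformers import TrainingArguments
from peft import LNTuningConfig

peft_config = LNTuningConfig(task_type="CAUSAL_LM")  # LN-Tuning, matching the lntuning tag

training_args = TrainingArguments(
    output_dir="train_codealpacapy_123_1762561812",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```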

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.6523        | 1.0   | 1908  | 0.5841          | 1248304           |
| 0.4842        | 2.0   | 3816  | 0.5255          | 2497016           |
| 0.4395        | 3.0   | 5724  | 0.5037          | 3742552           |
| 0.4343        | 4.0   | 7632  | 0.4937          | 4985200           |
| 0.4944        | 5.0   | 9540  | 0.4878          | 6233920           |
| 0.4830        | 6.0   | 11448 | 0.4841          | 7478504           |
| 0.4485        | 7.0   | 13356 | 0.4816          | 8722744           |
| 0.7736        | 8.0   | 15264 | 0.4795          | 9977520           |
| 0.4710        | 9.0   | 17172 | 0.4779          | 11225416          |
| 0.6523        | 10.0  | 19080 | 0.4767          | 12472912          |
| 0.5687        | 11.0  | 20988 | 0.4762          | 13721824          |
| 0.4424        | 12.0  | 22896 | 0.4755          | 14970528          |
| 0.5622        | 13.0  | 24804 | 0.4754          | 16220808          |
| 0.3655        | 14.0  | 26712 | 0.4759          | 17464792          |
| 0.3788        | 15.0  | 28620 | 0.4752          | 18706976          |
| 0.3818        | 16.0  | 30528 | 0.4753          | 19956544          |
| 0.4117        | 17.0  | 32436 | 0.4751          | 21204416          |
| 0.4239        | 18.0  | 34344 | 0.4751          | 22451928          |
| 0.3768        | 19.0  | 36252 | 0.4752          | 23696296          |
| 0.6140        | 20.0  | 38160 | 0.4750          | 24941912          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
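
To reproduce the environment, the versions above can be checked locally as sketched below; the distribution names are assumed to be the standard PyPI package names.

```python
# Quick check that the installed packages match the versions listed above
# (distribution names are assumed to be the standard PyPI names).
import importlib.metadata as md

expected = {
    "peft": "0.15.2",
    "transformers": "4.51.3",
    "torch": "2.8.0",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}
for name, version in expected.items():
    installed = md.version(name)
    status = "OK" if installed.startswith(version) else f"mismatch ({installed} installed)"
    print(f"{name}: expected {version} -> {status}")
```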