---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
- llama-factory
- ia3
- generated_from_trainer
model-index:
- name: train_codealpacapy_123_1762572064
  results: []
---

# train_codealpacapy_123_1762572064

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

- Loss: 0.4916
- Num Input Tokens Seen: 24941912
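The `ia3` tag indicates the adapter was trained with IA3 (Infused Adapter by Inhibiting and Amplifying Inner Activations), which freezes the base model's weights and learns only small vectors that rescale intermediate activations elementwise. A minimal pure-Python sketch of that idea (an illustration of the technique, not the PEFT internals):

```python
def ia3_scale(activations, scaling_vector):
    """Elementwise rescaling h' = l * h, one learned scale per hidden dim."""
    assert len(activations) == len(scaling_vector)
    return [a * s for a, s in zip(activations, scaling_vector)]

# IA3 scaling vectors are initialized to ones, so before training the
# adapted model reproduces the frozen base model exactly.
hidden = [0.5, -1.2, 3.0]
print(ia3_scale(hidden, [1.0, 1.0, 1.0]))  # → [0.5, -1.2, 3.0]

# After training, the learned scales amplify or inhibit each dimension.
print(ia3_scale(hidden, [1.1, 0.9, 1.0]))
```

Because only these vectors are trained, the adapter checkpoint is tiny compared to the 8B-parameter base model.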

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 123
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
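A cosine schedule with `warmup_ratio: 0.1` warms the learning rate up linearly over the first 10% of steps, then decays it along a half cosine to zero. A hedged sketch of that shape (following the form of Transformers' `get_cosine_schedule_with_warmup`; the step counts are taken from the results table below, 1908 steps per epoch for 38160 total):

```python
import math

BASE_LR = 5e-05
TOTAL_STEPS = 38160                      # 20 epochs x 1908 steps (per the results table)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)    # warmup_ratio 0.1 -> 3816 steps

def lr_at(step):
    """Learning rate at a given optimizer step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# Starts at 0, peaks at BASE_LR when warmup ends, decays to ~0 at the end.
print(lr_at(0), lr_at(WARMUP_STEPS), lr_at(TOTAL_STEPS))
```

The exact schedule used in training is whatever LLaMA-Factory configures from these hyperparameters; this snippet only visualizes the stated `cosine` + warmup combination.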

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.7537        | 1.0   | 1908  | 0.7072          | 1248304           |
| 0.5438        | 2.0   | 3816  | 0.5793          | 2497016           |
| 0.4952        | 3.0   | 5724  | 0.5439          | 3742552           |
| 0.4696        | 4.0   | 7632  | 0.5270          | 4985200           |
| 0.5107        | 5.0   | 9540  | 0.5173          | 6233920           |
| 0.5222        | 6.0   | 11448 | 0.5106          | 7478504           |
| 0.4821        | 7.0   | 13356 | 0.5058          | 8722744           |
| 0.8077        | 8.0   | 15264 | 0.5022          | 9977520           |
| 0.5135        | 9.0   | 17172 | 0.4996          | 11225416          |
| 0.6831        | 10.0  | 19080 | 0.4971          | 12472912          |
| 0.5967        | 11.0  | 20988 | 0.4956          | 13721824          |
| 0.4765        | 12.0  | 22896 | 0.4947          | 14970528          |
| 0.6152        | 13.0  | 24804 | 0.4937          | 16220808          |
| 0.4111        | 14.0  | 26712 | 0.4929          | 17464792          |
| 0.4217        | 15.0  | 28620 | 0.4923          | 18706976          |
| 0.4192        | 16.0  | 30528 | 0.4920          | 19956544          |
| 0.4315        | 17.0  | 32436 | 0.4919          | 21204416          |
| 0.4564        | 18.0  | 34344 | 0.4918          | 22451928          |
| 0.4151        | 19.0  | 36252 | 0.4916          | 23696296          |
| 0.6589        | 20.0  | 38160 | 0.4917          | 24941912          |
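Note that validation loss bottoms out at epoch 19 (0.4916, matching the headline number above) and ticks up slightly at epoch 20. A small sketch of selecting the best checkpoint from the final rows of the table:

```python
# Eval losses for the last epochs, copied from the results table above.
eval_loss = {17: 0.4919, 18: 0.4918, 19: 0.4916, 20: 0.4917}

# Pick the epoch whose checkpoint minimizes validation loss.
best_epoch = min(eval_loss, key=eval_loss.get)
print(best_epoch, eval_loss[best_epoch])  # → 19 0.4916
```

Whether the uploaded adapter is the epoch-19 or epoch-20 checkpoint is not stated in this card.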

### Framework versions

- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1
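To reproduce or run this adapter, pinning the versions above is the safest bet. A hedged sketch (the CUDA 12.8 wheel index URL is the standard PyTorch pattern, but adjust it for your platform):

```shell
# Pin the library versions listed above.
pip install "peft==0.15.2" "transformers==4.51.3" "datasets==3.6.0" "tokenizers==0.21.1"

# PyTorch 2.8.0 built against CUDA 12.8, matching the 2.8.0+cu128 version string.
pip install "torch==2.8.0" --index-url https://download.pytorch.org/whl/cu128
```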