---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - prefix-tuning
  - generated_from_trainer
model-index:
  - name: train_codealpacapy_42_1760629239
    results: []
---

# train_codealpacapy_42_1760629239

This model is a prefix-tuning (PEFT) fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the codealpacapy dataset. It achieves the following results on the evaluation set:

- Loss: 0.5974
- Num input tokens seen: 22157104
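The card ships no usage snippet; the following is a minimal sketch of applying the adapter on top of the base model with PEFT. The adapter repo id is an assumption inferred from the model name (substitute the actual repo), and loading the gated Llama 3 base model requires accepted license terms.

```python
# Minimal sketch: load the base model, then apply this prefix-tuning adapter.
# Assumptions: the adapter repo id below is inferred from the model name and
# may differ; access to the gated Llama 3 base model is required.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_42_1760629239"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "Write a Python function that checks whether a string is a palindrome."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```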

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
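The learning-rate shape implied by `lr_scheduler_type: cosine` with a 0.1 warmup ratio can be sketched in plain Python. This is an illustrative reimplementation of linear warmup followed by cosine decay (the common shape behind Transformers' cosine scheduler), not llama-factory's own code; the total step count of 33920 is taken from the results table below.

```python
import math

def lr_at_step(step, total_steps=33920, warmup_ratio=0.1, peak_lr=1e-5):
    """Linear warmup to peak_lr over warmup_ratio * total_steps,
    then cosine decay from peak_lr down to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

With these numbers, warmup ends at step 3392 (where the LR peaks at 1e-05), which coincides with the end of epoch 2 in the table below; the LR then decays to zero by step 33920.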

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.5308        | 2.0   | 3392  | 0.5023          | 2222792           |
| 0.6077        | 4.0   | 6784  | 0.4844          | 4435584           |
| 0.4561        | 6.0   | 10176 | 0.4838          | 6643816           |
| 0.4409        | 8.0   | 13568 | 0.4914          | 8867608           |
| 0.356         | 10.0  | 16960 | 0.5062          | 11077216          |
| 0.4671        | 12.0  | 20352 | 0.5286          | 13294344          |
| 0.3736        | 14.0  | 23744 | 0.5563          | 15504440          |
| 0.4832        | 16.0  | 27136 | 0.5748          | 17721880          |
| 0.3324        | 18.0  | 30528 | 0.5915          | 19940536          |
| 0.3648        | 20.0  | 33920 | 0.5974          | 22157104          |

### Framework versions

- PEFT 0.15.2
- Transformers 4.51.3
- Pytorch 2.8.0+cu128
- Datasets 3.6.0
- Tokenizers 0.21.1