---
library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - llama-factory
  - prefix-tuning
  - generated_from_trainer
model-index:
  - name: train_codealpacapy_42_1760623957
    results: []
---

train_codealpacapy_42_1760623957

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5999
  • Num Input Tokens Seen: 22157104
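A minimal usage sketch for loading this prefix-tuning adapter on top of the base model with PEFT. The adapter repo id `rbelanec/train_codealpacapy_42_1760623957` is inferred from the model name above and may differ; access to the gated Llama 3 base weights is required, so treat this as an untested illustration.

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_42_1760623957"  # assumed repo id

# Load the base model, then attach the prefix-tuning adapter on top of it.
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype="auto", device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)

# The dataset is code-generation oriented, so a coding prompt is a natural test.
messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```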

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
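The schedule above (cosine decay with a 10% linear warmup) can be sketched as a plain function. The total step count of 33920 is taken from the final row of the training log below; decay to a minimum learning rate of 0 is an assumption, matching the default behavior of cosine schedulers in Transformers.

```python
import math

def cosine_lr_with_warmup(step, total_steps=33920, base_lr=1e-05, warmup_ratio=0.1):
    """Cosine learning-rate schedule with linear warmup.

    total_steps (33920) is read off the training log; the schedule is
    assumed to decay to 0 by the final step.
    """
    warmup_steps = int(total_steps * warmup_ratio)  # 3392 steps here
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Cosine decay from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

The peak learning rate (1e-05) is reached at step 3392, which coincides with the end of epoch 2 in the results table.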

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|---------------|-------|-------|-----------------|-------------------|
| 0.5247        | 2.0   | 3392  | 0.5012          | 2222792           |
| 0.6073        | 4.0   | 6784  | 0.4833          | 4435584           |
| 0.4532        | 6.0   | 10176 | 0.4830          | 6643816           |
| 0.4371        | 8.0   | 13568 | 0.4921          | 8867608           |
| 0.3561        | 10.0  | 16960 | 0.5087          | 11077216          |
| 0.4704        | 12.0  | 20352 | 0.5304          | 13294344          |
| 0.3772        | 14.0  | 23744 | 0.5604          | 15504440          |
| 0.4784        | 16.0  | 27136 | 0.5757          | 17721880          |
| 0.3258        | 18.0  | 30528 | 0.5942          | 19940536          |
| 0.3676        | 20.0  | 33920 | 0.5999          | 22157104          |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1