library_name: peft
license: llama3
base_model: meta-llama/Meta-Llama-3-8B-Instruct
tags:
  - base_model:adapter:meta-llama/Meta-Llama-3-8B-Instruct
  - llama-factory
  - transformers
pipeline_tag: text-generation
model-index:
  - name: train_copa_456_1760637759
    results: []

train_copa_456_1760637759

This model is a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, fine-tuned on the copa dataset. At the end of training (epoch 20, step 1600) it achieves the following results on the evaluation set:

Loss: 1.0934
Num Input Tokens Seen: 501440
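
Since the metadata lists `library_name: peft` and an adapter tag for the base model, these weights would normally be loaded on top of the base model rather than on their own. The following is a minimal sketch, not a verified recipe. It assumes `peft`, `transformers`, and `accelerate` are installed and that you have access to the base model on the Hub. `ADAPTER_REPO` is a hypothetical placeholder, because this card does not state the repository namespace.

```python
BASE_MODEL = "meta-llama/Meta-Llama-3-8B-Instruct"  # from this card's metadata
# Hypothetical placeholder: the card does not state the repository namespace.
ADAPTER_REPO = "<namespace>/train_copa_456_1760637759"


def load_adapter(adapter_repo: str = ADAPTER_REPO, base_model: str = BASE_MODEL):
    """Load the base model and attach the PEFT adapter weights to it."""
    # Imported lazily so the heavy dependencies are only needed when called.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    base = AutoModelForCausalLM.from_pretrained(
        base_model, torch_dtype="auto", device_map="auto"
    )
    # Wrap the base model with the fine-tuned adapter weights.
    model = PeftModel.from_pretrained(base, adapter_repo)
    model.eval()
    return tokenizer, model
```

Calling `load_adapter("your-namespace/train_copa_456_1760637759")` with the real repository id would return a tokenizer and a model ready for text generation.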

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 4
eval_batch_size: 4
seed: 456
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
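
To make the scheduler settings concrete, here is a rough sketch of the learning rate implied by `lr_scheduler_type: cosine` with `lr_scheduler_warmup_ratio: 0.1`. It assumes the usual linear-warmup-then-half-cosine shape and takes the total step count (1600) from the last row of the training results table. The function name `lr_at` is illustrative and not part of any library.

```python
import math

PEAK_LR = 1e-05       # learning_rate from this card
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from this card
TOTAL_STEPS = 1600    # assumption: final step in the training results table


def lr_at(step: int, total_steps: int = TOTAL_STEPS) -> float:
    """Approximate learning rate at a given optimizer step."""
    warmup_steps = math.ceil(total_steps * WARMUP_RATIO)
    if step < warmup_steps:
        # Linear warmup from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, warmup_steps)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 80, 160, 880, 1600):
        print(step, lr_at(step))
```

Under these assumptions, warmup covers the first 160 steps (the first two epochs in the results table), and the rate then decays towards zero at step 1600.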

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.2344	2.0	160	0.2388	50080
0.2424	4.0	320	0.2363	100096
0.226	6.0	480	0.2417	150400
0.203	8.0	640	0.2646	200704
0.2158	10.0	800	0.3807	250592
0.013	12.0	960	0.5269	300832
0.0174	14.0	1120	0.8448	350976
0.0003	16.0	1280	0.9965	401184
0.0002	18.0	1440	1.0806	451328
0.0003	20.0	1600	1.0934	501440
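
The table shows validation loss bottoming out early and then rising while training loss falls towards zero, a pattern that usually indicates overfitting. The sketch below copies the rows from the table and picks out the evaluation point with the lowest validation loss. The helper name `best_checkpoint` is illustrative, and whether a checkpoint was actually saved at that step is not stated in this card.

```python
# Rows copied from the training results table:
# (training loss, epoch, step, validation loss, input tokens seen)
ROWS = [
    (0.2344, 2.0, 160, 0.2388, 50080),
    (0.2424, 4.0, 320, 0.2363, 100096),
    (0.2260, 6.0, 480, 0.2417, 150400),
    (0.2030, 8.0, 640, 0.2646, 200704),
    (0.2158, 10.0, 800, 0.3807, 250592),
    (0.0130, 12.0, 960, 0.5269, 300832),
    (0.0174, 14.0, 1120, 0.8448, 350976),
    (0.0003, 16.0, 1280, 0.9965, 401184),
    (0.0002, 18.0, 1440, 1.0806, 451328),
    (0.0003, 20.0, 1600, 1.0934, 501440),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss (column index 3)."""
    return min(rows, key=lambda row: row[3])


if __name__ == "__main__":
    best, final = best_checkpoint(ROWS), ROWS[-1]
    print(f"best:  epoch {best[1]:g}, step {best[2]}, validation loss {best[3]}")
    print(f"final: epoch {final[1]:g}, step {final[2]}, validation loss {final[3]}")
```

On these rows the minimum validation loss is 0.2363 at epoch 4 (step 320), compared with 1.0934 at the final step, so an earlier evaluation point may generalize better than the end-of-training weights.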

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
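
To compare a local environment against the versions listed above, a small standard-library check along these lines may help. It is only a sketch: the distribution names (for example `torch` for Pytorch) are the usual PyPI names rather than something this card specifies, and the comparison ignores local build tags such as `+cu128`.

```python
from importlib import metadata

# Versions listed in this card, keyed by the usual PyPI distribution name.
EXPECTED = {
    "peft": "0.17.1",
    "transformers": "4.51.3",
    "torch": "2.9.0+cu128",
    "datasets": "4.0.0",
    "tokenizers": "0.21.4",
}


def check_versions(expected=EXPECTED):
    """Map each package to (expected, installed or None, versions match)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Compare public versions only, ignoring local tags like "+cu128".
        same = installed is not None and (
            installed.split("+")[0] == wanted.split("+")[0]
        )
        report[name] = (wanted, installed, same)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, same) in check_versions().items():
        print(f"{name}: expected {wanted}, installed {installed}, match={same}")
```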