# train_openbookqa_42_1760637567
This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on the openbookqa dataset. It achieves the following results on the evaluation set:
- Loss: 0.6899 (the best validation loss, reached at epoch 13; see the training results table below)
- Num Input Tokens Seen: 8500696
## Model description
More information needed
## Intended uses & limitations
More information needed
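The card does not document usage, but since it ships a PEFT adapter (see the framework versions below) on top of Meta-Llama-3-8B-Instruct, a minimal inference sketch might look like the following. The repo id is taken from the card title, and the prompt format is illustrative only, as the actual training prompt template is not documented:

```python
# Minimal inference sketch (assumptions: the repo hosts a PEFT adapter,
# and you have access to the gated Meta-Llama-3 base model).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_openbookqa_42_1760637567"  # repo id from the card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)  # attach the fine-tuned adapter

# Illustrative OpenBookQA-style prompt; the real training template is undocumented.
prompt = (
    "Which of these materials conducts electricity best?\n"
    "A) wood B) rubber C) copper D) glass\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```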
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 0.03
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
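The original training script is not included in the card. As a rough reconstruction, the values above map onto `transformers.TrainingArguments` as sketched below; the field names are the standard Transformers ones, and anything not in the list above (such as `output_dir`) is an assumption:

```python
# Hedged reconstruction of the listed hyperparameters as TrainingArguments.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_openbookqa_42_1760637567",  # assumption; not stated in the card
    learning_rate=0.03,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```

The learning rate of 0.03 is far above typical full fine-tuning values, which is consistent with a PEFT method (the card lists PEFT 0.17.1) in which only a small set of adapter parameters is trained.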
### Training results
| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|---|---|---|---|---|
| 0.736 | 1.0 | 1116 | 0.7030 | 425624 |
| 0.7591 | 2.0 | 2232 | 0.7114 | 851112 |
| 0.7271 | 3.0 | 3348 | 0.6941 | 1276656 |
| 0.6952 | 4.0 | 4464 | 0.6906 | 1701464 |
| 0.7029 | 5.0 | 5580 | 0.6903 | 2126656 |
| 0.7276 | 6.0 | 6696 | 0.6975 | 2551992 |
| 0.6915 | 7.0 | 7812 | 0.6939 | 2977616 |
| 0.7208 | 8.0 | 8928 | 0.6904 | 3402128 |
| 0.6811 | 9.0 | 10044 | 0.7024 | 3826952 |
| 0.7336 | 10.0 | 11160 | 0.7047 | 4252632 |
| 0.6846 | 11.0 | 12276 | 0.6924 | 4677504 |
| 0.6998 | 12.0 | 13392 | 0.7031 | 5103104 |
| 0.7029 | 13.0 | 14508 | 0.6899 | 5527832 |
| 0.7001 | 14.0 | 15624 | 0.6942 | 5952336 |
| 0.659 | 15.0 | 16740 | 0.6903 | 6376760 |
| 0.6999 | 16.0 | 17856 | 0.6912 | 6801440 |
| 0.6772 | 17.0 | 18972 | 0.6912 | 7226000 |
| 0.686 | 18.0 | 20088 | 0.6917 | 7651232 |
| 0.6879 | 19.0 | 21204 | 0.6906 | 8076208 |
| 0.6987 | 20.0 | 22320 | 0.6921 | 8500696 |
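As the table shows, validation loss plateaus near 0.69 from epoch 4 onward, with the minimum of 0.6899 at epoch 13. A small sketch to visualize this from the tabulated values (matplotlib assumed available):

```python
# Plot the per-epoch validation loss copied from the table above.
import matplotlib.pyplot as plt

val_loss = [
    0.7030, 0.7114, 0.6941, 0.6906, 0.6903, 0.6975, 0.6939, 0.6904, 0.7024,
    0.7047, 0.6924, 0.7031, 0.6899, 0.6942, 0.6903, 0.6912, 0.6912, 0.6917,
    0.6906, 0.6921,
]
epochs = range(1, len(val_loss) + 1)

plt.plot(epochs, val_loss, marker="o")
plt.xlabel("Epoch")
plt.ylabel("Validation loss")
plt.title("train_openbookqa_42_1760637567: validation loss per epoch")
plt.show()
```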
### Framework versions
- PEFT 0.17.1
- Transformers 4.51.3
- PyTorch 2.9.0+cu128
- Datasets 4.0.0
- Tokenizers 0.21.4