train_openbookqa_789_1760637914

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

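Loss: 0.1975 (the epoch 2 validation loss, the lowest in the training results below)
Num Input Tokens Seen: 8499784 (cumulative total after all 20 epochs)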

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 789
optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
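
To make the scheduler settings concrete, here is a minimal sketch of a cosine schedule with linear warmup, using the learning rate and warmup ratio listed above and the 22320 total steps reported in the training results. The function name `lr_at` and the rounding used for the warmup length are assumptions for illustration; the trainer's own implementation may differ in detail.

```python
import math

BASE_LR = 5e-05        # learning_rate
WARMUP_RATIO = 0.1     # lr_scheduler_warmup_ratio
TOTAL_STEPS = 22320    # final step in the training results (20 epochs x 1116 steps)

# Assumption: warmup length is the ratio applied to the total step count, rounded.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 1116, WARMUP_STEPS, 12276, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {lr_at(step):.3e}")
```

Under these assumptions the warmup covers 2232 steps, which is the first two of the 20 epochs, so the learning rate would peak at about the same point where the validation loss is lowest.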

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.5986	1.0	1116	0.2197	425448
0.0952	2.0	2232	0.1975	849792
0.0028	3.0	3348	0.2483	1274096
0.002	4.0	4464	0.2404	1699224
0.0001	5.0	5580	0.4181	2123824
0.0003	6.0	6696	0.3511	2548008
0.0002	7.0	7812	0.3808	2972312
0.0	8.0	8928	0.5220	3396920
0.0023	9.0	10044	0.4886	3822320
0.0	10.0	11160	0.6215	4247968
0.0	11.0	12276	0.4707	4672584
0.0	12.0	13392	0.5171	5097552
0.0	13.0	14508	0.5367	5522440
0.0	14.0	15624	0.5588	5947656
0.0	15.0	16740	0.5737	6373520
0.0	16.0	17856	0.5839	6798816
0.0	17.0	18972	0.5960	7223992
0.0	18.0	20088	0.6117	7649768
0.0	19.0	21204	0.6109	8074496
0.0	20.0	22320	0.6127	8499784
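
In the table above, training loss falls to roughly zero while validation loss rises after epoch 2, a pattern that is consistent with overfitting. The sketch below picks the epoch with the lowest validation loss from the values copied out of the table; the helper name `best_epoch` is hypothetical and is not part of the training code.

```python
# (epoch, validation_loss) pairs copied from the training results table.
VALIDATION_LOSS = [
    (1, 0.2197), (2, 0.1975), (3, 0.2483), (4, 0.2404), (5, 0.4181),
    (6, 0.3511), (7, 0.3808), (8, 0.5220), (9, 0.4886), (10, 0.6215),
    (11, 0.4707), (12, 0.5171), (13, 0.5367), (14, 0.5588), (15, 0.5737),
    (16, 0.5839), (17, 0.5960), (18, 0.6117), (19, 0.6109), (20, 0.6127),
]


def best_epoch(history):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(history, key=lambda pair: pair[1])


if __name__ == "__main__":
    epoch, loss = best_epoch(VALIDATION_LOSS)
    final_epoch, final_loss = VALIDATION_LOSS[-1]
    print(f"best epoch:  {epoch} (validation loss {loss:.4f})")
    print(f"final epoch: {final_epoch} (validation loss {final_loss:.4f})")
```

The minimum falls at epoch 2 with a loss of 0.1975, which is the value reported as the evaluation loss at the top of this card.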

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
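
Since PEFT appears in the framework versions and the model tree lists this checkpoint as an adapter, it can presumably be attached to the base model with `PeftModel.from_pretrained`. The following is a minimal sketch under that assumption: the function name `load_model` is hypothetical, the base model is gated and needs access approval, and the card does not document a prompt format, so no generation call is shown.

```python
BASE_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_openbookqa_789_1760637914"


def load_model():
    """Load the base model and attach the adapter (downloads weights on first call)."""
    # Imported lazily so the constants can be inspected without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_ID)
    base = AutoModelForCausalLM.from_pretrained(BASE_ID, torch_dtype="auto")
    # Assumption: the repository holds PEFT adapter weights for this base model.
    model = PeftModel.from_pretrained(base, ADAPTER_ID)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_model()` would return the tokenizer and the adapter-wrapped model in evaluation mode.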

Model tree for rbelanec/train_openbookqa_789_1760637914

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model