train_openbookqa_456_1760637800

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

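Loss: 0.4463 (the lowest validation loss in the training results, reached at epoch 18; the final epoch-20 value is 0.4520)
Num Input Tokens Seen: 8491296 (cumulative over all 20 epochs)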

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 4
eval_batch_size: 4
seed: 456
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
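
As a rough illustration of how lr_scheduler_type: cosine and lr_scheduler_warmup_ratio: 0.1 shape the learning rate, the sketch below reimplements a linear-warmup, cosine-decay schedule in plain Python. It assumes 22320 total optimizer steps (the last Step value in the training results), a warmup length of round(0.1 * 22320) = 2232 steps, and decay to zero at the final step. The function and constant names are hypothetical, and the exact rounding applied by the training framework may differ.

```python
import math

# Values taken from this card; TOTAL_STEPS is the last "Step" entry in the results table.
PEAK_LR = 5e-05
WARMUP_RATIO = 0.1
TOTAL_STEPS = 22320

# Assumption of this sketch: warmup length is the ratio times total steps, rounded.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (linear warmup, then half-cosine decay)."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the peak learning rate.
        return PEAK_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the post-warmup phase completed, in [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay from the peak down to 0.
    return PEAK_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 1116, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(f"step {step:>6}: lr = {lr_at(step):.3e}")
```

Under these assumptions the learning rate peaks at the end of epoch 2 (step 2232) and decays for the remaining 18 epochs.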

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.5578	1.0	1116	0.4590	424016
0.4861	2.0	2232	0.4513	848912
0.8579	3.0	3348	0.4525	1273224
0.4859	4.0	4464	0.4496	1698344
0.5608	5.0	5580	0.4504	2122168
0.3947	6.0	6696	0.4495	2547696
0.6609	7.0	7812	0.4505	2971624
0.8257	8.0	8928	0.4511	3395952
0.8134	9.0	10044	0.4520	3821456
0.3343	10.0	11160	0.4507	4245816
0.1934	11.0	12276	0.4511	4670528
0.8753	12.0	13392	0.4468	5093960
0.6569	13.0	14508	0.4491	5519312
0.7334	14.0	15624	0.4489	5944728
0.526	15.0	16740	0.4508	6369088
0.638	16.0	17856	0.4495	6792536
0.5613	17.0	18972	0.4485	7217384
0.712	18.0	20088	0.4463	7642368
0.7204	19.0	21204	0.4488	8066544
0.2551	20.0	22320	0.4520	8491296
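
The validation loss moves very little after the first epoch, so it can help to locate the best row programmatically. The following sketch copies the Validation Loss column from the table above and finds the epoch with the lowest value. The names are hypothetical, and the card does not state which checkpoint the published weights correspond to.

```python
# Validation loss per epoch, copied from the training results table (epochs 1 to 20).
VAL_LOSSES = [
    0.4590, 0.4513, 0.4525, 0.4496, 0.4504,
    0.4495, 0.4505, 0.4511, 0.4520, 0.4507,
    0.4511, 0.4468, 0.4491, 0.4489, 0.4508,
    0.4495, 0.4485, 0.4463, 0.4488, 0.4520,
]

# Each epoch in the table spans 1116 optimizer steps (Step column / Epoch column).
STEPS_PER_EPOCH = 1116

# Rebuild (epoch, step, validation_loss) rows.
ROWS = [
    (epoch, epoch * STEPS_PER_EPOCH, loss)
    for epoch, loss in enumerate(VAL_LOSSES, start=1)
]


def best_checkpoint(rows):
    """Return the (epoch, step, validation_loss) row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


if __name__ == "__main__":
    best = best_checkpoint(ROWS)
    final = ROWS[-1]
    print("best :", best)
    print("final:", final)
```

The minimum falls on epoch 18 (step 20088), which matches the headline evaluation loss of 0.4463 reported at the top of the card.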

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
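
Since PEFT is listed among the framework versions, the weights are presumably distributed as an adapter on top of the base model. A minimal loading sketch, assuming the adapter is hosted under the repository id shown in the model tree and that access to the gated base model has been granted, might look like the following. The helper name load_adapter_model is hypothetical, and the default precision and device placement are left untouched.

```python
# Identifiers taken from this card; the helper itself is illustrative only.
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_openbookqa_456_1760637800"


def load_adapter_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the fine-tuned PEFT adapter to it."""
    # Imported lazily so that defining this helper does not require the libraries.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling load_adapter_model() downloads an 8B-parameter base model, so it needs substantial disk space and memory.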

Model tree for rbelanec/train_openbookqa_456_1760637800

Base model

meta-llama/Meta-Llama-3-8B-Instruct
