train_openbookqa_789_1760637912

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

Loss: 0.6930
Num Input Tokens Seen: 8499784
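
As a rough aid to reading the loss figure, it can be converted to a perplexity. This is a minimal sketch that assumes the reported loss is a mean per-token cross-entropy in nats; the card itself does not state how the loss is computed.

```python
import math

# Evaluation loss as reported above. Assumption: mean per-token
# cross-entropy measured in nats (not stated on the card).
eval_loss = 0.6930

# Perplexity is the exponential of the cross-entropy.
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.4f}")
```

Under that assumption, a lower loss corresponds to a lower perplexity, so the two numbers rank checkpoints identically.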

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.03
train_batch_size: 4
eval_batch_size: 4
seed: 789
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
num_epochs: 20
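
To illustrate how the cosine scheduler and warmup ratio listed above interact, the sketch below computes the learning rate at a few steps. It is a simplified stand-in for a linear-warmup, cosine-decay schedule, not the trainer's own code. The total of 22320 steps is read from the last row of the training results table; the function name and the rounding of the warmup step count are assumptions made for illustration.

```python
import math

def cosine_lr_with_warmup(step, peak_lr, total_steps, warmup_steps):
    """Linear warmup from 0 to peak_lr, then cosine decay from peak_lr to 0."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

PEAK_LR = 0.03        # learning_rate from the list above
TOTAL_STEPS = 22320   # final step in the training results table
WARMUP_RATIO = 0.1    # lr_scheduler_warmup_ratio from the list above
# Assumption: the warmup step count is the ratio times the total, rounded.
WARMUP_STEPS = round(TOTAL_STEPS * WARMUP_RATIO)

for step in (0, 1116, WARMUP_STEPS, 11160, TOTAL_STEPS):
    lr = cosine_lr_with_warmup(step, PEAK_LR, TOTAL_STEPS, WARMUP_STEPS)
    print(f"step {step:>5}: lr = {lr:.6f}")
```

With these values the warmup would span the first two epochs (2232 steps), after which the rate decays toward zero by the final step.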

Training results

Training Loss	Epoch	Step	Validation Loss	Input Tokens Seen
0.6957	1.0	1116	0.6953	425448
1.2373	2.0	2232	0.7495	849792
0.7262	3.0	3348	0.7077	1274096
0.6988	4.0	4464	0.6990	1699224
0.7491	5.0	5580	0.7081	2123824
0.6643	6.0	6696	0.7039	2548008
0.7368	7.0	7812	0.7055	2972312
0.7277	8.0	8928	0.6937	3396920
0.6918	9.0	10044	0.6966	3822320
0.7036	10.0	11160	0.7015	4247968
0.6841	11.0	12276	0.6979	4672584
0.6966	12.0	13392	0.7010	5097552
0.6707	13.0	14508	0.7004	5522440
0.6742	14.0	15624	0.6951	5947656
0.7082	15.0	16740	0.6936	6373520
0.7002	16.0	17856	0.6939	6798816
0.6733	17.0	18972	0.6939	7223992
0.6865	18.0	20088	0.6948	7649768
0.6795	19.0	21204	0.6949	8074496
0.6963	20.0	22320	0.6930	8499784
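
The table above can be scanned programmatically to find the epoch with the lowest validation loss. The sketch below copies the epoch and validation loss columns from the table; the variable names are illustrative only.

```python
# (epoch, validation_loss) pairs copied from the training results table.
validation_losses = [
    (1, 0.6953), (2, 0.7495), (3, 0.7077), (4, 0.6990), (5, 0.7081),
    (6, 0.7039), (7, 0.7055), (8, 0.6937), (9, 0.6966), (10, 0.7015),
    (11, 0.6979), (12, 0.7010), (13, 0.7004), (14, 0.6951), (15, 0.6936),
    (16, 0.6939), (17, 0.6939), (18, 0.6948), (19, 0.6949), (20, 0.6930),
]

# Pick the epochs with the smallest and largest validation loss.
best_epoch, best_loss = min(validation_losses, key=lambda row: row[1])
worst_epoch, worst_loss = max(validation_losses, key=lambda row: row[1])

print(f"best:  epoch {best_epoch}, validation loss {best_loss:.4f}")
print(f"worst: epoch {worst_epoch}, validation loss {worst_loss:.4f}")
```

The lowest value is the final-epoch loss, which matches the evaluation loss reported at the top of the card.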

Framework versions

PEFT 0.17.1
Transformers 4.51.3
Pytorch 2.9.0+cu128
Datasets 4.0.0
Tokenizers 0.21.4
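
With PEFT and Transformers installed, loading the adapter on top of the base model might look like the sketch below. The card does not document a loading procedure, so this is an assumption based on the standard PEFT workflow. The helper name `load_adapter_model` is hypothetical, the adapter repository id is the one shown on this page, and the base model is gated, so access must be granted before it can be downloaded.

```python
BASE_MODEL_ID = "meta-llama/Meta-Llama-3-8B-Instruct"
ADAPTER_ID = "rbelanec/train_openbookqa_789_1760637912"

def load_adapter_model(base_model_id=BASE_MODEL_ID, adapter_id=ADAPTER_ID):
    """Hypothetical helper: load the base model and attach the PEFT adapter."""
    # Imported lazily so that defining this helper does not require
    # the heavy dependencies to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_model_id)
    # Wrap the base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter_model()` downloads both the base weights and the adapter, so it needs network access and enough memory for an 8B-parameter model.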

Model tree for rbelanec/train_openbookqa_789_1760637912

Base model: meta-llama/Meta-Llama-3-8B-Instruct
Adapter: this model
