# results-qa
This model is a fine-tuned version of [FelixYaw/results](https://huggingface.co/FelixYaw/results) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4240
## Model description
More information needed
## Intended uses & limitations
More information needed
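Pending more details, a minimal usage sketch is given below. It assumes the checkpoint is published as `FelixYaw/results-qa` and is an extractive question-answering model; both the repository id and the task are inferred from the model name, not stated in this card.

```python
from transformers import pipeline

# Hypothetical loading example: the repo id and the "question-answering"
# task are assumptions based on the model name "results-qa".
qa = pipeline("question-answering", model="FelixYaw/results-qa")

answer = qa(
    question="What is being asked about?",
    context="Placeholder context passage for the sketch.",
)
print(answer)
```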
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (see the `TrainingArguments` sketch after the list):
- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: AdamW (`adamw_torch_fused`) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 3
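As a rough reproduction sketch (not the exact training script, which this card does not include), the hyperparameters above map onto a `transformers.TrainingArguments` configuration along these lines; the output directory is a placeholder, and model/data loading is omitted:

```python
from transformers import TrainingArguments

# Sketch of the configuration listed above; "results-qa" as output_dir
# is a placeholder, not necessarily the directory used for this run.
training_args = TrainingArguments(
    output_dir="results-qa",
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 4 * 2 = 8
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=3,
)
```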
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 0.8941 | 0.1528 | 200 | 0.9000 |
| 0.6193 | 0.3056 | 400 | 0.8039 |
| 0.5347 | 0.4584 | 600 | 0.7290 |
| 0.477 | 0.6112 | 800 | 0.6702 |
| 0.4504 | 0.7639 | 1000 | 0.6334 |
| 0.4339 | 0.9167 | 1200 | 0.5981 |
| 0.399 | 1.0695 | 1400 | 0.5559 |
| 0.3659 | 1.2223 | 1600 | 0.5267 |
| 0.3558 | 1.3751 | 1800 | 0.5054 |
| 0.3465 | 1.5279 | 2000 | 0.4930 |
| 0.3283 | 1.6807 | 2200 | 0.4821 |
| 0.324 | 1.8335 | 2400 | 0.4750 |
| 0.3169 | 1.9862 | 2600 | 0.4641 |
| 0.3063 | 2.1390 | 2800 | 0.4478 |
| 0.2927 | 2.2918 | 3000 | 0.4368 |
| 0.2893 | 2.4446 | 3200 | 0.4417 |
| 0.2875 | 2.5974 | 3400 | 0.4300 |
| 0.2862 | 2.7502 | 3600 | 0.4234 |
| 0.2776 | 2.9030 | 3800 | 0.4240 |
### Framework versions
- Transformers 4.56.1
- PyTorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.0