results

This model was trained from scratch on an unknown dataset. Its evaluation-set results (validation loss, accuracy, and F1) are listed per evaluation step in the table at the end of this card.

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 64
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 3
mixed_precision_training: Native AMP
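
Two of the values above follow from the others, and a short sketch may make the relationship clearer. The code below shows how total_train_batch_size results from train_batch_size and gradient_accumulation_steps, and how a linear schedule would decay the learning rate. It is an illustration only: the single-device setup, the zero warmup, and the total of 1173 optimizer steps (estimated from the Step and Epoch columns of the training log) are assumptions, and the helper names `effective_batch_size` and `linear_lr` are hypothetical.

```python
# Values copied from the hyperparameter list above.
learning_rate = 2e-05
train_batch_size = 16
gradient_accumulation_steps = 4

# Assumptions: neither value is stated in the card.
num_devices = 1
total_steps = 1173  # roughly 391 optimizer steps per epoch over 3 epochs


def effective_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Number of examples that contribute to each optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int, total: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup_steps)


# 16 * 4 * 1 = 64, matching total_train_batch_size
print(effective_batch_size(train_batch_size, gradient_accumulation_steps, num_devices))

# Learning rate at a few points of the (assumed) schedule
for step in (0, 400, 800, total_steps):
    print(step, linear_lr(step, total_steps, learning_rate))
```

With a different device count or a nonzero warmup, the numbers change accordingly; only the 16 × 4 = 64 relationship is confirmed by the list above.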

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
0.2569	0.1280	50	0.2415	0.9041	0.8996
0.2055	0.2559	100	0.2228	0.9157	0.9194
0.2424	0.3839	150	0.1831	0.9311	0.9310
0.2092	0.5118	200	0.1808	0.9313	0.9316
0.1878	0.6398	250	0.1993	0.9244	0.9273
0.2077	0.7678	300	0.1705	0.9357	0.9367
0.1937	0.8957	350	0.1847	0.9282	0.9310
0.1448	1.0230	400	0.1701	0.9366	0.9361
0.1034	1.1510	450	0.1763	0.9403	0.9403
0.1395	1.2790	500	0.1854	0.9396	0.9401
0.1141	1.4069	550	0.1774	0.9389	0.9381
0.1323	1.5349	600	0.1716	0.9389	0.9377
0.1824	1.6628	650	0.1866	0.9381	0.9398
0.1330	1.7908	700	0.1716	0.9415	0.9414
0.1054	1.9187	750	0.1651	0.9440	0.9442
0.0473	2.0461	800	0.1755	0.9440	0.9440
0.0458	2.1740	850	0.1917	0.9426	0.9427
0.1082	2.3020	900	0.2014	0.9418	0.9424
0.0621	2.4299	950	0.2019	0.9416	0.9418
0.0773	2.5579	1000	0.1988	0.9412	0.9411
0.1104	2.6859	1050	0.2031	0.9418	0.9413
0.0790	2.8138	1100	0.1962	0.9431	0.9432
0.0717	2.9418	1150	0.1958	0.9431	0.9434
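
The log above can also be inspected programmatically, for example to find the evaluation step with the lowest validation loss. The following is a minimal sketch: the rows are transcribed from the table, while the `EvalRow` type and the selection criteria are assumptions made for illustration, not part of the original training setup.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float
    f1: float


# Rows transcribed from the training results table.
rows = [
    EvalRow(0.2569, 0.1280, 50, 0.2415, 0.9041, 0.8996),
    EvalRow(0.2055, 0.2559, 100, 0.2228, 0.9157, 0.9194),
    EvalRow(0.2424, 0.3839, 150, 0.1831, 0.9311, 0.9310),
    EvalRow(0.2092, 0.5118, 200, 0.1808, 0.9313, 0.9316),
    EvalRow(0.1878, 0.6398, 250, 0.1993, 0.9244, 0.9273),
    EvalRow(0.2077, 0.7678, 300, 0.1705, 0.9357, 0.9367),
    EvalRow(0.1937, 0.8957, 350, 0.1847, 0.9282, 0.9310),
    EvalRow(0.1448, 1.0230, 400, 0.1701, 0.9366, 0.9361),
    EvalRow(0.1034, 1.1510, 450, 0.1763, 0.9403, 0.9403),
    EvalRow(0.1395, 1.2790, 500, 0.1854, 0.9396, 0.9401),
    EvalRow(0.1141, 1.4069, 550, 0.1774, 0.9389, 0.9381),
    EvalRow(0.1323, 1.5349, 600, 0.1716, 0.9389, 0.9377),
    EvalRow(0.1824, 1.6628, 650, 0.1866, 0.9381, 0.9398),
    EvalRow(0.1330, 1.7908, 700, 0.1716, 0.9415, 0.9414),
    EvalRow(0.1054, 1.9187, 750, 0.1651, 0.9440, 0.9442),
    EvalRow(0.0473, 2.0461, 800, 0.1755, 0.9440, 0.9440),
    EvalRow(0.0458, 2.1740, 850, 0.1917, 0.9426, 0.9427),
    EvalRow(0.1082, 2.3020, 900, 0.2014, 0.9418, 0.9424),
    EvalRow(0.0621, 2.4299, 950, 0.2019, 0.9416, 0.9418),
    EvalRow(0.0773, 2.5579, 1000, 0.1988, 0.9412, 0.9411),
    EvalRow(0.1104, 2.6859, 1050, 0.2031, 0.9418, 0.9413),
    EvalRow(0.0790, 2.8138, 1100, 0.1962, 0.9431, 0.9432),
    EvalRow(0.0717, 2.9418, 1150, 0.1958, 0.9431, 0.9434),
]

# Pick the evaluation steps with the lowest validation loss and the highest F1.
best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_f1 = max(rows, key=lambda r: r.f1)
final = rows[-1]

print("lowest validation loss:", best_by_loss.step, best_by_loss.val_loss)
print("highest F1:", best_by_f1.step, best_by_f1.f1)
print("final logged step:", final.step, final.val_loss, final.f1)
```

In this log, both criteria select step 750 (validation loss 0.1651, F1 0.9442). Validation loss rises afterwards while training loss keeps falling, which may indicate mild overfitting during the third epoch.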