results

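This model is a fine-tuned version of distilbert-base-cased on an unknown dataset. Per the training results below, at the end of training (step 3200, epoch 5.0) it reaches a validation loss of 1.8574 and an accuracy of 0.5794 on the evaluation set.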

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 10
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 100
num_epochs: 5
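
The hyperparameters above can be sketched as a plain Python dictionary, together with the learning-rate curve that a linear schedule with 100 warmup steps would produce. This is a minimal sketch, not the original training script: the key names are an assumed mapping onto `transformers.TrainingArguments` keyword arguments, the batch sizes assume a single device, and `TOTAL_STEPS` and `linear_lr` are hypothetical names introduced for illustration. The total of 3200 steps is taken from the last row of the training results.

```python
# Hyperparameters copied from the list above. The key names are an ASSUMED
# mapping onto transformers.TrainingArguments keyword arguments.
hparams = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 10,  # assumes a single device
    "per_device_eval_batch_size": 16,   # assumes a single device
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 100,
    "num_train_epochs": 5,
}

TOTAL_STEPS = 3200  # last step logged in the training results


def linear_lr(step, base_lr=hparams["learning_rate"],
              warmup=hparams["warmup_steps"], total=TOTAL_STEPS):
    """Learning rate after `step` optimizer steps.

    Linear warmup from 0 to base_lr over `warmup` steps,
    then linear decay from base_lr to 0 at `total` steps.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    for s in (0, 50, 100, 1650, 3200):
        print(s, linear_lr(s))
```

Under these assumptions the learning rate peaks at 5e-05 at step 100 and falls to half that value at step 1650, the midpoint of the decay phase.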

Training Loss	Epoch	Step	Validation Loss	Accuracy
2.0050	0.1562	100	1.7880	0.3731
1.5534	0.3125	200	1.4013	0.4919
1.3653	0.4688	300	1.3551	0.5238
1.3075	0.625	400	1.3244	0.54
1.3349	0.7812	500	1.2947	0.5469
1.2770	0.9375	600	1.2404	0.5456
1.1652	1.0938	700	1.3225	0.5425
0.9598	1.25	800	1.2530	0.5637
0.9209	1.4062	900	1.2938	0.5687
0.9700	1.5625	1000	1.2150	0.57
0.9724	1.7188	1100	1.2234	0.5675
0.9007	1.875	1200	1.2194	0.5713
0.8538	2.0312	1300	1.2109	0.595
0.5652	2.1875	1400	1.3418	0.5787
0.5980	2.3438	1500	1.3551	0.5794
0.5600	2.5	1600	1.4324	0.5863
0.5926	2.6562	1700	1.3879	0.5794
0.5105	2.8125	1800	1.4221	0.5731
0.6133	2.9688	1900	1.4057	0.58
0.3847	3.125	2000	1.4601	0.5737
0.3171	3.2812	2100	1.5700	0.5744
0.3563	3.4375	2200	1.5884	0.5844
0.2698	3.5938	2300	1.6227	0.5825
0.2940	3.75	2400	1.6997	0.5775
0.3423	3.9062	2500	1.6714	0.5887
0.2651	4.0625	2600	1.6928	0.585
0.1350	4.2188	2700	1.7678	0.5781
0.1871	4.375	2800	1.8173	0.5763
0.1858	4.5312	2900	1.8271	0.5813
0.1923	4.6875	3000	1.8298	0.5875
0.1377	4.8438	3100	1.8490	0.5819
0.1464	5.0	3200	1.8574	0.5794
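
The table above can be scanned programmatically to find the best checkpoint. The following is a minimal sketch that assumes nothing beyond the logged values; `LOG`, `rows`, and the other variable names are hypothetical. It parses the rows, then reports the step with the lowest validation loss, the step with the highest accuracy, and the number of steps per epoch.

```python
# Rows copied from the training results table:
# training loss, epoch, step, validation loss, accuracy
LOG = """\
2.0050 0.1562 100 1.7880 0.3731
1.5534 0.3125 200 1.4013 0.4919
1.3653 0.4688 300 1.3551 0.5238
1.3075 0.625 400 1.3244 0.54
1.3349 0.7812 500 1.2947 0.5469
1.2770 0.9375 600 1.2404 0.5456
1.1652 1.0938 700 1.3225 0.5425
0.9598 1.25 800 1.2530 0.5637
0.9209 1.4062 900 1.2938 0.5687
0.9700 1.5625 1000 1.2150 0.57
0.9724 1.7188 1100 1.2234 0.5675
0.9007 1.875 1200 1.2194 0.5713
0.8538 2.0312 1300 1.2109 0.595
0.5652 2.1875 1400 1.3418 0.5787
0.5980 2.3438 1500 1.3551 0.5794
0.5600 2.5 1600 1.4324 0.5863
0.5926 2.6562 1700 1.3879 0.5794
0.5105 2.8125 1800 1.4221 0.5731
0.6133 2.9688 1900 1.4057 0.58
0.3847 3.125 2000 1.4601 0.5737
0.3171 3.2812 2100 1.5700 0.5744
0.3563 3.4375 2200 1.5884 0.5844
0.2698 3.5938 2300 1.6227 0.5825
0.2940 3.75 2400 1.6997 0.5775
0.3423 3.9062 2500 1.6714 0.5887
0.2651 4.0625 2600 1.6928 0.585
0.1350 4.2188 2700 1.7678 0.5781
0.1871 4.375 2800 1.8173 0.5763
0.1858 4.5312 2900 1.8271 0.5813
0.1923 4.6875 3000 1.8298 0.5875
0.1377 4.8438 3100 1.8490 0.5819
0.1464 5.0 3200 1.8574 0.5794
"""

# Each row becomes (train_loss, epoch, step, val_loss, accuracy).
rows = [tuple(float(x) for x in line.split())
        for line in LOG.strip().splitlines()]

best_loss = min(rows, key=lambda r: r[3])  # lowest validation loss
best_acc = max(rows, key=lambda r: r[4])   # highest accuracy
final = rows[-1]                           # last logged evaluation
steps_per_epoch = final[2] / final[1]      # 3200 steps over 5.0 epochs

if __name__ == "__main__":
    print(f"lowest validation loss {best_loss[3]:.4f} at step {int(best_loss[2])}")
    print(f"highest accuracy {best_acc[4]:.4f} at step {int(best_acc[2])}")
    print(f"final validation loss {final[3]:.4f}, accuracy {final[4]:.4f}")
    print(f"steps per epoch: {steps_per_epoch:.0f}")
```

Both the lowest validation loss (1.2109) and the highest accuracy (0.595) occur at step 1300, just after the second epoch. From there the validation loss climbs to 1.8574 while the training loss keeps falling, a pattern consistent with overfitting, so an earlier checkpoint may generalize better than the final one.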

Safetensors

Model size: 65.8M params
Tensor type: F32
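
As a rough check, the parameter count and tensor type listed above imply the approximate size of the weights. This sketch assumes every parameter is stored as a 4-byte F32 value and ignores file metadata; the variable names are hypothetical.

```python
PARAMS = 65.8e6       # parameter count reported above
BYTES_PER_PARAM = 4   # F32 uses 4 bytes per value

size_bytes = PARAMS * BYTES_PER_PARAM
size_mb = size_bytes / 1e6       # decimal megabytes
size_mib = size_bytes / 2**20    # binary mebibytes

if __name__ == "__main__":
    print(f"approx. weights size: {size_mb:.1f} MB ({size_mib:.1f} MiB)")
```

This works out to roughly 263 MB of weights, before any overhead from the file format.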
