Assignment4-modernbert-optuna-tuned

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unknown dataset. On the evaluation set, it reaches a validation loss of 0.4052 and an accuracy of 0.9739 at the final logged step (step 3800, epoch 7.9665).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 6e-05
train_batch_size: 32
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
num_epochs: 8
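
The hyperparameters above could be passed to the Hugging Face Trainer roughly as follows. This is a minimal sketch, not the original training script: it assumes a single device (so train_batch_size maps to per_device_train_batch_size) and no warmup or gradient accumulation, none of which the card states. The helper name build_training_arguments and its output_dir argument are hypothetical.

```python
# Sketch only: maps the hyperparameters listed in this card onto keyword
# arguments accepted by transformers.TrainingArguments.
training_kwargs = {
    "learning_rate": 6e-5,
    "per_device_train_batch_size": 32,  # assumption: single device
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch_fused",       # OptimizerNames.ADAMW_TORCH_FUSED
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 8,
}


def build_training_arguments(output_dir):
    """Hypothetical helper: build TrainingArguments from the values above."""
    # Imported lazily so the dictionary can be inspected without transformers.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **training_kwargs)
```

Keeping the values in a plain dictionary makes it easy to compare them with the list above before a Trainer is constructed.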

Training Loss	Epoch	Step	Validation Loss	Accuracy
12.0069	0.2096	100	5.0978	0.7661
3.4684	0.4193	200	2.0509	0.9206
1.8229	0.6289	300	1.5597	0.9452
1.4306	0.8386	400	1.2236	0.9535
1.1	1.0482	500	1.0535	0.9613
0.8086	1.2579	600	0.8738	0.9665
0.7382	1.4675	700	0.8479	0.9597
0.6412	1.6771	800	0.7502	0.9677
0.6152	1.8868	900	0.7203	0.9687
0.5391	2.0964	1000	0.6897	0.9652
0.4475	2.3061	1100	0.6466	0.9713
0.4256	2.5157	1200	0.6227	0.9674
0.3984	2.7254	1300	0.5820	0.9719
0.3809	2.9350	1400	0.5644	0.9710
0.3424	3.1447	1500	0.5406	0.9710
0.3176	3.3543	1600	0.5278	0.9723
0.2981	3.5639	1700	0.5222	0.9719
0.2939	3.7736	1800	0.4962	0.9729
0.2858	3.9832	1900	0.4803	0.9739
0.2494	4.1929	2000	0.4714	0.9742
0.235	4.4025	2100	0.4656	0.9729
0.2318	4.6122	2200	0.4559	0.9735
0.2298	4.8218	2300	0.4485	0.9723
0.2276	5.0314	2400	0.4356	0.9752
0.1969	5.2411	2500	0.4308	0.9729
0.1937	5.4507	2600	0.4225	0.9739
0.1883	5.6604	2700	0.4213	0.9739
0.1873	5.8700	2800	0.4149	0.9748
0.181	6.0797	2900	0.4126	0.9758
0.1725	6.2893	3000	0.4111	0.9739
0.1649	6.4990	3100	0.4092	0.9752
0.1638	6.7086	3200	0.4071	0.9742
0.1614	6.9182	3300	0.4064	0.9742
0.1591	7.1279	3400	0.4058	0.9735
0.1541	7.3375	3500	0.4061	0.9735
0.157	7.5472	3600	0.4053	0.9739
0.1564	7.7568	3700	0.4052	0.9739
0.1541	7.9665	3800	0.4052	0.9739
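
The Epoch and Step columns make it possible to estimate the length of an epoch and, from that, the approximate size of the training set. The sketch below does that arithmetic. The size estimate assumes a single device, no gradient accumulation, and a final partial batch that is kept; the card states none of these, so treat the result as a rough bound rather than a reported figure.

```python
# (step, epoch) pairs copied from the first and last rows of the table above.
first_step, first_epoch = 100, 0.2096
last_step, last_epoch = 3800, 7.9665

batch_size = 32  # train_batch_size from the hyperparameter list
num_epochs = 8   # num_epochs from the hyperparameter list

# Optimizer steps per epoch, estimated from both ends of the log.
steps_per_epoch = round(last_step / last_epoch)
assert steps_per_epoch == round(first_step / first_epoch)

# Total optimizer steps if all epochs run to completion.
total_steps = steps_per_epoch * num_epochs

# Assumption: single device, no gradient accumulation, last partial batch kept.
# The final batch of an epoch then holds between 1 and batch_size examples.
min_examples = (steps_per_epoch - 1) * batch_size + 1
max_examples = steps_per_epoch * batch_size

print(f"steps per epoch: {steps_per_epoch}")            # 477
print(f"total steps: {total_steps}")                    # 3816
print(f"training examples: {min_examples} to {max_examples}")  # 15233 to 15264
```

Under these assumptions the last logged step (3800) falls 16 steps short of the end of training, which is consistent with evaluation being logged every 100 steps.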

Safetensors

Model size: 0.1B params
Tensor type: F32
