DisambertSingleSense-base

This model is a fine-tuned version of answerdotai/ModernBERT-base on the semcor dataset. After the final training epoch (epoch 30), it achieves the following results on the evaluation set:

Loss: 10.3845
Precision: 0.9250
Recall: 0.5786
F1: 0.7119
Accuracy: 0.6008
Matthews: 0.6006
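
As a consistency check, the reported F1 matches the harmonic mean of the reported precision and recall. The sketch below is illustrative only: the helper name `f1_from` is hypothetical and not part of the training code, and the card does not state how precision and recall were averaged across classes.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported above for the evaluation set.
f1 = f1_from(0.9250, 0.5786)
print(round(f1, 4))
```

Because the harmonic mean is dominated by the smaller term, the comparatively low recall keeps F1 well below the precision figure.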

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: inverse_sqrt
lr_scheduler_warmup_steps: 1000
num_epochs: 30
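
To make the learning-rate settings above concrete, here is a minimal sketch of an inverse square-root schedule with linear warmup. It assumes the decay timescale equals the warmup length (1000 steps), which is my understanding of the default behaviour of the `inverse_sqrt` scheduler in Transformers; the training script itself is not shown in this card, so treat the shape as an assumption. The function name `lr_at` is hypothetical.

```python
import math

BASE_LR = 1e-4        # learning_rate from the list above
WARMUP_STEPS = 1000   # lr_scheduler_warmup_steps from the list above


def lr_at(step: int, base_lr: float = BASE_LR, warmup: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    if warmup <= 0:
        raise ValueError("this sketch only covers a positive warmup length")
    if step < warmup:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / warmup
    # Inverse square-root decay, equal to base_lr at the end of warmup.
    return base_lr * math.sqrt(warmup / step)


# 420420 is the final step in the training results table.
for step in (500, 1000, 4000, 420420):
    print(step, lr_at(step))
```

Under this assumption the rate peaks at 1e-4 at step 1000, halves by step 4000, and has decayed to roughly a twentieth of its peak by the last step.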

Training results

Training Loss	Epoch	Step	Validation Loss	Precision	Recall	F1	Accuracy	Matthews
No log	0	0	207.0982	0.0	0.0	0.0	0.0	-0.0000
11.1217	1.0	14014	15.0481	0.9215	0.5209	0.6656	0.4562	0.4558
5.9994	2.0	28028	10.3853	0.7928	0.3539	0.4894	0.4978	0.4979
3.7236	3.0	42042	8.8450	0.9086	0.5679	0.6989	0.5570	0.5566
2.5493	4.0	56056	8.7346	0.9313	0.5675	0.7053	0.5793	0.5790
1.9121	5.0	70070	8.9990	0.9163	0.5669	0.7004	0.5701	0.5698
0.9166	6.0	84084	9.2895	0.9287	0.5799	0.7139	0.5815	0.5812
0.8231	7.0	98098	9.3043	0.9185	0.5844	0.7143	0.5907	0.5904
0.4919	8.0	112112	9.7527	0.9216	0.5668	0.7019	0.5802	0.5799
0.5579	9.0	126126	9.9372	0.9265	0.5745	0.7092	0.5929	0.5926
0.3221	10.0	140140	10.1643	0.9254	0.5726	0.7074	0.5868	0.5865
0.4007	11.0	154154	10.1666	0.9077	0.5722	0.7019	0.5885	0.5882
0.1726	12.0	168168	10.3202	0.9179	0.5691	0.7026	0.5894	0.5891
0.2729	13.0	182182	10.4281	0.9127	0.5648	0.6978	0.5916	0.5913
0.1867	14.0	196196	10.3487	0.9042	0.5731	0.7016	0.5951	0.5948
0.1512	15.0	210210	10.2347	0.9262	0.5742	0.7089	0.5968	0.5966
0.1377	16.0	224224	10.3734	0.9211	0.5772	0.7097	0.6017	0.6014
0.2627	17.0	238238	10.5554	0.9212	0.5767	0.7093	0.5990	0.5988
0.1610	18.0	252252	10.4423	0.9273	0.5748	0.7097	0.6008	0.6006
0.1973	19.0	266266	10.6396	0.9289	0.5729	0.7087	0.5947	0.5945
0.1504	20.0	280280	10.5432	0.9132	0.5740	0.7049	0.5995	0.5992
0.0363	21.0	294294	10.6388	0.9291	0.5744	0.7099	0.5986	0.5984
0.0384	22.0	308308	10.5433	0.9314	0.5750	0.7111	0.5977	0.5975
0.0792	23.0	322322	10.7152	0.9308	0.5752	0.7110	0.5995	0.5994
0.0165	24.0	336336	10.6516	0.9301	0.5690	0.7061	0.5964	0.5962
0.0644	25.0	350350	10.3666	0.9297	0.5788	0.7134	0.6012	0.6010
0.0246	26.0	364364	10.3480	0.9285	0.5700	0.7064	0.5947	0.5945
0.0518	27.0	378378	10.6784	0.9300	0.5783	0.7131	0.5977	0.5975
0.0267	28.0	392392	10.7434	0.9306	0.5742	0.7102	0.5999	0.5998
0.0847	29.0	406406	10.4787	0.9289	0.5787	0.7131	0.6017	0.6014
0.0923	30.0	420420	10.3845	0.9250	0.5786	0.7119	0.6008	0.6006
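
The table can be scanned programmatically to see which epoch looks best under different criteria. The sketch below copies the epoch, validation loss, F1, and accuracy columns from the table above; the variable and function names are illustrative, and nothing here reflects how the published checkpoint was actually chosen.

```python
# (epoch, validation_loss, f1, accuracy), copied from the table above.
ROWS = [
    (1, 15.0481, 0.6656, 0.4562), (2, 10.3853, 0.4894, 0.4978),
    (3, 8.8450, 0.6989, 0.5570), (4, 8.7346, 0.7053, 0.5793),
    (5, 8.9990, 0.7004, 0.5701), (6, 9.2895, 0.7139, 0.5815),
    (7, 9.3043, 0.7143, 0.5907), (8, 9.7527, 0.7019, 0.5802),
    (9, 9.9372, 0.7092, 0.5929), (10, 10.1643, 0.7074, 0.5868),
    (11, 10.1666, 0.7019, 0.5885), (12, 10.3202, 0.7026, 0.5894),
    (13, 10.4281, 0.6978, 0.5916), (14, 10.3487, 0.7016, 0.5951),
    (15, 10.2347, 0.7089, 0.5968), (16, 10.3734, 0.7097, 0.6017),
    (17, 10.5554, 0.7093, 0.5990), (18, 10.4423, 0.7097, 0.6008),
    (19, 10.6396, 0.7087, 0.5947), (20, 10.5432, 0.7049, 0.5995),
    (21, 10.6388, 0.7099, 0.5986), (22, 10.5433, 0.7111, 0.5977),
    (23, 10.7152, 0.7110, 0.5995), (24, 10.6516, 0.7061, 0.5964),
    (25, 10.3666, 0.7134, 0.6012), (26, 10.3480, 0.7064, 0.5947),
    (27, 10.6784, 0.7131, 0.5977), (28, 10.7434, 0.7102, 0.5999),
    (29, 10.4787, 0.7131, 0.6017), (30, 10.3845, 0.7119, 0.6008),
]


def best_epoch(rows, column, higher_is_better):
    """Return the epoch whose value in `column` is best (first one on ties)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


lowest_loss_epoch = best_epoch(ROWS, column=1, higher_is_better=False)
highest_f1_epoch = best_epoch(ROWS, column=2, higher_is_better=True)
print("lowest validation loss at epoch", lowest_loss_epoch)
print("highest F1 at epoch", highest_f1_epoch)
```

On these numbers the validation loss is lowest at epoch 4 and F1 is highest at epoch 7, while accuracy keeps edging up into later epochs. Validation loss rising as training loss approaches zero is consistent with overfitting, so the choice of checkpoint depends on which metric matters for the intended use.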

Framework versions

Transformers 5.1.0
PyTorch 2.6.0+cu124
Datasets 4.5.0
Tokenizers 0.22.2

Model size: 0.2B params (Safetensors)
Tensor type: F32
