DisambertSingleSense-base

This model is a fine-tuned version of answerdotai/ModernBERT-base on the semcor dataset. It achieves the following results on the evaluation set (these are the values from the final epoch, step 420420, in the training results):

  • Loss: 10.3845
  • Precision: 0.9250
  • Recall: 0.5786
  • F1: 0.7119
  • Accuracy: 0.6008
  • Matthews: 0.6006
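
As a quick consistency check on the figures above, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall; the helper name `f1_score` is illustrative and is not taken from the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported for the evaluation set.
final_f1 = f1_score(0.9250, 0.5786)
print(round(final_f1, 4))  # 0.7119, matching the reported F1
```

The gap between precision (0.9250) and recall (0.5786) is what pulls F1 well below the precision figure.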

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: inverse_sqrt
  • lr_scheduler_warmup_steps: 1000
  • num_epochs: 30
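
To make the learning-rate settings above concrete, here is a minimal sketch of a linear-warmup, inverse-square-root schedule. It assumes the decay timescale equals the warmup length (so the multiplier is 1.0 exactly when warmup ends); the function name `lr_at` is hypothetical, and the steps-per-epoch value is read from the training results rather than from the training code.

```python
import math

BASE_LR = 1e-4           # learning_rate
WARMUP_STEPS = 1000      # lr_scheduler_warmup_steps
STEPS_PER_EPOCH = 14014  # step count at epoch 1.0 in the training results
NUM_EPOCHS = 30          # num_epochs


def lr_at(step: int, base_lr: float = BASE_LR, warmup: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step.

    Assumption: linear warmup to base_lr, then decay proportional to
    1/sqrt(step), with the timescale set equal to the warmup length.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr / math.sqrt(step / warmup)


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS  # 420420, the final step in the results

for step in (500, 1000, 4000, total_steps):
    print(step, lr_at(step))
```

Under these assumptions the rate peaks at 1e-4 at step 1000, halves by step 4000, and ends at roughly 4.9e-6 by the final step.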

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 | Accuracy | Matthews |
|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 207.0982 | 0.0 | 0.0 | 0.0 | 0.0 | -0.0000 |
| 11.1217 | 1.0 | 14014 | 15.0481 | 0.9215 | 0.5209 | 0.6656 | 0.4562 | 0.4558 |
| 5.9994 | 2.0 | 28028 | 10.3853 | 0.7928 | 0.3539 | 0.4894 | 0.4978 | 0.4979 |
| 3.7236 | 3.0 | 42042 | 8.8450 | 0.9086 | 0.5679 | 0.6989 | 0.5570 | 0.5566 |
| 2.5493 | 4.0 | 56056 | 8.7346 | 0.9313 | 0.5675 | 0.7053 | 0.5793 | 0.5790 |
| 1.9121 | 5.0 | 70070 | 8.9990 | 0.9163 | 0.5669 | 0.7004 | 0.5701 | 0.5698 |
| 0.9166 | 6.0 | 84084 | 9.2895 | 0.9287 | 0.5799 | 0.7139 | 0.5815 | 0.5812 |
| 0.8231 | 7.0 | 98098 | 9.3043 | 0.9185 | 0.5844 | 0.7143 | 0.5907 | 0.5904 |
| 0.4919 | 8.0 | 112112 | 9.7527 | 0.9216 | 0.5668 | 0.7019 | 0.5802 | 0.5799 |
| 0.5579 | 9.0 | 126126 | 9.9372 | 0.9265 | 0.5745 | 0.7092 | 0.5929 | 0.5926 |
| 0.3221 | 10.0 | 140140 | 10.1643 | 0.9254 | 0.5726 | 0.7074 | 0.5868 | 0.5865 |
| 0.4007 | 11.0 | 154154 | 10.1666 | 0.9077 | 0.5722 | 0.7019 | 0.5885 | 0.5882 |
| 0.1726 | 12.0 | 168168 | 10.3202 | 0.9179 | 0.5691 | 0.7026 | 0.5894 | 0.5891 |
| 0.2729 | 13.0 | 182182 | 10.4281 | 0.9127 | 0.5648 | 0.6978 | 0.5916 | 0.5913 |
| 0.1867 | 14.0 | 196196 | 10.3487 | 0.9042 | 0.5731 | 0.7016 | 0.5951 | 0.5948 |
| 0.1512 | 15.0 | 210210 | 10.2347 | 0.9262 | 0.5742 | 0.7089 | 0.5968 | 0.5966 |
| 0.1377 | 16.0 | 224224 | 10.3734 | 0.9211 | 0.5772 | 0.7097 | 0.6017 | 0.6014 |
| 0.2627 | 17.0 | 238238 | 10.5554 | 0.9212 | 0.5767 | 0.7093 | 0.5990 | 0.5988 |
| 0.1610 | 18.0 | 252252 | 10.4423 | 0.9273 | 0.5748 | 0.7097 | 0.6008 | 0.6006 |
| 0.1973 | 19.0 | 266266 | 10.6396 | 0.9289 | 0.5729 | 0.7087 | 0.5947 | 0.5945 |
| 0.1504 | 20.0 | 280280 | 10.5432 | 0.9132 | 0.5740 | 0.7049 | 0.5995 | 0.5992 |
| 0.0363 | 21.0 | 294294 | 10.6388 | 0.9291 | 0.5744 | 0.7099 | 0.5986 | 0.5984 |
| 0.0384 | 22.0 | 308308 | 10.5433 | 0.9314 | 0.5750 | 0.7111 | 0.5977 | 0.5975 |
| 0.0792 | 23.0 | 322322 | 10.7152 | 0.9308 | 0.5752 | 0.7110 | 0.5995 | 0.5994 |
| 0.0165 | 24.0 | 336336 | 10.6516 | 0.9301 | 0.5690 | 0.7061 | 0.5964 | 0.5962 |
| 0.0644 | 25.0 | 350350 | 10.3666 | 0.9297 | 0.5788 | 0.7134 | 0.6012 | 0.6010 |
| 0.0246 | 26.0 | 364364 | 10.3480 | 0.9285 | 0.5700 | 0.7064 | 0.5947 | 0.5945 |
| 0.0518 | 27.0 | 378378 | 10.6784 | 0.9300 | 0.5783 | 0.7131 | 0.5977 | 0.5975 |
| 0.0267 | 28.0 | 392392 | 10.7434 | 0.9306 | 0.5742 | 0.7102 | 0.5999 | 0.5998 |
| 0.0847 | 29.0 | 406406 | 10.4787 | 0.9289 | 0.5787 | 0.7131 | 0.6017 | 0.6014 |
| 0.0923 | 30.0 | 420420 | 10.3845 | 0.9250 | 0.5786 | 0.7119 | 0.6008 | 0.6006 |
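
The results above show validation loss bottoming out early while F1 and accuracy peak at different epochs, so the "best" checkpoint depends on the selection criterion. The sketch below illustrates this on a hand-picked subset of rows copied from the results; the card does not say which criterion, if any, was used, and the `EvalRow` type is purely illustrative.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: int
    val_loss: float
    f1: float
    accuracy: float


# Subset of rows copied from the training results (not the full table).
rows = [
    EvalRow(1, 15.0481, 0.6656, 0.4562),
    EvalRow(4, 8.7346, 0.7053, 0.5793),
    EvalRow(7, 9.3043, 0.7143, 0.5907),
    EvalRow(16, 10.3734, 0.7097, 0.6017),
    EvalRow(30, 10.3845, 0.7119, 0.6008),
]

best_by_loss = min(rows, key=lambda r: r.val_loss)
best_by_f1 = max(rows, key=lambda r: r.f1)
best_by_accuracy = max(rows, key=lambda r: r.accuracy)

print("lowest validation loss: epoch", best_by_loss.epoch)
print("highest F1: epoch", best_by_f1.epoch)
print("highest accuracy: epoch", best_by_accuracy.epoch)
```

On these rows the three criteria pick epochs 4, 7, and 16 respectively, while the headline results correspond to epoch 30.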

Framework versions

  • Transformers 5.1.0
  • Pytorch 2.6.0+cu124
  • Datasets 4.5.0
  • Tokenizers 0.22.2

Model size: 0.2B params (Safetensors, tensor type F32)