Commit debd75d (verified) by simon-mellergaard · parent: c8aff6b

End of training

Files changed (1): README.md

README.md CHANGED
@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9118
-- Accuracy: 0.7971
-- F1: 0.7939
+- Loss: 0.1681
+- Accuracy: 0.9690
+- F1: 0.9687
 
 ## Model description
 
@@ -40,42 +40,32 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 32
-- eval_batch_size: 8
+- learning_rate: 7e-05
+- train_batch_size: 64
+- eval_batch_size: 16
 - seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
-- num_epochs: 4
+- num_epochs: 6
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:------:|
-| 2.712 | 0.2096 | 100 | 2.6599 | 0.3745 | 0.3284 |
-| 2.0817 | 0.4193 | 200 | 2.2602 | 0.4513 | 0.4260 |
-| 1.6662 | 0.6289 | 300 | 1.9354 | 0.5181 | 0.4953 |
-| 1.4309 | 0.8386 | 400 | 1.6449 | 0.6019 | 0.5874 |
-| 1.1682 | 1.0482 | 500 | 1.4536 | 0.6487 | 0.6370 |
-| 0.8532 | 1.2579 | 600 | 1.3092 | 0.6845 | 0.6786 |
-| 0.7879 | 1.4675 | 700 | 1.2658 | 0.6961 | 0.6932 |
-| 0.6966 | 1.6771 | 800 | 1.1445 | 0.7339 | 0.7280 |
-| 0.6659 | 1.8868 | 900 | 1.1185 | 0.7365 | 0.7324 |
-| 0.498 | 2.0964 | 1000 | 1.0528 | 0.7487 | 0.7487 |
-| 0.4019 | 2.3061 | 1100 | 0.9889 | 0.7639 | 0.7612 |
-| 0.3754 | 2.5157 | 1200 | 0.9937 | 0.7755 | 0.7736 |
-| 0.3393 | 2.7254 | 1300 | 0.9694 | 0.7832 | 0.7799 |
-| 0.3505 | 2.9350 | 1400 | 0.9332 | 0.7881 | 0.7863 |
-| 0.2359 | 3.1447 | 1500 | 0.9247 | 0.7919 | 0.7896 |
-| 0.2304 | 3.3543 | 1600 | 0.9270 | 0.79 | 0.7861 |
-| 0.2077 | 3.5639 | 1700 | 0.9194 | 0.7932 | 0.7891 |
-| 0.2299 | 3.7736 | 1800 | 0.9127 | 0.7961 | 0.7930 |
-| 0.2427 | 3.9832 | 1900 | 0.9118 | 0.7971 | 0.7939 |
+| 2.3344 | 0.6276 | 150 | 0.5836 | 0.8506 | 0.8448 |
+| 0.3067 | 1.2552 | 300 | 0.3733 | 0.9139 | 0.9111 |
+| 0.2089 | 1.8828 | 450 | 0.2463 | 0.9474 | 0.9470 |
+| 0.1132 | 2.5105 | 600 | 0.2390 | 0.9487 | 0.9486 |
+| 0.0618 | 3.1381 | 750 | 0.2183 | 0.9587 | 0.9582 |
+| 0.0456 | 3.7657 | 900 | 0.1987 | 0.9616 | 0.9611 |
+| 0.0377 | 4.3933 | 1050 | 0.1871 | 0.9655 | 0.9650 |
+| 0.0204 | 5.0209 | 1200 | 0.1688 | 0.9684 | 0.9681 |
+| 0.0092 | 5.6485 | 1350 | 0.1681 | 0.9690 | 0.9687 |
 
 
 ### Framework versions
 
-- Transformers 4.56.1
-- Pytorch 2.8.0+cu126
-- Datasets 4.0.0
-- Tokenizers 0.22.0
+- Transformers 4.52.4
+- Pytorch 2.6.0+cu124
+- Datasets 3.6.0
+- Tokenizers 0.21.2
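For readers wanting to reproduce the updated run, the new hyperparameter list maps onto `transformers.TrainingArguments` keyword arguments roughly as below. This is a sketch, not the author's script: the kwarg names come from the Transformers Trainer API, the card does not state the device count (so treating `train_batch_size` as a per-device value is an assumption), and `output_dir` is not given on the card.

```python
# Keyword arguments one would pass to transformers.TrainingArguments
# to match the card's hyperparameter list. Anything not on the card
# (e.g. output_dir, device count) is left out or assumed.
training_kwargs = {
    "learning_rate": 7e-05,
    "per_device_train_batch_size": 64,   # card: train_batch_size (per-device is an assumption)
    "per_device_eval_batch_size": 16,    # card: eval_batch_size
    "seed": 42,
    "optim": "adamw_torch",              # card: OptimizerNames.ADAMW_TORCH
    "adam_beta1": 0.9,                   # card: betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 6,
}
```

These would be used as `TrainingArguments(output_dir="...", **training_kwargs)` together with a `Trainer` fine-tuning `answerdotai/ModernBERT-large`.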
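The card reports Accuracy and F1 on the evaluation set but does not say how the F1 score is averaged over classes. A minimal pure-Python sketch of accuracy plus macro-averaged F1 (the macro averaging is an assumption; the run may have used weighted averaging instead) looks like this:

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Accuracy and macro-averaged F1 for integer class labels.

    Macro averaging (unweighted mean of per-class F1) is an assumption;
    the model card does not state the averaging scheme.
    """
    assert len(y_true) == len(y_pred) and y_true
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    f1_per_class = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1_per_class.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return acc, sum(f1_per_class) / len(f1_per_class)
```

In a Trainer setup this logic would live inside a `compute_metrics` callback that first takes the argmax over the model's logits.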