Commit debd75d (verified) by simon-mellergaard · parent: c8aff6b

End of training

Files changed (1): README.md

README.md CHANGED
@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9118
-- Accuracy: 0.7971
-- F1: 0.7939
+- Loss: 0.1681
+- Accuracy: 0.9690
+- F1: 0.9687
 
 ## Model description
 
@@ -40,42 +40,32 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 32
-- eval_batch_size: 8
+- learning_rate: 7e-05
+- train_batch_size: 64
+- eval_batch_size: 16
 - seed: 42
-- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
-- num_epochs: 4
+- num_epochs: 6
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
 |:-------------:|:------:|:----:|:---------------:|:--------:|:------:|
-| 2.712 | 0.2096 | 100 | 2.6599 | 0.3745 | 0.3284 |
-| 2.0817 | 0.4193 | 200 | 2.2602 | 0.4513 | 0.4260 |
-| 1.6662 | 0.6289 | 300 | 1.9354 | 0.5181 | 0.4953 |
-| 1.4309 | 0.8386 | 400 | 1.6449 | 0.6019 | 0.5874 |
-| 1.1682 | 1.0482 | 500 | 1.4536 | 0.6487 | 0.6370 |
-| 0.8532 | 1.2579 | 600 | 1.3092 | 0.6845 | 0.6786 |
-| 0.7879 | 1.4675 | 700 | 1.2658 | 0.6961 | 0.6932 |
-| 0.6966 | 1.6771 | 800 | 1.1445 | 0.7339 | 0.7280 |
-| 0.6659 | 1.8868 | 900 | 1.1185 | 0.7365 | 0.7324 |
-| 0.498 | 2.0964 | 1000 | 1.0528 | 0.7487 | 0.7487 |
-| 0.4019 | 2.3061 | 1100 | 0.9889 | 0.7639 | 0.7612 |
-| 0.3754 | 2.5157 | 1200 | 0.9937 | 0.7755 | 0.7736 |
-| 0.3393 | 2.7254 | 1300 | 0.9694 | 0.7832 | 0.7799 |
-| 0.3505 | 2.9350 | 1400 | 0.9332 | 0.7881 | 0.7863 |
-| 0.2359 | 3.1447 | 1500 | 0.9247 | 0.7919 | 0.7896 |
-| 0.2304 | 3.3543 | 1600 | 0.9270 | 0.79 | 0.7861 |
-| 0.2077 | 3.5639 | 1700 | 0.9194 | 0.7932 | 0.7891 |
-| 0.2299 | 3.7736 | 1800 | 0.9127 | 0.7961 | 0.7930 |
-| 0.2427 | 3.9832 | 1900 | 0.9118 | 0.7971 | 0.7939 |
+| 2.3344 | 0.6276 | 150 | 0.5836 | 0.8506 | 0.8448 |
+| 0.3067 | 1.2552 | 300 | 0.3733 | 0.9139 | 0.9111 |
+| 0.2089 | 1.8828 | 450 | 0.2463 | 0.9474 | 0.9470 |
+| 0.1132 | 2.5105 | 600 | 0.2390 | 0.9487 | 0.9486 |
+| 0.0618 | 3.1381 | 750 | 0.2183 | 0.9587 | 0.9582 |
+| 0.0456 | 3.7657 | 900 | 0.1987 | 0.9616 | 0.9611 |
+| 0.0377 | 4.3933 | 1050 | 0.1871 | 0.9655 | 0.9650 |
+| 0.0204 | 5.0209 | 1200 | 0.1688 | 0.9684 | 0.9681 |
+| 0.0092 | 5.6485 | 1350 | 0.1681 | 0.9690 | 0.9687 |
 
 
 ### Framework versions
 
-- Transformers 4.56.1
-- Pytorch 2.8.0+cu126
-- Datasets 4.0.0
-- Tokenizers 0.22.0
+- Transformers 4.52.4
+- Pytorch 2.6.0+cu124
+- Datasets 3.6.0
+- Tokenizers 0.21.2
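For readers wanting to reproduce the updated run, the new hyperparameter list maps onto `transformers.TrainingArguments` keyword arguments roughly as below. This is a sketch, not the author's script: the kwarg names come from the Transformers Trainer API, the card does not state the device count (so treating `train_batch_size` as a per-device value is an assumption), and `output_dir` is not given on the card.

```python
# Keyword arguments one would pass to transformers.TrainingArguments
# to match the card's hyperparameter list. Anything not on the card
# (e.g. output_dir, device count) is left out or assumed.
training_kwargs = {
    "learning_rate": 7e-05,
    "per_device_train_batch_size": 64,   # card: train_batch_size (per-device is an assumption)
    "per_device_eval_batch_size": 16,    # card: eval_batch_size
    "seed": 42,
    "optim": "adamw_torch",              # card: OptimizerNames.ADAMW_TORCH
    "adam_beta1": 0.9,                   # card: betas=(0.9, 0.999)
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 6,
}
```

These would be used as `TrainingArguments(output_dir="...", **training_kwargs)` together with a `Trainer` fine-tuning `answerdotai/ModernBERT-large`.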
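The card reports Accuracy and F1 on the evaluation set but does not say how the F1 score is averaged over classes. A minimal pure-Python sketch of accuracy plus macro-averaged F1 (the macro averaging is an assumption; the run may have used weighted averaging instead) looks like this:

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Accuracy and macro-averaged F1 for integer class labels.

    Macro averaging (unweighted mean of per-class F1) is an assumption;
    the model card does not state the averaging scheme.
    """
    assert len(y_true) == len(y_pred) and y_true
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    f1_per_class = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1_per_class.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return acc, sum(f1_per_class) / len(f1_per_class)
```

In a Trainer setup this logic would live inside a `compute_metrics` callback that first takes the argmax over the model's logits.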