eternis committed
Commit 1bb1e6d · verified · 1 Parent(s): dea38d1

Model save

Files changed (2)
  1. README.md +24 -48
  2. model.safetensors +1 -1
README.md CHANGED
@@ -16,19 +16,10 @@ should probably proofread and complete it, then remove this comment. -->
16
 
17
  This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the None dataset.
18
  It achieves the following results on the evaluation set:
19
- - Loss: 0.9091
20
- - Complexity Accuracy: 0.7714
21
- - Model Accuracy: 0.3943
22
- - Overall Accuracy: 0.2892
23
- - Comp Acc Class 0: 0.9333
24
- - Comp Acc Class 1: 0.7614
25
- - Comp Acc Class 2: 0.7030
26
- - Model Acc Class 0: 0.4166
27
- - Model Acc Class 1: 0.3263
28
- - Model Acc Class 2: 0.3706
29
- - Model Acc Class 3: 0.224
30
- - Complexity Macro F1: 0.7794
31
- - Model Macro F1: 0.2693
32
 
33
  ## Model description
34
 
@@ -47,50 +38,35 @@ More information needed
47
  ### Training hyperparameters
48
 
49
  The following hyperparameters were used during training:
50
- - learning_rate: 0.0005
51
- - train_batch_size: 16
52
  - eval_batch_size: 32
53
  - seed: 42
54
  - gradient_accumulation_steps: 2
55
- - total_train_batch_size: 32
56
  - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
57
  - lr_scheduler_type: cosine
58
- - lr_scheduler_warmup_ratio: 0.02
59
  - num_epochs: 10
60
 
61
  ### Training results
62
 
63
- | Training Loss | Epoch | Step | Validation Loss | Complexity Accuracy | Model Accuracy | Overall Accuracy | Comp Acc Class 0 | Comp Acc Class 1 | Comp Acc Class 2 | Model Acc Class 0 | Model Acc Class 1 | Model Acc Class 2 | Model Acc Class 3 | Complexity Macro F1 | Model Macro F1 |
64
- |:-------------:|:------:|:----:|:---------------:|:-------------------:|:--------------:|:----------------:|:----------------:|:----------------:|:----------------:|:-----------------:|:-----------------:|:-----------------:|:-----------------:|:-------------------:|:--------------:|
65
- | 0.9723 | 0.3429 | 300 | 0.9664 | 0.7293 | 0.3019 | 0.2105 | 0.9240 | 0.7113 | 0.6576 | 0.3374 | 0.0096 | 0.2388 | 0.88 | 0.7377 | 0.1978 |
66
- | 0.8984 | 0.6857 | 600 | 0.9349 | 0.7572 | 0.3106 | 0.2197 | 0.9194 | 0.8587 | 0.4901 | 0.3387 | 0.0077 | 0.3284 | 0.848 | 0.7460 | 0.2052 |
67
- | 0.8743 | 1.0286 | 900 | 0.9386 | 0.7537 | 0.3183 | 0.2244 | 0.9395 | 0.7604 | 0.6427 | 0.3286 | 0.1228 | 0.3731 | 0.712 | 0.7596 | 0.2343 |
68
- | 0.8401 | 1.3714 | 1200 | 0.9377 | 0.7649 | 0.3126 | 0.2234 | 0.9705 | 0.7984 | 0.5957 | 0.3498 | 0.0499 | 0.2139 | 0.84 | 0.7603 | 0.2145 |
69
- | 0.8264 | 1.7143 | 1500 | 0.8737 | 0.7527 | 0.3945 | 0.2717 | 0.9132 | 0.7178 | 0.7294 | 0.4321 | 0.1017 | 0.4303 | 0.608 | 0.7664 | 0.2638 |
70
- | 0.8233 | 2.0571 | 1800 | 0.8828 | 0.7781 | 0.3885 | 0.2867 | 0.9240 | 0.8508 | 0.5710 | 0.4152 | 0.2342 | 0.3557 | 0.504 | 0.7725 | 0.2724 |
71
- | 0.8009 | 2.4 | 2100 | 0.9305 | 0.7517 | 0.3484 | 0.2458 | 0.9488 | 0.7108 | 0.7195 | 0.3809 | 0.0672 | 0.3806 | 0.648 | 0.7620 | 0.2374 |
72
- | 0.7699 | 2.7429 | 2400 | 0.9213 | 0.7367 | 0.3674 | 0.2535 | 0.9473 | 0.6464 | 0.7855 | 0.3657 | 0.2361 | 0.5423 | 0.392 | 0.7546 | 0.2699 |
73
- | 0.7815 | 3.0857 | 2700 | 0.8605 | 0.7746 | 0.4279 | 0.3146 | 0.9767 | 0.8146 | 0.5957 | 0.4681 | 0.2476 | 0.3831 | 0.368 | 0.7728 | 0.2870 |
74
- | 0.7628 | 3.4286 | 3000 | 0.8406 | 0.7731 | 0.4620 | 0.3467 | 0.9442 | 0.7702 | 0.6873 | 0.5322 | 0.1593 | 0.3632 | 0.376 | 0.7801 | 0.2865 |
75
- | 0.7478 | 3.7714 | 3300 | 0.8976 | 0.7746 | 0.3701 | 0.2717 | 0.9426 | 0.8262 | 0.5932 | 0.3920 | 0.1939 | 0.4254 | 0.408 | 0.7721 | 0.2596 |
76
- | 0.7182 | 4.1143 | 3600 | 0.9086 | 0.7761 | 0.3726 | 0.2712 | 0.9395 | 0.8184 | 0.6139 | 0.3987 | 0.1843 | 0.4204 | 0.384 | 0.7756 | 0.2575 |
77
- | 0.7226 | 4.4571 | 3900 | 0.8708 | 0.7604 | 0.4182 | 0.2984 | 0.9132 | 0.7382 | 0.7186 | 0.4523 | 0.2169 | 0.4602 | 0.312 | 0.7717 | 0.2808 |
78
- | 0.7051 | 4.8 | 4200 | 0.9171 | 0.7517 | 0.3833 | 0.2648 | 0.9519 | 0.6719 | 0.7871 | 0.4024 | 0.2438 | 0.4254 | 0.376 | 0.7684 | 0.2700 |
79
- | 0.6826 | 5.1429 | 4500 | 0.8959 | 0.7626 | 0.3900 | 0.2854 | 0.9147 | 0.7535 | 0.6980 | 0.4102 | 0.2649 | 0.4328 | 0.296 | 0.7723 | 0.2708 |
80
- | 0.6957 | 5.4857 | 4800 | 0.8982 | 0.7719 | 0.4095 | 0.3014 | 0.8915 | 0.8049 | 0.6493 | 0.4439 | 0.2399 | 0.3930 | 0.352 | 0.7752 | 0.2771 |
81
- | 0.6686 | 5.8286 | 5100 | 0.8987 | 0.7631 | 0.4095 | 0.2939 | 0.9271 | 0.7437 | 0.7104 | 0.4493 | 0.2284 | 0.3856 | 0.296 | 0.7732 | 0.2716 |
82
- | 0.6681 | 6.1714 | 5400 | 0.8870 | 0.7694 | 0.4147 | 0.3046 | 0.9566 | 0.7558 | 0.6939 | 0.4580 | 0.2860 | 0.2960 | 0.304 | 0.7789 | 0.2731 |
83
- | 0.6526 | 6.5143 | 5700 | 0.8983 | 0.7746 | 0.3980 | 0.2892 | 0.9426 | 0.7692 | 0.6947 | 0.4243 | 0.2726 | 0.4055 | 0.272 | 0.7815 | 0.2714 |
84
- | 0.656 | 6.8571 | 6000 | 0.9080 | 0.7582 | 0.4 | 0.2889 | 0.9364 | 0.7164 | 0.7376 | 0.4216 | 0.2879 | 0.4527 | 0.184 | 0.7712 | 0.2708 |
85
- | 0.6421 | 7.2 | 6300 | 0.9028 | 0.7768 | 0.3893 | 0.2889 | 0.9349 | 0.8072 | 0.6386 | 0.4068 | 0.3551 | 0.3507 | 0.24 | 0.7791 | 0.2690 |
86
- | 0.6366 | 7.5429 | 6600 | 0.8946 | 0.7731 | 0.4087 | 0.3019 | 0.9504 | 0.7905 | 0.6477 | 0.4425 | 0.2399 | 0.4254 | 0.256 | 0.7764 | 0.2729 |
87
- | 0.6463 | 7.8857 | 6900 | 0.8931 | 0.7706 | 0.4092 | 0.2986 | 0.9380 | 0.7581 | 0.7038 | 0.4419 | 0.3186 | 0.3358 | 0.248 | 0.7790 | 0.2740 |
88
- | 0.6405 | 8.2286 | 7200 | 0.8919 | 0.7706 | 0.4110 | 0.3009 | 0.9380 | 0.7581 | 0.7038 | 0.4415 | 0.2802 | 0.4129 | 0.224 | 0.7784 | 0.2756 |
89
- | 0.6101 | 8.5714 | 7500 | 0.9054 | 0.7729 | 0.3988 | 0.2939 | 0.9302 | 0.7614 | 0.7096 | 0.4247 | 0.3225 | 0.3607 | 0.224 | 0.7813 | 0.2700 |
90
- | 0.6366 | 8.9143 | 7800 | 0.9038 | 0.7696 | 0.3995 | 0.2932 | 0.9240 | 0.7567 | 0.7104 | 0.4250 | 0.3186 | 0.3706 | 0.224 | 0.7778 | 0.2711 |
91
- | 0.6306 | 9.2571 | 8100 | 0.9099 | 0.7704 | 0.3915 | 0.2859 | 0.9240 | 0.7521 | 0.7211 | 0.4125 | 0.3186 | 0.3856 | 0.216 | 0.7794 | 0.2682 |
92
- | 0.6293 | 9.6 | 8400 | 0.9093 | 0.7716 | 0.3938 | 0.2889 | 0.9333 | 0.7618 | 0.7030 | 0.4159 | 0.3244 | 0.3731 | 0.224 | 0.7795 | 0.2691 |
93
- | 0.6177 | 9.9429 | 8700 | 0.9091 | 0.7714 | 0.3943 | 0.2892 | 0.9333 | 0.7614 | 0.7030 | 0.4166 | 0.3263 | 0.3706 | 0.224 | 0.7794 | 0.2693 |
94
 
95
 
96
  ### Framework versions
 
16
 
17
  This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the None dataset.
18
  It achieves the following results on the evaluation set:
19
+ - Loss: 0.6852
20
+ - Complexity Accuracy: 0.772
21
+ - Model Accuracy: 0.747
22
+ - Overall Accuracy: 0.5793
23
 
24
  ## Model description
25
 
 
38
  ### Training hyperparameters
39
 
40
  The following hyperparameters were used during training:
41
+ - learning_rate: 0.0001
42
+ - train_batch_size: 32
43
  - eval_batch_size: 32
44
  - seed: 42
45
  - gradient_accumulation_steps: 2
46
+ - total_train_batch_size: 64
47
  - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
48
  - lr_scheduler_type: cosine
49
+ - lr_scheduler_warmup_ratio: 0.01
50
  - num_epochs: 10
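
The batch-size entries in this list are related by simple arithmetic: the reported total is the per-device batch size multiplied by the gradient accumulation steps and by the number of devices. The sketch below checks both sides of the diff. It assumes a single training device, which the card does not state but which is consistent with the reported totals; the function name is hypothetical.

```python
def total_train_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Effective number of examples consumed per optimizer update.

    num_devices=1 is an assumption; the model card does not report it.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the hyperparameter lists on each side of the diff.
old_total = total_train_batch_size(per_device_batch_size=16, gradient_accumulation_steps=2)
new_total = total_train_batch_size(per_device_batch_size=32, gradient_accumulation_steps=2)

print(old_total, new_total)
```

Under that assumption the results match the `total_train_batch_size` values in the card (32 before this commit, 64 after).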
51
 
52
  ### Training results
53
 
54
+ | Training Loss | Epoch | Step | Validation Loss | Complexity Accuracy | Model Accuracy | Overall Accuracy |
55
+ |:-------------:|:------:|:----:|:---------------:|:-------------------:|:--------------:|:----------------:|
56
+ | 0.8284 | 0.6857 | 300 | 0.7391 | 0.7275 | 0.7475 | 0.5437 |
57
+ | 0.7657 | 1.3703 | 600 | 0.7173 | 0.7408 | 0.7478 | 0.5515 |
58
+ | 0.7398 | 2.0549 | 900 | 0.7099 | 0.7502 | 0.748 | 0.5595 |
59
+ | 0.7161 | 2.7406 | 1200 | 0.7037 | 0.7578 | 0.748 | 0.5645 |
60
+ | 0.7057 | 3.4251 | 1500 | 0.6973 | 0.7635 | 0.7468 | 0.569 |
61
+ | 0.7115 | 4.1097 | 1800 | 0.6927 | 0.764 | 0.748 | 0.5705 |
62
+ | 0.7214 | 4.7954 | 2100 | 0.6896 | 0.7672 | 0.7482 | 0.5755 |
63
+ | 0.7034 | 5.48 | 2400 | 0.6886 | 0.769 | 0.7472 | 0.5777 |
64
+ | 0.6935 | 6.1646 | 2700 | 0.6878 | 0.769 | 0.7478 | 0.577 |
65
+ | 0.7055 | 6.8503 | 3000 | 0.6867 | 0.7722 | 0.7465 | 0.5787 |
66
+ | 0.6983 | 7.5349 | 3300 | 0.6858 | 0.7728 | 0.7465 | 0.5797 |
67
+ | 0.7092 | 8.2194 | 3600 | 0.6849 | 0.774 | 0.747 | 0.5803 |
68
+ | 0.697 | 8.9051 | 3900 | 0.6851 | 0.7718 | 0.747 | 0.5787 |
69
+ | 0.6989 | 9.5897 | 4200 | 0.6852 | 0.772 | 0.747 | 0.5793 |
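
The Epoch and Step columns can be used to estimate how many optimizer steps make up one epoch, which helps when comparing this table with the one it replaces. The sketch below is a rough estimate from the first logged row of each run; the function name is hypothetical, and the batch totals (32 and 64) are the `total_train_batch_size` values from the card.

```python
def steps_per_epoch(step, epoch):
    """Estimate optimizer steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


# First evaluation row of each run: step 300 in both tables.
old_steps = steps_per_epoch(step=300, epoch=0.3429)  # run removed by this commit
new_steps = steps_per_epoch(step=300, epoch=0.6857)  # run added by this commit

# Approximate examples seen per epoch = steps per epoch * total batch size.
old_examples = old_steps * 32
new_examples = new_steps * 64

print(round(old_steps, 1), round(new_steps, 1))
print(round(old_examples), round(new_examples))
```

Both estimates come out near 28,000 examples per epoch. That suggests the two runs used a training set of similar size, and that the halved step count follows from the doubled batch size. The figures are approximate because the logged epoch values are rounded.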
70
 
71
 
72
  ### Framework versions
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:10aec3531301f5f7e619f12e219f65b4037d1b191e38e98c3486ff79c92a7893
3
  size 597632156
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ddda65714e2f58101eb9cce10d5d5bdc15236e1a9f7cdf53455428d7a54cde39
3
  size 597632156
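
The `model.safetensors` change above is a Git LFS pointer update: the `oid` is the SHA-256 digest of the stored file and `size` is its length in bytes. Only the digest changes here, while the size stays at 597632156 bytes, which is what one would expect when weights are retrained without changing the architecture. As a minimal sketch, a downloaded file could be checked against the pointer like this; both function names are hypothetical helpers, not part of any library.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def file_matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file's SHA-256 digest and byte size match the pointer."""
    hasher = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so large weight files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return (
        pointer["algo"] == "sha256"
        and hasher.hexdigest() == pointer["digest"]
        and size == pointer["size"]
    )


# Pointer contents as shown on the new side of the diff.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:ddda65714e2f58101eb9cce10d5d5bdc15236e1a9f7cdf53455428d7a54cde39\n"
    "size 597632156\n"
)
```

Calling `file_matches_pointer("model.safetensors", new_pointer)` on a local copy would then report whether the download corresponds to this commit; the local path is an assumption.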