harun27 committed on
Commit 8f63d18 · verified · 1 Parent(s): 8f02db8

End of training

Files changed (1)
  1. README.md +12 -14
README.md CHANGED
@@ -16,8 +16,8 @@ should probably proofread and complete it, then remove this comment. -->
16
 
17
  This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
18
  It achieves the following results on the evaluation set:
19
- - Loss: 0.2458
20
- - Classification Report: {'0': {'precision': 0.930564166150031, 'recall': 0.9428391959798995, 'f1-score': 0.9366614664586583, 'support': 1592.0}, '1': {'precision': 0.6192468619246861, 'recall': 0.5692307692307692, 'f1-score': 0.593186372745491, 'support': 260.0}, 'accuracy': 0.8903887688984882, 'macro avg': {'precision': 0.7749055140373586, 'recall': 0.7560349826053343, 'f1-score': 0.7649239196020747, 'support': 1852.0}, 'weighted avg': {'precision': 0.8868587130730388, 'recall': 0.8903887688984882, 'f1-score': 0.8884414209049739, 'support': 1852.0}}
21
 
22
  ## Model description
23
 
@@ -37,24 +37,22 @@ More information needed
37
 
38
  The following hyperparameters were used during training:
39
  - learning_rate: 5e-06
40
- - train_batch_size: 128
41
- - eval_batch_size: 128
42
  - seed: 42
43
- - distributed_type: multi-GPU
44
- - num_devices: 2
45
- - total_train_batch_size: 256
46
- - total_eval_batch_size: 256
47
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
48
  - lr_scheduler_type: linear
49
- - num_epochs: 3
50
 
51
  ### Training results
52
 
53
- | Training Loss | Epoch | Step | Validation Loss | Classification Report |
54
- |:-------------:|:-----:|:----:|:---------------:|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
55
- | No log | 1.0 | 25 | 0.2760 | {'0': {'precision': 0.8786867000556483, 'recall': 0.9918341708542714, 'f1-score': 0.9318383003835939, 'support': 1592.0}, '1': {'precision': 0.7636363636363637, 'recall': 0.16153846153846155, 'f1-score': 0.26666666666666666, 'support': 260.0}, 'accuracy': 0.8752699784017278, 'macro avg': {'precision': 0.8211615318460059, 'recall': 0.5766863161963665, 'f1-score': 0.5992524835251303, 'support': 1852.0}, 'weighted avg': {'precision': 0.8625349249643881, 'recall': 0.8752699784017278, 'f1-score': 0.8384556736198784, 'support': 1852.0}} |
56
- | No log | 2.0 | 50 | 0.2626 | {'0': {'precision': 0.8689240851993446, 'recall': 0.9993718592964824, 'f1-score': 0.9295939234589541, 'support': 1592.0}, '1': {'precision': 0.9523809523809523, 'recall': 0.07692307692307693, 'f1-score': 0.1423487544483986, 'support': 260.0}, 'accuracy': 0.8698704103671706, 'macro avg': {'precision': 0.9106525187901484, 'recall': 0.5381474681097796, 'f1-score': 0.5359713389536763, 'support': 1852.0}, 'weighted avg': {'precision': 0.8806404920390952, 'recall': 0.8698704103671706, 'f1-score': 0.81907354336028, 'support': 1852.0}} |
57
- | No log | 3.0 | 75 | 0.2458 | {'0': {'precision': 0.930564166150031, 'recall': 0.9428391959798995, 'f1-score': 0.9366614664586583, 'support': 1592.0}, '1': {'precision': 0.6192468619246861, 'recall': 0.5692307692307692, 'f1-score': 0.593186372745491, 'support': 260.0}, 'accuracy': 0.8903887688984882, 'macro avg': {'precision': 0.7749055140373586, 'recall': 0.7560349826053343, 'f1-score': 0.7649239196020747, 'support': 1852.0}, 'weighted avg': {'precision': 0.8868587130730388, 'recall': 0.8903887688984882, 'f1-score': 0.8884414209049739, 'support': 1852.0}} |
 
 
58
 
59
 
60
  ### Framework versions
 
16
 
17
  This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
18
  It achieves the following results on the evaluation set:
19
+ - Loss: 0.1851
20
+ - Classification Report: {'0': {'precision': 0.9412477286493035, 'recall': 0.9761306532663316, 'f1-score': 0.9583718778908418, 'support': 1592.0}, '1': {'precision': 0.8109452736318408, 'recall': 0.6269230769230769, 'f1-score': 0.7071583514099783, 'support': 260.0}, 'accuracy': 0.9271058315334774, 'macro avg': {'precision': 0.8760965011405721, 'recall': 0.8015268650947043, 'f1-score': 0.83276511465041, 'support': 1852.0}, 'weighted avg': {'precision': 0.9229547274049513, 'recall': 0.9271058315334774, 'f1-score': 0.9231043201775456, 'support': 1852.0}}
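
The per-class precision, recall, and support in the report above are enough to recover the underlying confusion-matrix counts, which are often easier to read than the raw dictionary. The sketch below is only an illustration: the helper name `confusion_from_report` is hypothetical, and it assumes a binary task with class `'1'` treated as the positive class, which the model card does not state.

```python
def confusion_from_report(report, positive="1", negative="0"):
    """Recover binary confusion-matrix counts from a classification report.

    Assumes the report follows the precision/recall/support layout shown in
    the model card, with exactly two classes.
    """
    pos, neg = report[positive], report[negative]
    tp = round(pos["recall"] * pos["support"])       # recall = tp / support
    fn = round(pos["support"]) - tp
    predicted_pos = round(tp / pos["precision"])     # precision = tp / (tp + fp)
    fp = predicted_pos - tp
    tn = round(neg["support"]) - fp
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


# Values copied from the updated evaluation report (loss 0.1851).
report = {
    "0": {"precision": 0.9412477286493035, "recall": 0.9761306532663316, "support": 1592.0},
    "1": {"precision": 0.8109452736318408, "recall": 0.6269230769230769, "support": 260.0},
}

counts = confusion_from_report(report)
accuracy = (counts["tp"] + counts["tn"]) / sum(counts.values())
print(counts, accuracy)
```

Under these assumptions the recovered accuracy agrees with the `accuracy` field reported in the card, which is a quick consistency check on the numbers.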
21
 
22
  ## Model description
23
 
 
37
 
38
  The following hyperparameters were used during training:
39
  - learning_rate: 5e-06
40
+ - train_batch_size: 64
41
+ - eval_batch_size: 64
42
  - seed: 42
 
 
 
 
43
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
44
  - lr_scheduler_type: linear
45
+ - num_epochs: 5
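
The batch-size change interacts with the step counts in the training-results tables (25 steps per epoch before, 98 after). A rough sketch of that arithmetic follows. It assumes no gradient accumulation, that the last partial batch is kept, and that the new run used a single device because the `distributed_type` and `num_devices` entries were removed; none of these is confirmed by the card, and both helper names are hypothetical.

```python
def effective_batch_size(per_device, num_devices=1, grad_accum=1):
    """Samples consumed per optimizer step."""
    return per_device * num_devices * grad_accum


def train_set_size_bounds(steps_per_epoch, batch):
    """Smallest and largest dataset size that yields this many steps per epoch,
    assuming the final partial batch is not dropped."""
    return (steps_per_epoch - 1) * batch + 1, steps_per_epoch * batch


old_batch = effective_batch_size(128, num_devices=2)  # matches total_train_batch_size: 256
new_batch = effective_batch_size(64)                  # assumption: single device

old_bounds = train_set_size_bounds(25, old_batch)
new_bounds = train_set_size_bounds(98, new_batch)

# Range of training-set sizes consistent with both runs.
overlap = (max(old_bounds[0], new_bounds[0]), min(old_bounds[1], new_bounds[1]))

old_updates = 25 * 3  # steps per epoch * num_epochs
new_updates = 98 * 5
print(old_batch, new_batch, overlap, old_updates, new_updates)
```

If these assumptions hold, the two runs are consistent with the same training set, and the new configuration performs roughly 6.5 times as many optimizer updates (490 versus 75) at the same learning rate.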
46
 
47
  ### Training results
48
 
49
+ | Training Loss | Epoch | Step | Validation Loss | Classification Report |
50
+ |:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
51
+ | No log | 1.0 | 98 | 0.2211 | {'0': {'precision': 0.9382716049382716, 'recall': 0.9547738693467337, 'f1-score': 0.9464508094645081, 'support': 1592.0}, '1': {'precision': 0.6896551724137931, 'recall': 0.6153846153846154, 'f1-score': 0.6504065040650406, 'support': 260.0}, 'accuracy': 0.9071274298056156, 'macro avg': {'precision': 0.8139633886760324, 'recall': 0.7850792423656745, 'f1-score': 0.7984286567647744, 'support': 1852.0}, 'weighted avg': {'precision': 0.9033686500482261, 'recall': 0.9071274298056156, 'f1-score': 0.9048895138900688, 'support': 1852.0}} |
52
+ | No log | 2.0 | 196 | 0.2076 | {'0': {'precision': 0.9136231884057971, 'recall': 0.9899497487437185, 'f1-score': 0.9502562556526982, 'support': 1592.0}, '1': {'precision': 0.8740157480314961, 'recall': 0.4269230769230769, 'f1-score': 0.5736434108527132, 'support': 260.0}, 'accuracy': 0.9109071274298056, 'macro avg': {'precision': 0.8938194682186467, 'recall': 0.7084364128333978, 'f1-score': 0.7619498332527057, 'support': 1852.0}, 'weighted avg': {'precision': 0.9080627486124287, 'recall': 0.9109071274298056, 'f1-score': 0.8973840420198709, 'support': 1852.0}} |
53
+ | No log | 3.0 | 294 | 0.1986 | {'0': {'precision': 0.96248382923674, 'recall': 0.9346733668341709, 'f1-score': 0.9483747609942639, 'support': 1592.0}, '1': {'precision': 0.6601307189542484, 'recall': 0.7769230769230769, 'f1-score': 0.7137809187279152, 'support': 260.0}, 'accuracy': 0.9125269978401728, 'macro avg': {'precision': 0.8113072740954942, 'recall': 0.855798221878624, 'f1-score': 0.8310778398610895, 'support': 1852.0}, 'weighted avg': {'precision': 0.9200368483115523, 'recall': 0.9125269978401728, 'f1-score': 0.9154404202873251, 'support': 1852.0}} |
54
+ | No log | 4.0 | 392 | 0.1968 | {'0': {'precision': 0.9618863049095607, 'recall': 0.9353015075376885, 'f1-score': 0.9484076433121019, 'support': 1592.0}, '1': {'precision': 0.6611842105263158, 'recall': 0.7730769230769231, 'f1-score': 0.7127659574468085, 'support': 260.0}, 'accuracy': 0.9125269978401728, 'macro avg': {'precision': 0.8115352577179382, 'recall': 0.8541892153073058, 'f1-score': 0.8305868003794552, 'support': 1852.0}, 'weighted avg': {'precision': 0.9196711080739, 'recall': 0.9125269978401728, 'f1-score': 0.9153261971323092, 'support': 1852.0}} |
55
+ | No log | 5.0 | 490 | 0.1851 | {'0': {'precision': 0.9412477286493035, 'recall': 0.9761306532663316, 'f1-score': 0.9583718778908418, 'support': 1592.0}, '1': {'precision': 0.8109452736318408, 'recall': 0.6269230769230769, 'f1-score': 0.7071583514099783, 'support': 260.0}, 'accuracy': 0.9271058315334774, 'macro avg': {'precision': 0.8760965011405721, 'recall': 0.8015268650947043, 'f1-score': 0.83276511465041, 'support': 1852.0}, 'weighted avg': {'precision': 0.9229547274049513, 'recall': 0.9271058315334774, 'f1-score': 0.9231043201775456, 'support': 1852.0}} |
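
Validation loss falls at every epoch in the table above, but the F1 score for class `'1'` does not move in step with it. The short sketch below compares the two checkpoint-selection criteria, using values rounded to four decimals from the table. The `history` list is an illustrative structure, not something the training script is known to produce.

```python
# (epoch, validation loss, class '1' F1), rounded from the training-results table.
history = [
    (1, 0.2211, 0.6504),
    (2, 0.2076, 0.5736),
    (3, 0.1986, 0.7138),
    (4, 0.1968, 0.7128),
    (5, 0.1851, 0.7072),
]

best_by_loss = min(history, key=lambda row: row[1])[0]
best_by_f1 = max(history, key=lambda row: row[2])[0]
print(best_by_loss, best_by_f1)
```

The two criteria pick different epochs here, so which checkpoint counts as "best" depends on whether loss or minority-class F1 matters more for the intended use.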
56
 
57
 
58
  ### Framework versions