End of training
README.md CHANGED
@@ -16,8 +16,8 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
-- Classification Report: {'0': {'precision': 0.

 ## Model description

@@ -37,24 +37,22 @@ More information needed

 The following hyperparameters were used during training:
 - learning_rate: 5e-06
-- train_batch_size:
-- eval_batch_size:
 - seed: 42
-- distributed_type: multi-GPU
-- num_devices: 2
-- total_train_batch_size: 256
-- total_eval_batch_size: 256
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs:

 ### Training results

-| Training Loss | Epoch | Step | Validation Loss | Classification Report
-
-| No log | 1.0 |
-| No log | 2.0 |
-| No log | 3.0 |


 ### Framework versions
@@ -16,8 +16,8 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [answerdotai/ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.1851
+- Classification Report: {'0': {'precision': 0.9412477286493035, 'recall': 0.9761306532663316, 'f1-score': 0.9583718778908418, 'support': 1592.0}, '1': {'precision': 0.8109452736318408, 'recall': 0.6269230769230769, 'f1-score': 0.7071583514099783, 'support': 260.0}, 'accuracy': 0.9271058315334774, 'macro avg': {'precision': 0.8760965011405721, 'recall': 0.8015268650947043, 'f1-score': 0.83276511465041, 'support': 1852.0}, 'weighted avg': {'precision': 0.9229547274049513, 'recall': 0.9271058315334774, 'f1-score': 0.9231043201775456, 'support': 1852.0}}

 ## Model description

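The final classification report added in this hunk is written as a Python dict literal, so its aggregate rows can be sanity-checked against the per-class rows. The following is a minimal sketch, assuming the report follows the layout of scikit-learn's `classification_report(output_dict=True)` (the key names match, but the card does not say how the report was produced). The helper name `check_report` is hypothetical.

```python
import ast

# Final-epoch report, copied verbatim from the model card diff.
REPORT_TEXT = (
    "{'0': {'precision': 0.9412477286493035, 'recall': 0.9761306532663316, "
    "'f1-score': 0.9583718778908418, 'support': 1592.0}, "
    "'1': {'precision': 0.8109452736318408, 'recall': 0.6269230769230769, "
    "'f1-score': 0.7071583514099783, 'support': 260.0}, "
    "'accuracy': 0.9271058315334774, "
    "'macro avg': {'precision': 0.8760965011405721, 'recall': 0.8015268650947043, "
    "'f1-score': 0.83276511465041, 'support': 1852.0}, "
    "'weighted avg': {'precision': 0.9229547274049513, 'recall': 0.9271058315334774, "
    "'f1-score': 0.9231043201775456, 'support': 1852.0}}"
)

AGGREGATE_KEYS = ("accuracy", "macro avg", "weighted avg")


def check_report(report):
    """Recompute aggregate metrics from the per-class rows (hypothetical helper)."""
    classes = [key for key in report if key not in AGGREGATE_KEYS]
    total = sum(report[c]["support"] for c in classes)
    # Macro F1 is the unweighted mean of per-class F1 scores.
    macro_f1 = sum(report[c]["f1-score"] for c in classes) / len(classes)
    # Weighted F1 weights each class by its support.
    weighted_f1 = sum(
        report[c]["f1-score"] * report[c]["support"] for c in classes
    ) / total
    # For single-label classification, accuracy equals support-weighted recall.
    accuracy = sum(
        report[c]["recall"] * report[c]["support"] for c in classes
    ) / total
    return {
        "support": total,
        "macro_f1": macro_f1,
        "weighted_f1": weighted_f1,
        "accuracy": accuracy,
    }


report = ast.literal_eval(REPORT_TEXT)
recomputed = check_report(report)
for name, value in recomputed.items():
    print(f"{name}: {value:.4f}")
```

On these numbers the recomputed aggregates agree with the reported rows. The per-class rows also show that class `'1'` (260 of 1852 evaluation examples) has much lower recall than class `'0'`, which the headline accuracy alone does not reveal.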
|
|
| 37 |
|
| 38 |
The following hyperparameters were used during training:
|
| 39 |
- learning_rate: 5e-06
|
| 40 |
+
- train_batch_size: 64
|
| 41 |
+
- eval_batch_size: 64
|
| 42 |
- seed: 42
|
|
|
|
|
|
|
|
|
|
|
|
|
| 43 |
- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
|
| 44 |
- lr_scheduler_type: linear
|
| 45 |
+
- num_epochs: 5
|
| 46 |
|
| 47 |
### Training results
|
| 48 |
|
| 49 |
+
| Training Loss | Epoch | Step | Validation Loss | Classification Report |
|
| 50 |
+
|:-------------:|:-----:|:----:|:---------------:|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------:|
|
| 51 |
+
| No log | 1.0 | 98 | 0.2211 | {'0': {'precision': 0.9382716049382716, 'recall': 0.9547738693467337, 'f1-score': 0.9464508094645081, 'support': 1592.0}, '1': {'precision': 0.6896551724137931, 'recall': 0.6153846153846154, 'f1-score': 0.6504065040650406, 'support': 260.0}, 'accuracy': 0.9071274298056156, 'macro avg': {'precision': 0.8139633886760324, 'recall': 0.7850792423656745, 'f1-score': 0.7984286567647744, 'support': 1852.0}, 'weighted avg': {'precision': 0.9033686500482261, 'recall': 0.9071274298056156, 'f1-score': 0.9048895138900688, 'support': 1852.0}} |
|
| 52 |
+
| No log | 2.0 | 196 | 0.2076 | {'0': {'precision': 0.9136231884057971, 'recall': 0.9899497487437185, 'f1-score': 0.9502562556526982, 'support': 1592.0}, '1': {'precision': 0.8740157480314961, 'recall': 0.4269230769230769, 'f1-score': 0.5736434108527132, 'support': 260.0}, 'accuracy': 0.9109071274298056, 'macro avg': {'precision': 0.8938194682186467, 'recall': 0.7084364128333978, 'f1-score': 0.7619498332527057, 'support': 1852.0}, 'weighted avg': {'precision': 0.9080627486124287, 'recall': 0.9109071274298056, 'f1-score': 0.8973840420198709, 'support': 1852.0}} |
|
| 53 |
+
| No log | 3.0 | 294 | 0.1986 | {'0': {'precision': 0.96248382923674, 'recall': 0.9346733668341709, 'f1-score': 0.9483747609942639, 'support': 1592.0}, '1': {'precision': 0.6601307189542484, 'recall': 0.7769230769230769, 'f1-score': 0.7137809187279152, 'support': 260.0}, 'accuracy': 0.9125269978401728, 'macro avg': {'precision': 0.8113072740954942, 'recall': 0.855798221878624, 'f1-score': 0.8310778398610895, 'support': 1852.0}, 'weighted avg': {'precision': 0.9200368483115523, 'recall': 0.9125269978401728, 'f1-score': 0.9154404202873251, 'support': 1852.0}} |
|
| 54 |
+
| No log | 4.0 | 392 | 0.1968 | {'0': {'precision': 0.9618863049095607, 'recall': 0.9353015075376885, 'f1-score': 0.9484076433121019, 'support': 1592.0}, '1': {'precision': 0.6611842105263158, 'recall': 0.7730769230769231, 'f1-score': 0.7127659574468085, 'support': 260.0}, 'accuracy': 0.9125269978401728, 'macro avg': {'precision': 0.8115352577179382, 'recall': 0.8541892153073058, 'f1-score': 0.8305868003794552, 'support': 1852.0}, 'weighted avg': {'precision': 0.9196711080739, 'recall': 0.9125269978401728, 'f1-score': 0.9153261971323092, 'support': 1852.0}} |
|
| 55 |
+
| No log | 5.0 | 490 | 0.1851 | {'0': {'precision': 0.9412477286493035, 'recall': 0.9761306532663316, 'f1-score': 0.9583718778908418, 'support': 1592.0}, '1': {'precision': 0.8109452736318408, 'recall': 0.6269230769230769, 'f1-score': 0.7071583514099783, 'support': 260.0}, 'accuracy': 0.9271058315334774, 'macro avg': {'precision': 0.8760965011405721, 'recall': 0.8015268650947043, 'f1-score': 0.83276511465041, 'support': 1852.0}, 'weighted avg': {'precision': 0.9229547274049513, 'recall': 0.9271058315334774, 'f1-score': 0.9231043201775456, 'support': 1852.0}} |
|
| 56 |
|
| 57 |
|
| 58 |
### Framework versions
|
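The per-epoch rows in the updated training results table can be compared under different checkpoint-selection criteria. The sketch below copies the validation loss, class `'1'` F1, and macro F1 for each epoch from the table and picks the best epoch under each. The choice of criteria and the helper names `best_epoch` and `linear_lr` are illustrative assumptions, since the card does not state how a checkpoint was selected. The sketch also evaluates the learning rate implied by `lr_scheduler_type: linear`, assuming zero warmup (the card lists no warmup setting) and 490 total optimizer steps.

```python
# (epoch, step, validation loss, class '1' F1, macro F1), copied from the table.
ROWS = [
    (1, 98, 0.2211, 0.6504065040650406, 0.7984286567647744),
    (2, 196, 0.2076, 0.5736434108527132, 0.7619498332527057),
    (3, 294, 0.1986, 0.7137809187279152, 0.8310778398610895),
    (4, 392, 0.1968, 0.7127659574468085, 0.8305868003794552),
    (5, 490, 0.1851, 0.7071583514099783, 0.83276511465041),
]

BASE_LR = 5e-06    # learning_rate from the hyperparameter list
TOTAL_STEPS = 490  # last value in the Step column


def best_epoch(rows, column, higher_is_better):
    """Return the epoch whose value in `column` is best (hypothetical helper)."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[column])[0]


def linear_lr(step):
    """Learning rate under linear decay to zero, assuming no warmup."""
    return BASE_LR * max(0, TOTAL_STEPS - step) / TOTAL_STEPS


by_loss = best_epoch(ROWS, 2, higher_is_better=False)
by_positive_f1 = best_epoch(ROWS, 3, higher_is_better=True)
by_macro_f1 = best_epoch(ROWS, 4, higher_is_better=True)

# Step count divided by epoch number gives optimizer steps per epoch.
steps_per_epoch = {step // epoch for epoch, step, *_ in ROWS}

print("best by validation loss:", by_loss)
print("best by class '1' F1:", by_positive_f1)
print("best by macro F1:", by_macro_f1)
print("steps per epoch:", steps_per_epoch)
print("lr at each epoch end:", [linear_lr(step) for _, step, *_ in ROWS])
```

On this table, validation loss and macro F1 both favor epoch 5, whose metrics match the headline results in the card, while class `'1'` F1 peaks at epoch 3. Which criterion matters depends on how costly missed positives are for the intended use, which the card does not describe.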