maximuspowers/bert-philosophy-classifier

@@ -15,6 +15,16 @@ should probably proofread and complete it, then remove this comment. -->
 # bert-philosophy-classifier
 This model is a fine-tuned version of [maximuspowers/bert-philosophy-adapted](https://huggingface.co/maximuspowers/bert-philosophy-adapted) on the None dataset.
 ## Model description
@@ -42,11 +52,15 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 100
-- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
 ### Framework versions

 # bert-philosophy-classifier
 This model is a fine-tuned version of [maximuspowers/bert-philosophy-adapted](https://huggingface.co/maximuspowers/bert-philosophy-adapted) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.7200
+- Exact Match Accuracy: 0.2
+- Macro Precision: 0.1583
+- Macro Recall: 0.0909
+- Macro F1: 0.1152
+- Micro Precision: 0.8571
+- Micro Recall: 0.2105
+- Micro F1: 0.3380
+- Hamming Loss: 0.0691
 ## Model description
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - lr_scheduler_warmup_steps: 100
+- num_epochs: 50
 - mixed_precision_training: Native AMP
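
For readers unfamiliar with `lr_scheduler_type: linear` combined with `lr_scheduler_warmup_steps: 100`, the sketch below shows the usual shape of such a schedule as a multiplier on the base learning rate: a linear ramp from 0 to 1 during warmup, then a linear decay to 0. The function name `linear_warmup_multiplier` is hypothetical, and `total_steps=500` is an assumption taken from the final step in the training results table. The base learning rate is not visible in this hunk, so only the multiplier is computed.

```python
def linear_warmup_multiplier(step: int,
                             warmup_steps: int = 100,
                             total_steps: int = 500) -> float:
    """Return the factor applied to the base learning rate at `step`.

    Assumed shape: linear ramp-up over `warmup_steps`, then linear decay
    that reaches 0 at `total_steps`. Illustrative only, not the trainer's
    own code.
    """
    if step < warmup_steps:
        # Warmup phase: 0 -> 1 over the first `warmup_steps` steps.
        return step / max(1, warmup_steps)
    # Decay phase: 1 -> 0 over the remaining steps, clamped at 0.
    remaining = total_steps - step
    return max(0.0, remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for s in (0, 50, 100, 300, 500):
        print(s, linear_warmup_multiplier(s))
```

Under these assumptions the learning rate peaks at step 100 (epoch 10, at 10 steps per epoch) and has fallen to half its peak by step 300.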
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Exact Match Accuracy | Macro Precision | Macro Recall | Macro F1 | Micro Precision | Micro Recall | Micro F1 | Hamming Loss |
+|:-------------:|:-----:|:----:|:---------------:|:--------------------:|:---------------:|:------------:|:--------:|:---------------:|:------------:|:--------:|:------------:|
+| 0.811         | 25.0  | 250  | 0.7701          | 0.1                  | 0.1092          | 0.0615       | 0.0784   | 0.875           | 0.1228       | 0.2154   | 0.075        |
+| 0.58          | 50.0  | 500  | 0.7200          | 0.2                  | 0.1583          | 0.0909       | 0.1152   | 0.8571          | 0.2105       | 0.3380   | 0.0691       |
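
As a quick sanity check on the table above, micro F1 is the harmonic mean of micro precision and micro recall, so it can be recomputed from the two reported columns. The sketch below is a minimal illustration; the helper name `f1_from_precision_recall` is hypothetical. The same shortcut does not necessarily hold for the macro columns, since macro F1 is typically averaged over per-label F1 scores rather than derived from macro precision and macro recall.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # (micro precision, micro recall, reported micro F1) from the two table rows.
    rows = [
        (0.875, 0.1228, 0.2154),   # epoch 25
        (0.8571, 0.2105, 0.3380),  # epoch 50
    ]
    for p, r, reported in rows:
        recomputed = f1_from_precision_recall(p, r)
        print(f"reported={reported:.4f} recomputed={recomputed:.4f}")
```

The high micro precision with low micro recall in both rows suggests the model predicts few labels, but is usually right when it does.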
 ### Framework versions

all_results.json CHANGED

@@ -1,5 +1,5 @@
 {
-    "epoch": 5.0,
     "eval_exact_match_accuracy": 0.2,
     "eval_hamming_loss": 0.075,
     "eval_loss": 0.8420153856277466,
@@ -13,8 +13,8 @@
     "eval_samples_per_second": 180.125,
     "eval_steps_per_second": 13.509,
     "total_flos": 0.0,
-    "train_loss": 1.6828109741210937,
-    "train_runtime": 29.5351,
-    "train_samples_per_second": 53.496,
-    "train_steps_per_second": 1.693
 }

 {
+    "epoch": 50.0,
     "eval_exact_match_accuracy": 0.2,
     "eval_hamming_loss": 0.075,
     "eval_loss": 0.8420153856277466,
     "eval_samples_per_second": 180.125,
     "eval_steps_per_second": 13.509,
     "total_flos": 0.0,
+    "train_loss": 1.1355848159790038,
+    "train_runtime": 246.5817,
+    "train_samples_per_second": 64.076,
+    "train_steps_per_second": 2.028
 }

train_results.json CHANGED

@@ -1,8 +1,8 @@
 {
-    "epoch": 5.0,
     "total_flos": 0.0,
-    "train_loss": 1.6828109741210937,
-    "train_runtime": 29.5351,
-    "train_samples_per_second": 53.496,
-    "train_steps_per_second": 1.693
 }

 {
+    "epoch": 50.0,
     "total_flos": 0.0,
+    "train_loss": 1.1355848159790038,
+    "train_runtime": 246.5817,
+    "train_samples_per_second": 64.076,
+    "train_steps_per_second": 2.028
 }
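
The throughput fields in `train_results.json` can be cross-checked against each other. The sketch below multiplies the reported rates by `train_runtime` to recover the approximate total number of optimizer steps and samples seen, for both the old (5-epoch) and new (50-epoch) runs. The function name `derive_totals` is hypothetical, and the derived per-epoch figures are inferences from rounded values, not numbers stated in the commit.

```python
def derive_totals(epochs: float, runtime_s: float,
                  samples_per_s: float, steps_per_s: float) -> dict:
    """Recover approximate totals from the rates reported in train_results.json."""
    total_samples = samples_per_s * runtime_s
    total_steps = steps_per_s * runtime_s
    return {
        "total_steps": round(total_steps),
        "steps_per_epoch": round(total_steps / epochs),
        "samples_per_epoch": round(total_samples / epochs),
    }


if __name__ == "__main__":
    # Values copied from the removed (-) and added (+) lines of the diff.
    old_run = derive_totals(5.0, 29.5351, 53.496, 1.693)
    new_run = derive_totals(50.0, 246.5817, 64.076, 2.028)
    print("old:", old_run)
    print("new:", new_run)
```

Both runs work out to about 10 steps per epoch, which agrees with the training results table (step 250 at epoch 25, step 500 at epoch 50).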