maud-dr/baseline_1-seed_42

@@ -20,10 +20,10 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: nan
-- Precision: 0.0
-- Recall: 0.0
-- F1: 0.0
 ## Model description
@@ -43,20 +43,32 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs: 2
-- mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1  |
-|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:---:|
-| 0.0           | 1.0   | 224  | nan             | 0.0       | 0.0    | 0.0 |
-| 0.0           | 2.0   | 448  | nan             | 0.0       | 0.0    | 0.0 |
 ### Framework versions

 This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.8181
+- Precision: 0.5895
+- Recall: 0.6920
+- F1: 0.6367
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+- num_epochs: 15
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     |
+|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:------:|
+| 0.6793        | 1.0   | 447  | 0.6713          | 0.5639    | 0.7355 | 0.6384 |
+| 0.6342        | 2.0   | 894  | 0.6474          | 0.5714    | 0.7971 | 0.6657 |
+| 0.5915        | 3.0   | 1341 | 0.7200          | 0.5845    | 0.7391 | 0.6528 |
+| 0.551         | 4.0   | 1788 | 0.7096          | 0.5890    | 0.8152 | 0.6839 |
+| 0.4821        | 5.0   | 2235 | 0.8309          | 0.5695    | 0.7717 | 0.6554 |
+| 0.4188        | 6.0   | 2682 | 0.8327          | 0.6174    | 0.6957 | 0.6542 |
+| 0.3621        | 7.0   | 3129 | 1.1130          | 0.5914    | 0.5978 | 0.5946 |
+| 0.3069        | 8.0   | 3576 | 1.4566          | 0.5828    | 0.6884 | 0.6312 |
+| 0.2551        | 9.0   | 4023 | 1.8830          | 0.5848    | 0.6993 | 0.6370 |
+| 0.2014        | 10.0  | 4470 | 2.3017          | 0.5855    | 0.6449 | 0.6138 |
+| 0.1637        | 11.0  | 4917 | 2.6628          | 0.5796    | 0.6993 | 0.6338 |
+| 0.1029        | 12.0  | 5364 | 3.0285          | 0.5852    | 0.6594 | 0.6201 |
+| 0.124         | 13.0  | 5811 | 3.3504          | 0.6055    | 0.6341 | 0.6195 |
+| 0.0737        | 14.0  | 6258 | 3.6600          | 0.6013    | 0.6667 | 0.6323 |
+| 0.0508        | 15.0  | 6705 | 3.8181          | 0.5895    | 0.6920 | 0.6367 |
 ### Framework versions
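
The new results table can be sanity-checked with a few lines of Python. The sketch below is not part of the commit. It copies the rows from the added table, recomputes F1 as the harmonic mean of the reported precision and recall, and finds the epochs with the best F1 and the lowest validation loss. The names `ROWS` and `f1_score` are hypothetical helpers introduced only for this illustration.

```python
# Rows copied from the added training-results table:
# (epoch, training_loss, validation_loss, precision, recall, f1)
ROWS = [
    (1, 0.6793, 0.6713, 0.5639, 0.7355, 0.6384),
    (2, 0.6342, 0.6474, 0.5714, 0.7971, 0.6657),
    (3, 0.5915, 0.7200, 0.5845, 0.7391, 0.6528),
    (4, 0.5510, 0.7096, 0.5890, 0.8152, 0.6839),
    (5, 0.4821, 0.8309, 0.5695, 0.7717, 0.6554),
    (6, 0.4188, 0.8327, 0.6174, 0.6957, 0.6542),
    (7, 0.3621, 1.1130, 0.5914, 0.5978, 0.5946),
    (8, 0.3069, 1.4566, 0.5828, 0.6884, 0.6312),
    (9, 0.2551, 1.8830, 0.5848, 0.6993, 0.6370),
    (10, 0.2014, 2.3017, 0.5855, 0.6449, 0.6138),
    (11, 0.1637, 2.6628, 0.5796, 0.6993, 0.6338),
    (12, 0.1029, 3.0285, 0.5852, 0.6594, 0.6201),
    (13, 0.1240, 3.3504, 0.6055, 0.6341, 0.6195),
    (14, 0.0737, 3.6600, 0.6013, 0.6667, 0.6323),
    (15, 0.0508, 3.8181, 0.5895, 0.6920, 0.6367),
]


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Largest difference between the reported F1 and the recomputed one.
# Small gaps are expected because the table rounds to four decimals.
max_gap = max(abs(f1_score(p, r) - f1) for _, _, _, p, r, f1 in ROWS)

# Epoch with the highest reported F1, and epoch with the lowest validation loss.
best_f1_epoch = max(ROWS, key=lambda row: row[5])[0]
best_val_loss_epoch = min(ROWS, key=lambda row: row[2])[0]

print(f"max |reported F1 - recomputed F1| = {max_gap:.5f}")
print(f"best F1 at epoch {best_f1_epoch}, lowest validation loss at epoch {best_val_loss_epoch}")
```

On these rows the reported F1 values agree with the recomputed ones to within rounding. F1 peaks at epoch 4 (0.6839) and validation loss is lowest at epoch 2 (0.6474), while the headline metrics in the card come from epoch 15. Training loss keeps falling as validation loss rises from 0.6474 to 3.8181, which is the usual signature of overfitting, so an earlier checkpoint may be the better one to keep.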

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9c84aab1cfc3a8f46af5178dd8131f1748a08abea3df334b8c9b081533d50ff3
 size 894020048

 version https://git-lfs.github.com/spec/v1
+oid sha256:0164c81fa40de7d8c838351f6f8e3e30d0b0e02c11b3938d517436615c976226
 size 894020048

tokenizer.json CHANGED

@@ -6,14 +6,7 @@
     "strategy": "LongestFirst",
     "stride": 0
   },
-  "padding": {
-    "strategy": "BatchLongest",
-    "direction": "Right",
-    "pad_to_multiple_of": null,
-    "pad_id": 0,
-    "pad_type_id": 0,
-    "pad_token": "<pad>"
-  },
   "added_tokens": [
     {
       "id": 0,

     "strategy": "LongestFirst",
     "stride": 0
   },
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,
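
For readers unfamiliar with the removed `padding` block: it described right-side padding of each batch to its longest sequence, using pad id 0 (`<pad>`). With `"padding": null`, the saved tokenizer file no longer pads on its own, so padding presumably happens at call time or in a data collator (an assumption, since the commit does not show the training code). The sketch below illustrates the strategy in plain Python. It is not the `tokenizers` library's implementation, and `pad_batch_longest` is a hypothetical helper name.

```python
from typing import List, Tuple


def pad_batch_longest(
    batch: List[List[int]], pad_id: int = 0
) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every sequence to the length of the longest one in the batch.

    Returns the padded ids and an attention mask (1 = real token, 0 = padding).
    """
    longest = max((len(ids) for ids in batch), default=0)
    padded = [ids + [pad_id] * (longest - len(ids)) for ids in batch]
    mask = [[1] * len(ids) + [0] * (longest - len(ids)) for ids in batch]
    return padded, mask


# Illustrative token ids only; they are not taken from the real vocabulary.
ids, attention_mask = pad_batch_longest([[5, 6, 7], [8]])
print(ids)
print(attention_mask)
```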