lorenzoscottb committed (verified)
Commit: d86b879 · Parent(s): 68dfa3e

lorenzoscottb/PreDA-t5-small

Files changed (2):
  1. README.md +26 -25
  2. tokenizer.json +2 -11
README.md CHANGED

@@ -18,11 +18,11 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.6512
-- Rouge1: 0.8163
-- Rouge2: 0.7077
-- Rougel: 0.7900
-- Rougelsum: 0.7903
+- Loss: 1.6467
+- Rouge1: 0.8305
+- Rouge2: 0.7257
+- Rougel: 0.8080
+- Rougelsum: 0.8080
 
 ## Model description
 
@@ -47,6 +47,7 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 10
 - num_epochs: 20
 - mixed_precision_training: Native AMP
 - label_smoothing_factor: 0.1
@@ -55,26 +56,26 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
-| 2.3316 | 1.0 | 34 | 2.1243 | 0.3754 | 0.1943 | 0.3628 | 0.3621 |
-| 2.0677 | 2.0 | 68 | 1.9830 | 0.5392 | 0.3684 | 0.5237 | 0.5239 |
-| 1.9653 | 3.0 | 102 | 1.8945 | 0.5878 | 0.4130 | 0.5664 | 0.5665 |
-| 1.9375 | 4.0 | 136 | 1.8565 | 0.5939 | 0.4252 | 0.5777 | 0.5769 |
-| 1.8925 | 5.0 | 170 | 1.8242 | 0.6380 | 0.4816 | 0.6180 | 0.6178 |
-| 1.8593 | 6.0 | 204 | 1.7982 | 0.6707 | 0.5139 | 0.6483 | 0.6487 |
-| 1.8316 | 7.0 | 238 | 1.7788 | 0.6816 | 0.5326 | 0.6561 | 0.6564 |
-| 1.8048 | 8.0 | 272 | 1.7564 | 0.7278 | 0.5871 | 0.6991 | 0.6993 |
-| 1.8007 | 9.0 | 306 | 1.7399 | 0.7245 | 0.5854 | 0.6971 | 0.6973 |
-| 1.7689 | 10.0 | 340 | 1.7253 | 0.7558 | 0.6249 | 0.7272 | 0.7274 |
-| 1.7591 | 11.0 | 374 | 1.7149 | 0.7539 | 0.6255 | 0.7284 | 0.7285 |
-| 1.7484 | 12.0 | 408 | 1.7009 | 0.7683 | 0.6439 | 0.7419 | 0.7423 |
-| 1.7337 | 13.0 | 442 | 1.6883 | 0.7829 | 0.6612 | 0.7569 | 0.7572 |
-| 1.7278 | 14.0 | 476 | 1.6819 | 0.7786 | 0.6592 | 0.7551 | 0.7553 |
-| 1.7154 | 15.0 | 510 | 1.6750 | 0.7958 | 0.6804 | 0.7692 | 0.7696 |
-| 1.705 | 16.0 | 544 | 1.6665 | 0.7970 | 0.6821 | 0.7730 | 0.7730 |
-| 1.6961 | 17.0 | 578 | 1.6609 | 0.8030 | 0.6921 | 0.7773 | 0.7777 |
-| 1.6955 | 18.0 | 612 | 1.6565 | 0.8061 | 0.6956 | 0.7816 | 0.7819 |
-| 1.688 | 19.0 | 646 | 1.6533 | 0.8138 | 0.7045 | 0.7879 | 0.7882 |
-| 1.6848 | 20.0 | 680 | 1.6512 | 0.8163 | 0.7077 | 0.7900 | 0.7903 |
+| 2.5334 | 1.0 | 34 | 2.2047 | 0.3655 | 0.1949 | 0.3570 | 0.3570 |
+| 2.0989 | 2.0 | 68 | 2.0026 | 0.5321 | 0.3606 | 0.5169 | 0.5168 |
+| 1.9755 | 3.0 | 102 | 1.9020 | 0.5873 | 0.4139 | 0.5639 | 0.5646 |
+| 1.9445 | 4.0 | 136 | 1.8645 | 0.5968 | 0.4271 | 0.5800 | 0.5805 |
+| 1.8995 | 5.0 | 170 | 1.8282 | 0.6438 | 0.4882 | 0.6216 | 0.6216 |
+| 1.8616 | 6.0 | 204 | 1.7978 | 0.6675 | 0.5107 | 0.6473 | 0.6473 |
+| 1.8316 | 7.0 | 238 | 1.7784 | 0.6890 | 0.5369 | 0.6638 | 0.6636 |
+| 1.8049 | 8.0 | 272 | 1.7542 | 0.7191 | 0.5761 | 0.6934 | 0.6937 |
+| 1.7977 | 9.0 | 306 | 1.7373 | 0.7322 | 0.5953 | 0.7049 | 0.7052 |
+| 1.7642 | 10.0 | 340 | 1.7219 | 0.7545 | 0.6213 | 0.7248 | 0.7252 |
+| 1.7562 | 11.0 | 374 | 1.7072 | 0.7664 | 0.6389 | 0.7418 | 0.7423 |
+| 1.7437 | 12.0 | 408 | 1.6961 | 0.7777 | 0.6519 | 0.7494 | 0.7496 |
+| 1.7271 | 13.0 | 442 | 1.6838 | 0.7893 | 0.6715 | 0.7636 | 0.7638 |
+| 1.7238 | 14.0 | 476 | 1.6765 | 0.7946 | 0.6759 | 0.7701 | 0.7703 |
+| 1.7151 | 15.0 | 510 | 1.6706 | 0.8065 | 0.6918 | 0.7830 | 0.7833 |
+| 1.6997 | 16.0 | 544 | 1.6605 | 0.8143 | 0.7006 | 0.7889 | 0.7892 |
+| 1.6937 | 17.0 | 578 | 1.6552 | 0.8202 | 0.7100 | 0.7965 | 0.7968 |
+| 1.6919 | 18.0 | 612 | 1.6505 | 0.8238 | 0.7176 | 0.8019 | 0.8019 |
+| 1.6826 | 19.0 | 646 | 1.6493 | 0.8262 | 0.7210 | 0.8039 | 0.8040 |
+| 1.6811 | 20.0 | 680 | 1.6467 | 0.8305 | 0.7257 | 0.8080 | 0.8080 |
 
 
 ### Framework versions
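The one hyperparameter this commit adds is `lr_scheduler_warmup_steps: 10` on top of the existing linear scheduler. A minimal sketch of the resulting learning-rate curve (the base learning rate is not shown in this diff, so `base_lr` below is a placeholder; 680 total steps matches the 20 epochs × 34 steps per epoch visible in the table):

```python
def linear_schedule_with_warmup(step, base_lr, warmup_steps=10, total_steps=680):
    """Linear warmup from 0 to base_lr over `warmup_steps`, then linear
    decay back to 0 by `total_steps` (same shape as Hugging Face's
    get_linear_schedule_with_warmup)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

# Placeholder base learning rate of 2e-5 (not taken from the diff):
for s in (0, 5, 10, 345, 680):
    print(s, linear_schedule_with_warmup(s, 2e-5))
```

With only 10 warmup steps against 680 total, the warmup mainly smooths the very first Adam updates; the schedule is effectively linear decay thereafter.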
tokenizer.json CHANGED

@@ -2,20 +2,11 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 128,
+    "max_length": 512,
     "strategy": "LongestFirst",
     "stride": 0
   },
-  "padding": {
-    "strategy": {
-      "Fixed": 128
-    },
-    "direction": "Right",
-    "pad_to_multiple_of": null,
-    "pad_id": 0,
-    "pad_type_id": 0,
-    "pad_token": "<pad>"
-  },
+  "padding": null,
   "added_tokens": [
     {
       "id": 0,