Ahmed235/summarize

@@ -3,8 +3,6 @@ license: apache-2.0
 base_model: google-t5/t5-small
 tags:
 - generated_from_trainer
-metrics:
-- rouge
 model-index:
 - name: summarize
   results: []
@@ -17,12 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.7414
-- Rouge1: 0.1691
-- Rouge2: 0.0572
-- Rougel: 0.1342
-- Rougelsum: 0.1342
-- Gen Len: 19.0
 ## Model description
@@ -47,18 +42,23 @@ The following hyperparameters were used during training:
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
 - mixed_precision_training: Native AMP
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
-|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
-| 3.1704        | 1.0   | 500  | 2.8278          | 0.1619 | 0.0527 | 0.1283 | 0.1283    | 19.0    |
-| 2.9742        | 2.0   | 1000 | 2.7769          | 0.1666 | 0.0553 | 0.1322 | 0.1322    | 19.0    |
-| 2.9285        | 3.0   | 1500 | 2.7561          | 0.1674 | 0.0562 | 0.1326 | 0.1326    | 19.0    |
-| 2.903         | 4.0   | 2000 | 2.7452          | 0.1679 | 0.0562 | 0.1329 | 0.1329    | 19.0    |
-| 2.8917        | 5.0   | 2500 | 2.7414          | 0.1691 | 0.0572 | 0.1342 | 0.1342    | 19.0    |
 ### Framework versions

 base_model: google-t5/t5-small
 tags:
 - generated_from_trainer
 model-index:
 - name: summarize
   results: []
 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.6935
+- Evaluation: {'evaluation_runtime': 28.518348455429077, 'samples_per_second': 33.3118869588378, 'steps_per_second': 33.3118869588378}
+- Rounded Rouge: {'rouge1': 0.1705, 'rouge2': 0.0588, 'rougeL': 0.1354, 'rougeLsum': 0.1355}
 ## Model description
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 10
 - mixed_precision_training: Native AMP
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Evaluation                                                                                                                   | Rounded Rouge                                                               |
+|:-------------:|:-----:|:----:|:---------------:|:----------------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------:|
+| 3.1701        | 1.0   | 500  | 2.8229          | {'evaluation_runtime': 30.270989179611206, 'samples_per_second': 31.383183230756966, 'steps_per_second': 31.383183230756966} | {'rouge1': 0.1615, 'rouge2': 0.0525, 'rougeL': 0.128, 'rougeLsum': 0.1281}  |
+| 2.9661        | 2.0   | 1000 | 2.7672          | {'evaluation_runtime': 28.879830598831177, 'samples_per_second': 32.894929793613414, 'steps_per_second': 32.894929793613414} | {'rouge1': 0.1676, 'rouge2': 0.0567, 'rougeL': 0.1326, 'rougeLsum': 0.1327} |
+| 2.9128        | 3.0   | 1500 | 2.7414          | {'evaluation_runtime': 28.787310361862183, 'samples_per_second': 33.00065160858421, 'steps_per_second': 33.00065160858421}   | {'rouge1': 0.1693, 'rouge2': 0.0575, 'rougeL': 0.1342, 'rougeLsum': 0.1343} |
+| 2.8783        | 4.0   | 2000 | 2.7240          | {'evaluation_runtime': 28.755173683166504, 'samples_per_second': 33.03753301814126, 'steps_per_second': 33.03753301814126}   | {'rouge1': 0.1694, 'rouge2': 0.0581, 'rougeL': 0.1343, 'rougeLsum': 0.1344} |
+| 2.8548        | 5.0   | 2500 | 2.7137          | {'evaluation_runtime': 30.050004959106445, 'samples_per_second': 31.613971488284534, 'steps_per_second': 31.613971488284534} | {'rouge1': 0.171, 'rouge2': 0.0591, 'rougeL': 0.1354, 'rougeLsum': 0.1354}  |
+| 2.8353        | 6.0   | 3000 | 2.7047          | {'evaluation_runtime': 29.376569986343384, 'samples_per_second': 32.33869714679546, 'steps_per_second': 32.33869714679546}   | {'rouge1': 0.1703, 'rouge2': 0.0587, 'rougeL': 0.135, 'rougeLsum': 0.135}   |
+| 2.8229        | 7.0   | 3500 | 2.6996          | {'evaluation_runtime': 27.381307363510132, 'samples_per_second': 34.69520236517353, 'steps_per_second': 34.69520236517353}   | {'rouge1': 0.1714, 'rouge2': 0.0592, 'rougeL': 0.1357, 'rougeLsum': 0.1357} |
+| 2.8154        | 8.0   | 4000 | 2.6958          | {'evaluation_runtime': 27.409220457077026, 'samples_per_second': 34.65986934899169, 'steps_per_second': 34.65986934899169}   | {'rouge1': 0.17, 'rouge2': 0.0587, 'rougeL': 0.1351, 'rougeLsum': 0.1352}   |
+| 2.8068        | 9.0   | 4500 | 2.6943          | {'evaluation_runtime': 27.376741409301758, 'samples_per_second': 34.7009889086807, 'steps_per_second': 34.7009889086807}     | {'rouge1': 0.1702, 'rouge2': 0.0588, 'rougeL': 0.1352, 'rougeLsum': 0.1353} |
+| 2.8           | 10.0  | 5000 | 2.6935          | {'evaluation_runtime': 28.518348455429077, 'samples_per_second': 33.3118869588378, 'steps_per_second': 33.3118869588378}     | {'rouge1': 0.1705, 'rouge2': 0.0588, 'rougeL': 0.1354, 'rougeLsum': 0.1355} |
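
The table shows validation loss still falling at epoch 10 while ROUGE-1 peaks earlier. A small Python sketch, using values copied from the rows above, that picks the best epoch under each criterion (the `history` list and its layout are illustrative and not part of the training script):

```python
# (epoch, validation_loss, rouge1) copied from the training-results table.
history = [
    (1, 2.8229, 0.1615),
    (2, 2.7672, 0.1676),
    (3, 2.7414, 0.1693),
    (4, 2.7240, 0.1694),
    (5, 2.7137, 0.1710),
    (6, 2.7047, 0.1703),
    (7, 2.6996, 0.1714),
    (8, 2.6958, 0.1700),
    (9, 2.6943, 0.1702),
    (10, 2.6935, 0.1705),
]

# Lowest validation loss versus highest ROUGE-1.
best_by_loss = min(history, key=lambda row: row[1])[0]
best_by_rouge1 = max(history, key=lambda row: row[2])[0]
print(best_by_loss, best_by_rouge1)
```

Validation loss selects epoch 10 and ROUGE-1 selects epoch 7, although the ROUGE-1 values from epoch 5 onward differ by at most about 0.0014.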
 ### Framework versions