ShyamVarahagiri/MachineTranslation

text2text-generation

Generated from Trainer

ShyamVarahagiri committed on Mar 25, 2023

Commit 44039fb · 1 Parent(s): 2195500

update model card README.md

Files changed (1)

README.md +13 -13

@@ -21,7 +21,7 @@ model-index:
     metrics:
     - name: Bleu
       type: bleu
-      value: 0.9535
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,9 +31,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the opus100 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.8884
-- Bleu: 0.9535
-- Gen Len: 22.708
 ## Model description
@@ -53,22 +53,22 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 24
-- eval_batch_size: 24
 - seed: 42
-- gradient_accumulation_steps: 10
-- total_train_batch_size: 240
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 3
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
-|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
-| No log        | 0.98  | 41   | 4.9620          | 0.1607 | 34.306  |
-| No log        | 1.99  | 83   | 4.0854          | 0.5834 | 23.007  |
-| No log        | 2.95  | 123  | 3.8884          | 0.9535 | 22.708  |
 ### Framework versions

     metrics:
     - name: Bleu
       type: bleu
+      value: 13.5859
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the opus100 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.2302
+- Bleu: 13.5859
+- Gen Len: 18.8405
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
+- train_batch_size: 48
+- eval_batch_size: 48
 - seed: 42
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 768
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 3
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
+| 4.1183        | 1.0   | 695  | 2.4708          | 10.3498 | 19.673  |
+| 2.8109        | 2.0   | 1391 | 2.2799          | 12.738  | 18.8605 |
+| 2.4839        | 3.0   | 2085 | 2.2302          | 13.5859 | 18.8405 |
 ### Framework versions
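
A note on the hyperparameter change in this diff: the `total_train_batch_size` values (240 before, 768 after) are consistent with multiplying `train_batch_size` by `gradient_accumulation_steps` on a single device. The sketch below reproduces that arithmetic and gives a rough estimate of how many training examples each run processed. The function names and the `num_devices=1` default are assumptions made for illustration, not part of the model card, and the example count is only approximate because the last batch of an epoch may be partial.

```python
def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples contributing to one optimizer step.

    num_devices=1 is an assumption; the card does not list a device count.
    """
    return per_device_batch * grad_accum_steps * num_devices


def approx_examples_seen(optimizer_steps, total_batch):
    """Rough upper estimate of examples processed after `optimizer_steps` updates."""
    return optimizer_steps * total_batch


# Values taken from the removed (-) and added (+) lines of the diff.
old_total = effective_batch_size(per_device_batch=24, grad_accum_steps=10)
new_total = effective_batch_size(per_device_batch=48, grad_accum_steps=16)

print("old total_train_batch_size:", old_total)
print("new total_train_batch_size:", new_total)

# Final step counts from the two training-results tables (123 and 2085).
print("old run, approx. examples seen:", approx_examples_seen(123, old_total))
print("new run, approx. examples seen:", approx_examples_seen(2085, new_total))
```

If these estimates hold, the newer run processed far more training examples than the older one, which may account for part of the change in validation loss and BLEU. The card itself does not state the training subset size, so treat this as an inference rather than a documented fact.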