ShyamVarahagiri/MachineTranslation

text2text-generation

Generated from Trainer

ShyamVarahagiri committed on Mar 25, 2023

Commit e4a1edb · 1 parent: dd02a8f

update model card README.md

Files changed (1)

README.md +12 -15

@@ -21,7 +21,7 @@ model-index:
     metrics:
     - name: Bleu
       type: bleu
-      value: 0.0046
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,9 +31,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the opus100 dataset.
 It achieves the following results on the evaluation set:
-- Loss: nan
-- Bleu: 0.0046
-- Gen Len: 2.7475
 ## Model description
@@ -53,25 +53,22 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 8
-- eval_batch_size: 8
 - seed: 42
-- gradient_accumulation_steps: 32
-- total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
-- mixed_precision_training: Native AMP
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
-| No log        | 1.0   | 39   | nan             | 0.0046 | 2.7475  |
-| No log        | 2.0   | 78   | nan             | 0.0046 | 2.7475  |
-| No log        | 3.0   | 117  | nan             | 0.0046 | 2.7475  |
-| No log        | 3.99  | 156  | nan             | 0.0046 | 2.7475  |
-| No log        | 4.99  | 195  | nan             | 0.0046 | 2.7475  |
 ### Framework versions

     metrics:
     - name: Bleu
       type: bleu
+      value: 0.9535
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the opus100 dataset.
 It achieves the following results on the evaluation set:
+- Loss: 3.8884
+- Bleu: 0.9535
+- Gen Len: 22.708
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
+- train_batch_size: 24
+- eval_batch_size: 24
 - seed: 42
+- gradient_accumulation_steps: 10
+- total_train_batch_size: 240
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- num_epochs: 3
 ### Training results
 | Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
+| No log        | 0.98  | 41   | 4.9620          | 0.1607 | 34.306  |
+| No log        | 1.99  | 83   | 4.0854          | 0.5834 | 23.007  |
+| No log        | 2.95  | 123  | 3.8884          | 0.9535 | 22.708  |
 ### Framework versions
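
The `total_train_batch_size` values on both sides of this diff follow from the per-device batch size and the gradient accumulation steps. The sketch below reproduces that arithmetic and, under stated assumptions, the step counts in the training tables. The function names are hypothetical, a single device is assumed, and the 10,000-example training set size is an assumption chosen because it is consistent with the reported steps; the card does not state it.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer update (single device assumed by default)."""
    return per_device * accumulation_steps * num_devices


def update_steps_per_epoch(num_examples: int, per_device: int, accumulation_steps: int) -> int:
    """Simplified model: count only full accumulation cycles within one epoch."""
    batches = math.ceil(num_examples / per_device)
    return batches // accumulation_steps


# Values taken from the diff: (train_batch_size, gradient_accumulation_steps).
old_cfg = (8, 32)
new_cfg = (24, 10)

# Assumption: 10,000 training examples (not stated in the model card).
ASSUMED_TRAIN_EXAMPLES = 10_000

for label, (batch, accum) in (("old", old_cfg), ("new", new_cfg)):
    total = effective_batch_size(batch, accum)
    steps = update_steps_per_epoch(ASSUMED_TRAIN_EXAMPLES, batch, accum)
    print(f"{label}: total_train_batch_size={total}, steps_per_epoch={steps}")
```

Under these assumptions the old configuration gives 256 and 39 steps per epoch, and the new one gives 240 and 41, matching the first rows of the respective training tables.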