ShyamVarahagiri committed on
Commit 44039fb · Parent(s): 2195500

update model card README.md

Files changed (1): README.md (+13 -13)
README.md CHANGED

```diff
@@ -21,7 +21,7 @@ model-index:
   metrics:
   - name: Bleu
     type: bleu
-    value: 0.9535
+    value: 13.5859
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,9 +31,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the opus100 dataset.
 It achieves the following results on the evaluation set:
-- Loss: 3.8884
-- Bleu: 0.9535
-- Gen Len: 22.708
+- Loss: 2.2302
+- Bleu: 13.5859
+- Gen Len: 18.8405
 
 ## Model description
 
@@ -53,22 +53,22 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 24
-- eval_batch_size: 24
+- train_batch_size: 48
+- eval_batch_size: 48
 - seed: 42
-- gradient_accumulation_steps: 10
-- total_train_batch_size: 240
+- gradient_accumulation_steps: 16
+- total_train_batch_size: 768
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 3
 
 ### Training results
 
-| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
-|:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
-| No log        | 0.98  | 41   | 4.9620          | 0.1607 | 34.306  |
-| No log        | 1.99  | 83   | 4.0854          | 0.5834 | 23.007  |
-| No log        | 2.95  | 123  | 3.8884          | 0.9535 | 22.708  |
+| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
+| 4.1183        | 1.0   | 695  | 2.4708          | 10.3498 | 19.673  |
+| 2.8109        | 2.0   | 1391 | 2.2799          | 12.738  | 18.8605 |
+| 2.4839        | 3.0   | 2085 | 2.2302          | 13.5859 | 18.8405 |
 
 
 ### Framework versions
```
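
The updated hyperparameters are self-consistent: with gradient accumulation, the effective batch size per optimizer step is `train_batch_size × gradient_accumulation_steps` (times the device count). A minimal sketch of that arithmetic — the function name is illustrative, not part of the card or the `transformers` API:

```python
def total_train_batch_size(per_device_batch: int,
                           accumulation_steps: int,
                           num_devices: int = 1) -> int:
    """Effective samples consumed per optimizer step: gradients are
    accumulated over `accumulation_steps` forward/backward passes on
    each of `num_devices` devices before a single weight update."""
    return per_device_batch * accumulation_steps * num_devices

# Old card: 24 × 10 = 240; updated card: 48 × 16 = 768 (assuming one device).
print(total_train_batch_size(24, 10))  # 240
print(total_train_batch_size(48, 16))  # 768
```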