ShyamVarahagiri committed on
Commit e4a1edb · 1 Parent(s): dd02a8f

update model card README.md

Files changed (1): README.md (+12 −15)
README.md CHANGED
@@ -21,7 +21,7 @@ model-index:
   metrics:
   - name: Bleu
     type: bleu
-    value: 0.0046
+    value: 0.9535
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -31,9 +31,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/mt5-small](https://huggingface.co/google/mt5-small) on the opus100 dataset.
 It achieves the following results on the evaluation set:
-- Loss: nan
-- Bleu: 0.0046
-- Gen Len: 2.7475
+- Loss: 3.8884
+- Bleu: 0.9535
+- Gen Len: 22.708
 
 ## Model description
 
@@ -53,25 +53,22 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.0003
-- train_batch_size: 8
-- eval_batch_size: 8
+- train_batch_size: 24
+- eval_batch_size: 24
 - seed: 42
-- gradient_accumulation_steps: 32
-- total_train_batch_size: 256
+- gradient_accumulation_steps: 10
+- total_train_batch_size: 240
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- num_epochs: 5
-- mixed_precision_training: Native AMP
+- num_epochs: 3
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:-------:|
-| No log        | 1.0   | 39   | nan             | 0.0046 | 2.7475  |
-| No log        | 2.0   | 78   | nan             | 0.0046 | 2.7475  |
-| No log        | 3.0   | 117  | nan             | 0.0046 | 2.7475  |
-| No log        | 3.99  | 156  | nan             | 0.0046 | 2.7475  |
-| No log        | 4.99  | 195  | nan             | 0.0046 | 2.7475  |
+| No log        | 0.98  | 41   | 4.9620          | 0.1607 | 34.306  |
+| No log        | 1.99  | 83   | 4.0854          | 0.5834 | 23.007  |
+| No log        | 2.95  | 123  | 3.8884          | 0.9535 | 22.708  |
 
 
 ### Framework versions
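
The hyperparameter changes in this diff are internally consistent: `total_train_batch_size` is the per-device batch size multiplied by the gradient accumulation steps (and the device count, here 1). A minimal sketch in plain Python (the function name is ours, not part of the repo) checking the arithmetic for both the old and new configurations:

```python
def total_train_batch_size(per_device_batch_size: int,
                           gradient_accumulation_steps: int,
                           num_devices: int = 1) -> int:
    """Effective batch size: examples consumed per optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices

# Old configuration from the removed lines: 8 * 32 = 256.
print(total_train_batch_size(8, 32))    # 256
# New configuration from the added lines: 24 * 10 = 240.
print(total_train_batch_size(24, 10))   # 240
```

The new step counts in the results table fit the same arithmetic: roughly 41 optimizer steps per epoch at an effective batch of 240, versus 39 steps per epoch at 256 in the old run.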