zera09 committed
Commit 1621cd2 · verified · 1 Parent(s): a0e0edd

Model save

Files changed (1):
  1. README.md +18 -16

README.md CHANGED
@@ -6,6 +6,7 @@ tags:
 - generated_from_trainer
 metrics:
 - rouge
+ - bleu
 model-index:
 - name: flan-context
   results: []
@@ -18,17 +19,13 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the None dataset.
 It achieves the following results on the evaluation set:
- - Loss: nan
- - Rouge: {'rouge1': 0, 'rouge2': 0, 'rougeL': 0, 'rougeLsum': 0}
- - Bleu1: 0
- - Bleu2: 0
- - Bleu3: 0
- - Bleu4: 0
- - Meteor: 0
- - Bertscore Precision: 0
- - Bertscore Recall: 0
- - Bertscore F1: 0
- - Gen Len: 0
+ - Loss: 2.3842
+ - Rouge: {'rouge1': 0.23405466769402272, 'rouge2': 0.07972272627141436, 'rougeL': 0.19916876363600983, 'rougeLsum': 0.19933848643732807}
+ - Bleu: {'bleu': 0.02684649083752067, 'precisions': [0.36613819922225543, 0.11508951406649616, 0.051836594576038446, 0.02742772424017791], 'brevity_penalty': 0.3051481683964344, 'length_ratio': 0.4572561893037888, 'translation_length': 3343, 'reference_length': 7311}
+ - Bertscore Precision: 0.8858
+ - Bertscore Recall: 0.8612
+ - Bertscore F1: 0.8732
+ - Meteor: 0.1631

 ## Model description

@@ -51,16 +48,21 @@ The following hyperparameters were used during training:
 - train_batch_size: 4
 - eval_batch_size: 4
 - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 8
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
- - num_epochs: 2
+ - num_epochs: 5

 ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Rouge | Bleu1 | Bleu2 | Bleu3 | Bleu4 | Meteor | Bertscore Precision | Bertscore Recall | Bertscore F1 | Gen Len |
- |:-------------:|:-----:|:----:|:---------------:|:-----:|:-----:|:-----:|:-----:|:-----:|:------:|:-------------------:|:----------------:|:------------:|:-------:|
- | 1.83 | 1.0 | 378 | nan | {'rouge1': 0, 'rouge2': 0, 'rougeL': 0, 'rougeLsum': 0} | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
- | 0.0 | 2.0 | 756 | nan | {'rouge1': 0, 'rouge2': 0, 'rougeL': 0, 'rougeLsum': 0} | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
+ | Training Loss | Epoch | Step | Validation Loss | Rouge | Bleu | Bertscore Precision | Bertscore Recall | Bertscore F1 | Meteor |
+ |:-------------:|:------:|:----:|:---------------:|:-----:|:----:|:-------------------:|:----------------:|:------------:|:------:|
+ | 2.9048 | 0.9973 | 188 | 2.4790 | {'rouge1': 0.20968553648391508, 'rouge2': 0.07443896925573865, 'rougeL': 0.1805792844016589, 'rougeLsum': 0.18088539253560337} | {'bleu': 0.020489842222462733, 'precisions': [0.38257575757575757, 0.11974711788769059, 0.05699272433306386, 0.029216467463479414], 'brevity_penalty': 0.21924576067972068, 'length_ratio': 0.3972096840377513, 'translation_length': 2904, 'reference_length': 7311} | 0.8919 | 0.8585 | 0.8747 | 0.1489 |
+ | 2.5945 | 2.0 | 377 | 2.4208 | {'rouge1': 0.2297957739195002, 'rouge2': 0.07836393682592077, 'rougeL': 0.19574285931586738, 'rougeLsum': 0.19611644340511103} | {'bleu': 0.0253671186540934, 'precisions': [0.381875, 0.11792294807370184, 0.054873646209386284, 0.02857142857142857], 'brevity_penalty': 0.2767370504490949, 'length_ratio': 0.4376966215292026, 'translation_length': 3200, 'reference_length': 7311} | 0.8902 | 0.8617 | 0.8756 | 0.1597 |
+ | 2.4452 | 2.9973 | 565 | 2.3912 | {'rouge1': 0.22841476315147372, 'rouge2': 0.07651572809303378, 'rougeL': 0.19569000397215158, 'rougeLsum': 0.195641894488224} | {'bleu': 0.026971694358366844, 'precisions': [0.36859553948161544, 0.11666129552046407, 0.05332409972299169, 0.02843247287691732], 'brevity_penalty': 0.30016114315032577, 'length_ratio': 0.45383668444809194, 'translation_length': 3318, 'reference_length': 7311} | 0.8862 | 0.8603 | 0.8729 | 0.1620 |
+ | 2.3204 | 4.0 | 754 | 2.3840 | {'rouge1': 0.2324717619677979, 'rouge2': 0.07856325718853738, 'rougeL': 0.19781179401099636, 'rougeLsum': 0.1979068896295354} | {'bleu': 0.025569078512862678, 'precisions': [0.3670581150255947, 0.11332904056664521, 0.050155655482531994, 0.025037369207772796], 'brevity_penalty': 0.3007591959750055, 'length_ratio': 0.4542470250307755, 'translation_length': 3321, 'reference_length': 7311} | 0.8857 | 0.8609 | 0.8730 | 0.1624 |
+ | 2.2574 | 4.9867 | 940 | 2.3842 | {'rouge1': 0.23405466769402272, 'rouge2': 0.07972272627141436, 'rougeL': 0.19916876363600983, 'rougeLsum': 0.19933848643732807} | {'bleu': 0.02684649083752067, 'precisions': [0.36613819922225543, 0.11508951406649616, 0.051836594576038446, 0.02742772424017791], 'brevity_penalty': 0.3051481683964344, 'length_ratio': 0.4572561893037888, 'translation_length': 3343, 'reference_length': 7311} | 0.8858 | 0.8612 | 0.8732 | 0.1631 |

  ### Framework versions
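As a cross-check, the Bleu dict added in this commit is internally consistent: standard corpus BLEU (the dict's keys match the Hugging Face `bleu` metric's output) is the brevity penalty times the geometric mean of the four n-gram precisions, with BP = exp(1 − reference_length/translation_length) when the candidates are shorter than the references. A stdlib-only sketch recomputing the final-epoch figures from the reported components:

```python
import math

# Reported components of the final eval (step 940) in the updated card
precisions = [0.36613819922225543, 0.11508951406649616,
              0.051836594576038446, 0.02742772424017791]
translation_length = 3343   # candidate tokens
reference_length = 7311     # reference tokens

# Brevity penalty: kicks in because the candidates are shorter than the references
bp = math.exp(1 - reference_length / translation_length)

# BLEU = BP * geometric mean of the n-gram precisions
bleu = bp * math.exp(sum(math.log(p) for p in precisions) / len(precisions))

length_ratio = translation_length / reference_length

# bp, bleu, and length_ratio reproduce the reported 0.3051..., 0.0268..., 0.4572... values
print(f"bp={bp:.4f} bleu={bleu:.4f} length_ratio={length_ratio:.4f}")
```

The length_ratio entry is simply translation_length / reference_length: generations average under half the reference length, which is why the brevity penalty (~0.31) dominates the final BLEU score.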
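For readers reproducing the updated configuration, a hedged sketch of how the hyperparameters in this commit would map onto transformers' `TrainingArguments` (the `output_dir` name is illustrative, and `learning_rate` is not part of this diff, so it is left at the library default):

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters listed in the updated card.
args = TrainingArguments(
    output_dir="flan-context",       # illustrative placeholder
    per_device_train_batch_size=4,   # train_batch_size: 4
    per_device_eval_batch_size=4,    # eval_batch_size: 4
    gradient_accumulation_steps=2,   # new in this commit
    seed=42,
    optim="adamw_torch",             # betas/epsilon as listed in the card
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=5,              # raised from 2 in this commit
)
```

Note that `total_train_batch_size: 8` is derived rather than passed: 4 per device × 2 accumulation steps, which is also why the logged steps per epoch drop from 378 in the previous card to roughly half that here.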