floflodebilbao committed
Commit 3b57b06 · verified · 1 Parent(s): 276994e

End of training
README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 12.3467
- - Rouge1: 0.2224
- - Rouge2: 0.0751
- - Rougel: 0.1809
- - Rougelsum: 0.183
+ - Loss: 4.2146
+ - Rouge1: 0.136
+ - Rouge2: 0.0285
+ - Rougel: 0.1045
+ - Rougelsum: 0.1042
  - Gen Len: 20.0
- - Bleu: 0.0399
- - Precisions: 0.0959
- - Brevity Penalty: 0.5487
- - Length Ratio: 0.6249
- - Translation Length: 763.0
+ - Bleu: 0.0065
+ - Precisions: 0.0567
+ - Brevity Penalty: 0.3576
+ - Length Ratio: 0.493
+ - Translation Length: 602.0
  - Reference Length: 1221.0
- - Precision: 0.8649
- - Recall: 0.8546
- - F1: 0.8596
+ - Precision: 0.817
+ - Recall: 0.8329
+ - F1: 0.8246
  - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
  ## Model description
@@ -64,22 +64,27 @@ The following hyperparameters were used during training:
  - total_train_batch_size: 16
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - num_epochs: 10
+ - num_epochs: 15
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
  |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | No log | 1.0 | 7 | 24.5371 | 0.1902 | 0.0515 | 0.1592 | 0.1597 | 20.0 | 0.0222 | 0.0727 | 0.5244 | 0.6077 | 742.0 | 1221.0 | 0.8574 | 0.851 | 0.8541 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 2.0 | 14 | 21.9754 | 0.1969 | 0.0569 | 0.1636 | 0.1645 | 20.0 | 0.0224 | 0.0731 | 0.5394 | 0.6183 | 755.0 | 1221.0 | 0.8592 | 0.8509 | 0.855 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 3.0 | 21 | 19.9990 | 0.1985 | 0.0587 | 0.1654 | 0.1668 | 20.0 | 0.0232 | 0.0761 | 0.5337 | 0.6143 | 750.0 | 1221.0 | 0.8595 | 0.8515 | 0.8554 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 4.0 | 28 | 18.4263 | 0.2026 | 0.0619 | 0.1653 | 0.1667 | 20.0 | 0.0276 | 0.0789 | 0.5348 | 0.6151 | 751.0 | 1221.0 | 0.8605 | 0.8527 | 0.8565 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 5.0 | 35 | 17.0775 | 0.2151 | 0.0694 | 0.1729 | 0.1749 | 20.0 | 0.0312 | 0.0861 | 0.5337 | 0.6143 | 750.0 | 1221.0 | 0.8625 | 0.8536 | 0.8579 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 6.0 | 42 | 15.8378 | 0.2195 | 0.0728 | 0.1795 | 0.1804 | 20.0 | 0.0323 | 0.0887 | 0.5348 | 0.6151 | 751.0 | 1221.0 | 0.8644 | 0.8547 | 0.8595 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 7.0 | 49 | 14.6478 | 0.2168 | 0.0728 | 0.1764 | 0.1774 | 20.0 | 0.0386 | 0.0941 | 0.5429 | 0.6208 | 758.0 | 1221.0 | 0.8638 | 0.8536 | 0.8585 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 8.0 | 56 | 13.5488 | 0.2166 | 0.0728 | 0.1752 | 0.1761 | 20.0 | 0.0389 | 0.0944 | 0.5487 | 0.6249 | 763.0 | 1221.0 | 0.8637 | 0.8539 | 0.8587 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 9.0 | 63 | 12.7028 | 0.2175 | 0.0695 | 0.1742 | 0.1755 | 20.0 | 0.037 | 0.0916 | 0.5487 | 0.6249 | 763.0 | 1221.0 | 0.8629 | 0.8534 | 0.858 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 10.0 | 70 | 12.3467 | 0.2224 | 0.0751 | 0.1809 | 0.183 | 20.0 | 0.0399 | 0.0959 | 0.5487 | 0.6249 | 763.0 | 1221.0 | 0.8649 | 0.8546 | 0.8596 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 1.0 | 7 | 26.0662 | 0.2073 | 0.057 | 0.1688 | 0.1691 | 20.0 | 0.0198 | 0.0737 | 0.5394 | 0.6183 | 755.0 | 1221.0 | 0.8581 | 0.8517 | 0.8548 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 2.0 | 14 | 23.4955 | 0.185 | 0.0471 | 0.1522 | 0.152 | 20.0 | 0.0162 | 0.0646 | 0.5325 | 0.6134 | 749.0 | 1221.0 | 0.8547 | 0.8489 | 0.8517 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 3.0 | 21 | 21.3631 | 0.1825 | 0.0477 | 0.1523 | 0.1522 | 20.0 | 0.0157 | 0.0622 | 0.529 | 0.611 | 746.0 | 1221.0 | 0.8555 | 0.8485 | 0.8519 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 4.0 | 28 | 19.5442 | 0.1915 | 0.0533 | 0.1588 | 0.1592 | 20.0 | 0.0153 | 0.0655 | 0.5337 | 0.6143 | 750.0 | 1221.0 | 0.8575 | 0.8506 | 0.8539 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 5.0 | 35 | 17.8808 | 0.19 | 0.0507 | 0.1588 | 0.159 | 20.0 | 0.0153 | 0.0668 | 0.529 | 0.611 | 746.0 | 1221.0 | 0.8569 | 0.8499 | 0.8533 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 6.0 | 42 | 16.2547 | 0.1844 | 0.0457 | 0.1508 | 0.1509 | 20.0 | 0.0127 | 0.0626 | 0.529 | 0.611 | 746.0 | 1221.0 | 0.8548 | 0.8482 | 0.8514 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 7.0 | 49 | 14.4184 | 0.1887 | 0.0511 | 0.1559 | 0.1562 | 20.0 | 0.0198 | 0.0693 | 0.529 | 0.611 | 746.0 | 1221.0 | 0.8545 | 0.8486 | 0.8515 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 8.0 | 56 | 11.9307 | 0.193 | 0.0529 | 0.1567 | 0.156 | 20.0 | 0.0263 | 0.0774 | 0.5232 | 0.6069 | 741.0 | 1221.0 | 0.8526 | 0.8467 | 0.8496 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 9.0 | 63 | 8.0749 | 0.2082 | 0.0606 | 0.1658 | 0.1642 | 20.0 | 0.0287 | 0.0825 | 0.5162 | 0.602 | 735.0 | 1221.0 | 0.8521 | 0.848 | 0.85 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 10.0 | 70 | 4.2463 | 0.1488 | 0.0238 | 0.1176 | 0.1169 | 20.0 | 0.0 | 0.0454 | 0.4798 | 0.5766 | 704.0 | 1221.0 | 0.8255 | 0.8365 | 0.8308 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 11.0 | 77 | 4.1345 | 0.1271 | 0.0302 | 0.0991 | 0.0988 | 20.0 | 0.0073 | 0.0573 | 0.3697 | 0.5012 | 612.0 | 1221.0 | 0.8102 | 0.8285 | 0.819 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 12.0 | 84 | 4.2879 | 0.1217 | 0.0281 | 0.0991 | 0.0991 | 20.0 | 0.006 | 0.0582 | 0.2952 | 0.4505 | 550.0 | 1221.0 | 0.808 | 0.8265 | 0.8169 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 13.0 | 91 | 4.3117 | 0.1414 | 0.0341 | 0.1079 | 0.1072 | 20.0 | 0.0069 | 0.061 | 0.3203 | 0.4676 | 571.0 | 1221.0 | 0.8151 | 0.8311 | 0.8228 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 14.0 | 98 | 4.2535 | 0.1306 | 0.0252 | 0.0951 | 0.095 | 20.0 | 0.0063 | 0.0541 | 0.354 | 0.4906 | 599.0 | 1221.0 | 0.8155 | 0.8314 | 0.8232 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 15.0 | 105 | 4.2146 | 0.136 | 0.0285 | 0.1045 | 0.1042 | 20.0 | 0.0065 | 0.0567 | 0.3576 | 0.493 | 602.0 | 1221.0 | 0.817 | 0.8329 | 0.8246 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
  ### Framework versions
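The Bleu numbers in the updated evaluation block are internally consistent: the Brevity Penalty and Length Ratio follow from the standard BLEU definitions applied to the reported translation and reference lengths. A quick sketch of that check, using only values copied from the card:

```python
import math

# Lengths reported in the updated evaluation results
translation_length = 602.0
reference_length = 1221.0

# Length ratio is candidate length over reference length
length_ratio = translation_length / reference_length

# BLEU brevity penalty: 1 when the candidate is longer than the
# reference, otherwise exp(1 - ref_len / cand_len)
if translation_length > reference_length:
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1 - reference_length / translation_length)

print(round(length_ratio, 3))     # 0.493, matching the card
print(round(brevity_penalty, 4))  # 0.3576, matching the card
```

The same identities hold for every row of the training-results table, which is a quick way to spot copy errors in a hand-edited card.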
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:172f1bc3bab23400d5e32f2d8fc70ec4a6547b5714490d2640c343429a8d98a7
+ oid sha256:8a290064e280e4f021c894bdb6108e2564a85a9399219cc48f31a7af6ce4f385
  size 1187780840
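The `oid` recorded in a pointer like the one above is the SHA-256 digest of the file's raw contents, which is how Git LFS addresses objects. A minimal sketch of computing such an oid for a downloaded file (the byte string here is illustrative, not the actual weights):

```python
import hashlib

def lfs_oid(data: bytes) -> str:
    # Git LFS records the object id as "sha256:" plus the hex digest
    # of the file's raw bytes
    return "sha256:" + hashlib.sha256(data).hexdigest()

# Illustrative only: hash a small byte string rather than the real
# 1.1 GB safetensors file
print(lfs_oid(b"hello"))
# sha256:2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824
```

Comparing this digest against the `+ oid` line is a simple way to verify that a download of the new `model.safetensors` is intact.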
runs/Jul09_11-46-56_tardis/events.out.tfevents.1752054418.tardis.45940.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8f4a5dc06afe813b69175810ef700654ce3504d4c45024f38a672b453deaabd4
+ size 22679
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d86a018b61c4a336e2fd3a3c1c4e3a12f4fdafc286261cd51feeacbd690d4560
+ oid sha256:394cf6c493731e254648f86939f024c0fc44bbb2fd113aab5a6cdbff1a3eda41
  size 5905
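Each binary file in this commit is stored in the repository as a Git LFS pointer: a short text file whose lines are `key value` pairs (`version`, `oid`, `size`). A minimal sketch of parsing one, using the `training_args.bin` pointer text above:

```python
def parse_lfs_pointer(text: str) -> dict:
    # A pointer file is one "key value" pair per line, with a single
    # space separating key from value
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:394cf6c493731e254648f86939f024c0fc44bbb2fd113aab5a6cdbff1a3eda41
size 5905
"""

info = parse_lfs_pointer(pointer)
print(info["oid"])        # the sha256 object id shown in the diff
print(int(info["size"]))  # 5905
```

The diff above therefore only changes the 5905-byte pointer; the actual object it names lives in LFS storage.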