floflodebilbao committed · verified
Commit 78eaf89 · 1 Parent(s): 90de610

End of training
README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
- - Loss: 6.3965
- - Rouge1: 0.263
- - Rouge2: 0.0882
- - Rougel: 0.1876
- - Rougelsum: 0.1883
- - Gen Len: 58.26
- - Bleu: 0.0486
- - Precisions: 0.0706
 - Brevity Penalty: 1.0
- - Length Ratio: 1.6921
- - Translation Length: 2066.0
 - Reference Length: 1221.0
- - Precision: 0.8384
- - Recall: 0.8672
- - F1: 0.8525
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -64,24 +64,27 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
- - num_epochs: 12
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | No log | 1.0 | 7 | 16.0136 | 0.2808 | 0.0965 | 0.2152 | 0.2144 | 62.06 | 0.0442 | 0.0695 | 1.0 | 1.792 | 2188.0 | 1221.0 | 0.8383 | 0.8742 | 0.8558 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 2.0 | 14 | 14.6056 | 0.2678 | 0.0858 | 0.2029 | 0.2024 | 61.62 | 0.0382 | 0.0643 | 1.0 | 1.7682 | 2159.0 | 1221.0 | 0.8364 | 0.8711 | 0.8533 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 3.0 | 21 | 13.4680 | 0.2748 | 0.0925 | 0.2057 | 0.2064 | 60.4 | 0.0413 | 0.0668 | 1.0 | 1.7707 | 2162.0 | 1221.0 | 0.8408 | 0.8727 | 0.8564 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 4.0 | 28 | 12.5233 | 0.2788 | 0.0978 | 0.2096 | 0.2105 | 59.46 | 0.0446 | 0.0706 | 1.0 | 1.7428 | 2128.0 | 1221.0 | 0.8421 | 0.8729 | 0.8571 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 5.0 | 35 | 11.6894 | 0.2761 | 0.0969 | 0.2069 | 0.2063 | 59.28 | 0.0444 | 0.07 | 1.0 | 1.7363 | 2120.0 | 1221.0 | 0.842 | 0.8724 | 0.8568 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 6.0 | 42 | 10.9117 | 0.2817 | 0.1021 | 0.2067 | 0.2072 | 59.18 | 0.0474 | 0.0725 | 1.0 | 1.7166 | 2096.0 | 1221.0 | 0.8425 | 0.8737 | 0.8577 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 7.0 | 49 | 10.1201 | 0.2868 | 0.1096 | 0.2129 | 0.2126 | 59.34 | 0.0524 | 0.0765 | 1.0 | 1.7183 | 2098.0 | 1221.0 | 0.8433 | 0.8747 | 0.8586 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 8.0 | 56 | 9.2986 | 0.2713 | 0.1036 | 0.1995 | 0.1995 | 59.66 | 0.0517 | 0.0738 | 1.0 | 1.7191 | 2099.0 | 1221.0 | 0.8396 | 0.8713 | 0.8551 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 9.0 | 63 | 8.3929 | 0.2673 | 0.0984 | 0.1922 | 0.1927 | 59.66 | 0.0488 | 0.0715 | 1.0 | 1.7346 | 2118.0 | 1221.0 | 0.8376 | 0.8702 | 0.8535 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 10.0 | 70 | 7.4642 | 0.2635 | 0.0912 | 0.19 | 0.1897 | 57.88 | 0.0477 | 0.0709 | 1.0 | 1.697 | 2072.0 | 1221.0 | 0.8394 | 0.868 | 0.8534 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 11.0 | 77 | 6.7157 | 0.2657 | 0.0892 | 0.1878 | 0.1885 | 58.56 | 0.0489 | 0.0711 | 1.0 | 1.7093 | 2087.0 | 1221.0 | 0.8391 | 0.8678 | 0.8531 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | No log | 12.0 | 84 | 6.3965 | 0.263 | 0.0882 | 0.1876 | 0.1883 | 58.26 | 0.0486 | 0.0706 | 1.0 | 1.6921 | 2066.0 | 1221.0 | 0.8384 | 0.8672 | 0.8525 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
 
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
+ - Loss: 4.2146
+ - Rouge1: 0.1393
+ - Rouge2: 0.0277
+ - Rougel: 0.0969
+ - Rougelsum: 0.0972
+ - Gen Len: 62.6
+ - Bleu: 0.0075
+ - Precisions: 0.0304
 - Brevity Penalty: 1.0
+ - Length Ratio: 1.6642
+ - Translation Length: 2032.0
 - Reference Length: 1221.0
+ - Precision: 0.7816
+ - Recall: 0.8361
+ - F1: 0.8076
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
 
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
+ - num_epochs: 15
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
+ | No log | 1.0 | 7 | 26.0662 | 0.2738 | 0.0927 | 0.2091 | 0.2076 | 62.06 | 0.0413 | 0.0659 | 1.0 | 1.8026 | 2201.0 | 1221.0 | 0.8359 | 0.8722 | 0.8536 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 2.0 | 14 | 23.4955 | 0.2701 | 0.0901 | 0.2019 | 0.2013 | 61.62 | 0.0408 | 0.0663 | 1.0 | 1.7772 | 2170.0 | 1221.0 | 0.8376 | 0.8724 | 0.8546 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 3.0 | 21 | 21.3631 | 0.2632 | 0.0908 | 0.2003 | 0.2002 | 60.82 | 0.0412 | 0.0657 | 1.0 | 1.7609 | 2150.0 | 1221.0 | 0.8383 | 0.8706 | 0.854 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 4.0 | 28 | 19.5442 | 0.275 | 0.0983 | 0.2073 | 0.2078 | 59.64 | 0.0445 | 0.07 | 1.0 | 1.7387 | 2123.0 | 1221.0 | 0.8413 | 0.8726 | 0.8566 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 5.0 | 35 | 17.8808 | 0.2668 | 0.0941 | 0.199 | 0.1997 | 59.64 | 0.0437 | 0.0681 | 1.0 | 1.7363 | 2120.0 | 1221.0 | 0.8391 | 0.8697 | 0.854 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 6.0 | 42 | 16.2547 | 0.2626 | 0.0885 | 0.1925 | 0.1934 | 59.42 | 0.0399 | 0.0642 | 1.0 | 1.7248 | 2106.0 | 1221.0 | 0.838 | 0.8682 | 0.8528 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 7.0 | 49 | 14.4184 | 0.2581 | 0.0873 | 0.1848 | 0.1857 | 59.4 | 0.0412 | 0.0648 | 1.0 | 1.7224 | 2103.0 | 1221.0 | 0.8373 | 0.8689 | 0.8527 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 8.0 | 56 | 11.9307 | 0.2614 | 0.0932 | 0.1875 | 0.1881 | 59.06 | 0.0478 | 0.0696 | 1.0 | 1.7191 | 2099.0 | 1221.0 | 0.8369 | 0.8688 | 0.8525 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 9.0 | 63 | 8.0749 | 0.253 | 0.0819 | 0.182 | 0.1827 | 56.46 | 0.0471 | 0.0686 | 1.0 | 1.6355 | 1997.0 | 1221.0 | 0.837 | 0.864 | 0.8502 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 10.0 | 70 | 4.2463 | 0.169 | 0.0318 | 0.1209 | 0.1214 | 57.5 | 0.0065 | 0.0337 | 1.0 | 1.5995 | 1953.0 | 1221.0 | 0.8068 | 0.8444 | 0.8249 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 11.0 | 77 | 4.1345 | 0.1156 | 0.0266 | 0.0848 | 0.0857 | 59.86 | 0.007 | 0.0268 | 1.0 | 1.5119 | 1846.0 | 1221.0 | 0.7703 | 0.828 | 0.7976 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 12.0 | 84 | 4.2879 | 0.1025 | 0.024 | 0.0832 | 0.0839 | 63.0 | 0.0081 | 0.0261 | 1.0 | 1.5111 | 1845.0 | 1221.0 | 0.7695 | 0.8276 | 0.797 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 13.0 | 91 | 4.3117 | 0.1259 | 0.0306 | 0.0945 | 0.0953 | 63.0 | 0.0086 | 0.0297 | 1.0 | 1.6126 | 1969.0 | 1221.0 | 0.7731 | 0.8332 | 0.8015 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 14.0 | 98 | 4.2535 | 0.1287 | 0.0281 | 0.0874 | 0.0876 | 63.0 | 0.0077 | 0.0286 | 1.0 | 1.6847 | 2057.0 | 1221.0 | 0.7804 | 0.835 | 0.8064 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | No log | 15.0 | 105 | 4.2146 | 0.1393 | 0.0277 | 0.0969 | 0.0972 | 62.6 | 0.0075 | 0.0304 | 1.0 | 1.6642 | 2032.0 | 1221.0 | 0.7816 | 0.8361 | 0.8076 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
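As a sanity check on the BLEU statistics in the updated card: the length ratio is translation length divided by reference length, and the brevity penalty is 1.0 whenever the candidate corpus is at least as long as the reference. A minimal plain-Python sketch (the lengths are the final-epoch values from the table above; `brevity_penalty` here is the standard BLEU formula, not code from this repository):

```python
import math

def brevity_penalty(candidate_len: int, reference_len: int) -> float:
    """Standard BLEU brevity penalty: 1.0 when the candidate corpus is
    at least as long as the reference, exp(1 - r/c) otherwise."""
    if candidate_len >= reference_len:
        return 1.0
    return math.exp(1.0 - reference_len / candidate_len)

# Final-epoch values from the updated README table.
translation_len, reference_len = 2032, 1221

length_ratio = translation_len / reference_len
print(round(length_ratio, 4))                           # 1.6642, as reported
print(brevity_penalty(translation_len, reference_len))  # 1.0, as reported
```

Since the translations are about 1.66x longer than the references at every epoch, the brevity penalty stays pinned at 1.0 throughout the table.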
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:5071af394842342deab1733f4cf6d2b20a058394a5fced738404c55f12f5edb8
+ oid sha256:8a290064e280e4f021c894bdb6108e2564a85a9399219cc48f31a7af6ce4f385
 size 1187780840
runs/Jul10_11-29-30_tardis/events.out.tfevents.1752139771.tardis.31127.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a982942638605353a193ba3939d9c490aa732b0e97af5ee51e22818fd2476c5b
+ size 22677
tokenizer.json CHANGED
@@ -7,7 +7,9 @@
 "stride": 0
 },
 "padding": {
- "strategy": "BatchLongest",
+ "strategy": {
+ "Fixed": 64
+ },
 "direction": "Right",
 "pad_to_multiple_of": null,
 "pad_id": 0,
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
- oid sha256:8a2d556348a921680183ad1254c2a5b9584016181e8d4bf82b12d8f99191d41a
+ oid sha256:0cd4a86a23a38d5031a3fb7df063cff335a8a4f717125ba0f5d4aed3653f24e0
 size 5905
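The step counts in the README's training table are consistent with the hyperparameters it lists: 7 optimizer steps per epoch at a total train batch size of 16, over the 15 epochs this commit trains for, gives the 105 steps shown in the final row. The dataset size is not stated in the card, but the steps-per-epoch count pins it to a range. A quick arithmetic check (all figures taken from the README; the dataset-size bounds are an inference, not a documented value):

```python
import math

total_train_batch_size = 16  # from the README hyperparameters
num_epochs = 15              # updated from 12 in this commit
steps_per_epoch = 7          # from the training-results table

# steps_per_epoch == ceil(n_examples / batch_size), which bounds the
# (unstated) training-set size.
n_min = total_train_batch_size * (steps_per_epoch - 1) + 1  # 97
n_max = total_train_batch_size * steps_per_epoch            # 112

total_steps = steps_per_epoch * num_epochs
print(total_steps)  # 105, matching the final table row
```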