floflodebilbao committed (verified)
Commit 356f41e · Parent(s): 1d7fb44

End of training
README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
- - Loss: 1.2179
- - Rouge1: 0.2731
- - Rouge2: 0.0923
- - Rougel: 0.2191
- - Rougelsum: 0.2187
- - Gen Len: 28.36
- - Bleu: 0.0405
- - Precisions: 0.0907
- - Brevity Penalty: 0.8678
- - Length Ratio: 0.8758
- - Translation Length: 1058.0
+ - Loss: 1.1713
+ - Rouge1: 0.32
+ - Rouge2: 0.1176
+ - Rougel: 0.2492
+ - Rougelsum: 0.247
+ - Gen Len: 29.02
+ - Bleu: 0.0549
+ - Precisions: 0.1112
+ - Brevity Penalty: 0.8809
+ - Length Ratio: 0.8874
+ - Translation Length: 1072.0
  - Reference Length: 1208.0
- - Precision: 0.8678
- - Recall: 0.8682
- - F1: 0.868
+ - Precision: 0.8793
+ - Recall: 0.8771
+ - F1: 0.8781
  - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -56,7 +56,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
- - learning_rate: 0.001
+ - learning_rate: 0.002
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42
@@ -70,21 +70,21 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | 25.0003 | 1.0 | 7 | 20.5904 | 0.1617 | 0.0323 | 0.1218 | 0.122 | 30.9 | 0.0115 | 0.0407 | 0.9438 | 0.9454 | 1142.0 | 1208.0 | 0.8375 | 0.8486 | 0.843 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 13.5817 | 2.0 | 14 | 6.1579 | 0.0 | 0.0 | 0.0 | 0.0 | 31.0 | 0.0 | 0.0 | 0.0163 | 0.1954 | 236.0 | 1208.0 | 0.685 | 0.8056 | 0.7402 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 5.717 | 3.0 | 21 | 4.7056 | 0.1638 | 0.0259 | 0.1312 | 0.1317 | 30.52 | 0.0 | 0.04 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.836 | 0.8456 | 0.8407 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 4.3672 | 4.0 | 28 | 3.9200 | 0.1752 | 0.0441 | 0.1429 | 0.1431 | 30.58 | 0.0227 | 0.0455 | 1.0 | 1.0157 | 1227.0 | 1208.0 | 0.8309 | 0.8494 | 0.8399 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.7524 | 5.0 | 35 | 3.5438 | 0.1996 | 0.0506 | 0.1568 | 0.1571 | 29.36 | 0.0242 | 0.0593 | 0.8762 | 0.8833 | 1067.0 | 1208.0 | 0.8468 | 0.8547 | 0.8506 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.3815 | 6.0 | 42 | 3.2724 | 0.2124 | 0.0482 | 0.1594 | 0.1596 | 29.9 | 0.0181 | 0.0585 | 0.9092 | 0.9131 | 1103.0 | 1208.0 | 0.8468 | 0.8541 | 0.8504 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.0636 | 7.0 | 49 | 2.8208 | 0.2145 | 0.0486 | 0.163 | 0.1635 | 29.86 | 0.0182 | 0.0588 | 0.9191 | 0.9222 | 1114.0 | 1208.0 | 0.8457 | 0.8527 | 0.8491 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.615 | 8.0 | 56 | 2.1060 | 0.2136 | 0.0477 | 0.1639 | 0.1647 | 29.22 | 0.0204 | 0.0621 | 0.8744 | 0.8816 | 1065.0 | 1208.0 | 0.8512 | 0.8556 | 0.8533 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.1516 | 9.0 | 63 | 1.6387 | 0.2264 | 0.059 | 0.1764 | 0.1773 | 29.3 | 0.0286 | 0.0681 | 0.8983 | 0.9031 | 1091.0 | 1208.0 | 0.854 | 0.8574 | 0.8556 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 1.8434 | 10.0 | 70 | 1.3553 | 0.2393 | 0.0651 | 0.1883 | 0.188 | 28.76 | 0.0258 | 0.0715 | 0.8781 | 0.8849 | 1069.0 | 1208.0 | 0.8572 | 0.86 | 0.8585 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 1.5226 | 11.0 | 77 | 1.3014 | 0.2528 | 0.0743 | 0.1974 | 0.1977 | 28.8 | 0.0309 | 0.0776 | 0.8836 | 0.8899 | 1075.0 | 1208.0 | 0.8637 | 0.8638 | 0.8637 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 1.3897 | 12.0 | 84 | 1.2445 | 0.2544 | 0.0788 | 0.1998 | 0.2008 | 28.62 | 0.0312 | 0.0788 | 0.8716 | 0.8791 | 1062.0 | 1208.0 | 0.8637 | 0.8638 | 0.8636 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 1.3352 | 13.0 | 91 | 1.2288 | 0.2643 | 0.0898 | 0.2118 | 0.211 | 28.4 | 0.0392 | 0.0873 | 0.865 | 0.8733 | 1055.0 | 1208.0 | 0.8668 | 0.8666 | 0.8666 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 1.2825 | 14.0 | 98 | 1.2213 | 0.2708 | 0.0932 | 0.219 | 0.2187 | 28.28 | 0.0407 | 0.0907 | 0.8622 | 0.8709 | 1052.0 | 1208.0 | 0.8686 | 0.868 | 0.8683 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 1.2418 | 15.0 | 105 | 1.2179 | 0.2731 | 0.0923 | 0.2191 | 0.2187 | 28.36 | 0.0405 | 0.0907 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.8678 | 0.8682 | 0.868 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 22.0825 | 1.0 | 7 | 5.4800 | 0.0291 | 0.0123 | 0.0277 | 0.0269 | 31.0 | 0.0052 | 0.0246 | 0.3095 | 0.4603 | 556.0 | 1208.0 | 0.7219 | 0.8126 | 0.7641 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 5.6725 | 2.0 | 14 | 4.3850 | 0.1866 | 0.0375 | 0.1436 | 0.144 | 30.42 | 0.0 | 0.0501 | 0.8771 | 0.8841 | 1068.0 | 1208.0 | 0.8379 | 0.8489 | 0.8433 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.984 | 3.0 | 21 | 3.5872 | 0.2261 | 0.0642 | 0.1787 | 0.178 | 29.42 | 0.0293 | 0.0697 | 0.879 | 0.8858 | 1070.0 | 1208.0 | 0.8525 | 0.8587 | 0.8554 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.4293 | 4.0 | 28 | 3.1619 | 0.2643 | 0.0782 | 0.2047 | 0.2023 | 27.82 | 0.0343 | 0.0845 | 0.8451 | 0.856 | 1034.0 | 1208.0 | 0.8639 | 0.864 | 0.8639 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 2.9605 | 5.0 | 35 | 2.3087 | 0.2747 | 0.0788 | 0.2047 | 0.2033 | 28.2 | 0.0346 | 0.088 | 0.8518 | 0.8618 | 1041.0 | 1208.0 | 0.8645 | 0.8649 | 0.8646 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 2.4803 | 6.0 | 42 | 1.4041 | 0.3034 | 0.108 | 0.2334 | 0.2313 | 27.9 | 0.0481 | 0.1029 | 0.8584 | 0.8675 | 1048.0 | 1208.0 | 0.8735 | 0.8714 | 0.8724 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.4665 | 7.0 | 49 | 1.2358 | 0.3036 | 0.1109 | 0.2331 | 0.2316 | 28.1 | 0.047 | 0.1054 | 0.8706 | 0.8783 | 1061.0 | 1208.0 | 0.8761 | 0.8734 | 0.8747 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.2094 | 8.0 | 56 | 1.1991 | 0.3016 | 0.1141 | 0.2378 | 0.236 | 28.1 | 0.0469 | 0.1055 | 0.8631 | 0.8717 | 1053.0 | 1208.0 | 0.8777 | 0.8744 | 0.876 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.1101 | 9.0 | 63 | 1.1823 | 0.3041 | 0.1146 | 0.2466 | 0.2436 | 27.92 | 0.049 | 0.108 | 0.8365 | 0.8485 | 1025.0 | 1208.0 | 0.8793 | 0.8756 | 0.8774 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.2526 | 10.0 | 70 | 1.1754 | 0.3425 | 0.1338 | 0.2713 | 0.2702 | 27.94 | 0.0569 | 0.1207 | 0.8461 | 0.8568 | 1035.0 | 1208.0 | 0.8851 | 0.88 | 0.8825 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.0465 | 11.0 | 77 | 1.1696 | 0.3047 | 0.1143 | 0.2414 | 0.239 | 28.38 | 0.0527 | 0.1071 | 0.8659 | 0.8742 | 1056.0 | 1208.0 | 0.8787 | 0.875 | 0.8768 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.0647 | 12.0 | 84 | 1.1707 | 0.3154 | 0.1175 | 0.2473 | 0.2446 | 28.7 | 0.0525 | 0.109 | 0.8631 | 0.8717 | 1053.0 | 1208.0 | 0.8798 | 0.8756 | 0.8776 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 0.9992 | 13.0 | 91 | 1.1726 | 0.3177 | 0.1171 | 0.2468 | 0.2448 | 28.44 | 0.0538 | 0.11 | 0.8716 | 0.8791 | 1062.0 | 1208.0 | 0.879 | 0.8747 | 0.8768 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 0.9949 | 14.0 | 98 | 1.1712 | 0.3198 | 0.1172 | 0.2488 | 0.2464 | 28.98 | 0.0545 | 0.1103 | 0.8781 | 0.8849 | 1069.0 | 1208.0 | 0.8792 | 0.8768 | 0.8779 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 0.9798 | 15.0 | 105 | 1.1713 | 0.32 | 0.1176 | 0.2492 | 0.247 | 29.02 | 0.0549 | 0.1112 | 0.8809 | 0.8874 | 1072.0 | 1208.0 | 0.8793 | 0.8771 | 0.8781 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
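The Brevity Penalty and Length Ratio reported above follow the usual sacreBLEU definitions and can be checked directly from the Translation Length and Reference Length in the card. A minimal sketch (function names are illustrative, not from this repo):

```python
import math

def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """BLEU brevity penalty: penalizes candidate text shorter than the reference."""
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)

def length_ratio(sys_len: int, ref_len: int) -> float:
    """Translation length divided by reference length."""
    return sys_len / ref_len

# Final evaluation above: translation length 1072, reference length 1208
print(brevity_penalty(1072, 1208))  # ~0.8809, matching the card
print(length_ratio(1072, 1208))     # ~0.8874, matching the card
```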
adapter_config.json CHANGED
@@ -24,9 +24,9 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q",
-    "v",
     "o",
+    "v",
+    "q",
     "k"
   ],
   "task_type": "SEQ_2_SEQ_LM",
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1b79b8786a76983c98b74b5ce0a25e3d7460c2be872ac952b44122147b8d1cd2
+oid sha256:46d2a723141ef72f620aed7f02e62132244e735b1f78ad7b55b5299d344c700a
 size 7119264
runs/Jul29_11-23-34_tardis/events.out.tfevents.1753781016.tardis.17980.0 ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e5e9d9088d8fbd29b2ba1e79b04aca1970b64b0b26050feab15079da2482664
+size 25800
training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b61e7652eea5213f6491058900d4ecbe0a01af98d214ba9bd7ec7eee0d1e22bd
+oid sha256:5eb55aabc28197e14653437bd59dfdee16be0bf8b58a09630f45589591ea5296
 size 5905