floflodebilbao committed
Commit 1d7fb44 · verified · 1 Parent(s): 01af0d8

End of training

README.md CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.5549
- - Rouge1: 0.2119
- - Rouge2: 0.0501
- - Rougel: 0.1629
- - Rougelsum: 0.1634
- - Gen Len: 29.52
- - Bleu: 0.0187
- - Precisions: 0.0592
- - Brevity Penalty: 0.9137
- - Length Ratio: 0.9172
- - Translation Length: 1108.0
+ - Loss: 1.2179
+ - Rouge1: 0.2731
+ - Rouge2: 0.0923
+ - Rougel: 0.2191
+ - Rougelsum: 0.2187
+ - Gen Len: 28.36
+ - Bleu: 0.0405
+ - Precisions: 0.0907
+ - Brevity Penalty: 0.8678
+ - Length Ratio: 0.8758
+ - Translation Length: 1058.0
  - Reference Length: 1208.0
- - Precision: 0.8484
- - Recall: 0.8543
- - F1: 0.8513
+ - Precision: 0.8678
+ - Recall: 0.8682
+ - F1: 0.868
  - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
  ## Model description
@@ -64,22 +64,27 @@ The following hyperparameters were used during training:
  - total_train_batch_size: 16
  - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
- - num_epochs: 10
+ - num_epochs: 15
 
  ### Training results
 
  | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
  |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
- | 25.3187 | 1.0 | 7 | 21.7613 | 0.1761 | 0.0329 | 0.1284 | 0.1286 | 30.9 | 0.0122 | 0.0432 | 0.9333 | 0.9354 | 1130.0 | 1208.0 | 0.8387 | 0.8501 | 0.8443 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 15.331 | 2.0 | 14 | 5.3421 | 0.043 | 0.0097 | 0.0389 | 0.039 | 30.88 | 0.0 | 0.02 | 0.408 | 0.5273 | 637.0 | 1208.0 | 0.7369 | 0.818 | 0.7744 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 5.4959 | 3.0 | 21 | 4.6619 | 0.152 | 0.0198 | 0.1175 | 0.1179 | 30.66 | 0.0 | 0.0365 | 0.8697 | 0.8775 | 1060.0 | 1208.0 | 0.8309 | 0.8452 | 0.8379 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 4.3485 | 4.0 | 28 | 3.9824 | 0.1785 | 0.0373 | 0.1428 | 0.1425 | 30.1 | 0.0124 | 0.0475 | 0.8937 | 0.899 | 1086.0 | 1208.0 | 0.8338 | 0.8482 | 0.8408 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.7645 | 5.0 | 35 | 3.5558 | 0.1881 | 0.0385 | 0.1489 | 0.1484 | 29.78 | 0.0132 | 0.0506 | 0.9182 | 0.9214 | 1113.0 | 1208.0 | 0.8463 | 0.8533 | 0.8497 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.4345 | 6.0 | 42 | 3.3686 | 0.1986 | 0.0426 | 0.1524 | 0.1525 | 30.7 | 0.0177 | 0.0523 | 0.9315 | 0.9338 | 1128.0 | 1208.0 | 0.8434 | 0.8532 | 0.8482 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.2147 | 7.0 | 49 | 3.1500 | 0.1973 | 0.0412 | 0.1555 | 0.1563 | 30.1 | 0.0166 | 0.0532 | 0.9137 | 0.9172 | 1108.0 | 1208.0 | 0.8454 | 0.8526 | 0.8489 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 3.0107 | 8.0 | 56 | 2.8957 | 0.2008 | 0.0464 | 0.1574 | 0.1575 | 29.84 | 0.0174 | 0.0567 | 0.901 | 0.9056 | 1094.0 | 1208.0 | 0.847 | 0.8528 | 0.8498 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.8102 | 9.0 | 63 | 2.6612 | 0.2108 | 0.0484 | 0.1629 | 0.1636 | 29.62 | 0.0182 | 0.0582 | 0.9155 | 0.9189 | 1110.0 | 1208.0 | 0.8482 | 0.854 | 0.851 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
- | 2.6845 | 10.0 | 70 | 2.5549 | 0.2119 | 0.0501 | 0.1629 | 0.1634 | 29.52 | 0.0187 | 0.0592 | 0.9137 | 0.9172 | 1108.0 | 1208.0 | 0.8484 | 0.8543 | 0.8513 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 25.0003 | 1.0 | 7 | 20.5904 | 0.1617 | 0.0323 | 0.1218 | 0.122 | 30.9 | 0.0115 | 0.0407 | 0.9438 | 0.9454 | 1142.0 | 1208.0 | 0.8375 | 0.8486 | 0.843 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 13.5817 | 2.0 | 14 | 6.1579 | 0.0 | 0.0 | 0.0 | 0.0 | 31.0 | 0.0 | 0.0 | 0.0163 | 0.1954 | 236.0 | 1208.0 | 0.685 | 0.8056 | 0.7402 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 5.717 | 3.0 | 21 | 4.7056 | 0.1638 | 0.0259 | 0.1312 | 0.1317 | 30.52 | 0.0 | 0.04 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.836 | 0.8456 | 0.8407 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 4.3672 | 4.0 | 28 | 3.9200 | 0.1752 | 0.0441 | 0.1429 | 0.1431 | 30.58 | 0.0227 | 0.0455 | 1.0 | 1.0157 | 1227.0 | 1208.0 | 0.8309 | 0.8494 | 0.8399 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.7524 | 5.0 | 35 | 3.5438 | 0.1996 | 0.0506 | 0.1568 | 0.1571 | 29.36 | 0.0242 | 0.0593 | 0.8762 | 0.8833 | 1067.0 | 1208.0 | 0.8468 | 0.8547 | 0.8506 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.3815 | 6.0 | 42 | 3.2724 | 0.2124 | 0.0482 | 0.1594 | 0.1596 | 29.9 | 0.0181 | 0.0585 | 0.9092 | 0.9131 | 1103.0 | 1208.0 | 0.8468 | 0.8541 | 0.8504 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 3.0636 | 7.0 | 49 | 2.8208 | 0.2145 | 0.0486 | 0.163 | 0.1635 | 29.86 | 0.0182 | 0.0588 | 0.9191 | 0.9222 | 1114.0 | 1208.0 | 0.8457 | 0.8527 | 0.8491 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 2.615 | 8.0 | 56 | 2.1060 | 0.2136 | 0.0477 | 0.1639 | 0.1647 | 29.22 | 0.0204 | 0.0621 | 0.8744 | 0.8816 | 1065.0 | 1208.0 | 0.8512 | 0.8556 | 0.8533 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 2.1516 | 9.0 | 63 | 1.6387 | 0.2264 | 0.059 | 0.1764 | 0.1773 | 29.3 | 0.0286 | 0.0681 | 0.8983 | 0.9031 | 1091.0 | 1208.0 | 0.854 | 0.8574 | 0.8556 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.8434 | 10.0 | 70 | 1.3553 | 0.2393 | 0.0651 | 0.1883 | 0.188 | 28.76 | 0.0258 | 0.0715 | 0.8781 | 0.8849 | 1069.0 | 1208.0 | 0.8572 | 0.86 | 0.8585 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.5226 | 11.0 | 77 | 1.3014 | 0.2528 | 0.0743 | 0.1974 | 0.1977 | 28.8 | 0.0309 | 0.0776 | 0.8836 | 0.8899 | 1075.0 | 1208.0 | 0.8637 | 0.8638 | 0.8637 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.3897 | 12.0 | 84 | 1.2445 | 0.2544 | 0.0788 | 0.1998 | 0.2008 | 28.62 | 0.0312 | 0.0788 | 0.8716 | 0.8791 | 1062.0 | 1208.0 | 0.8637 | 0.8638 | 0.8636 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.3352 | 13.0 | 91 | 1.2288 | 0.2643 | 0.0898 | 0.2118 | 0.211 | 28.4 | 0.0392 | 0.0873 | 0.865 | 0.8733 | 1055.0 | 1208.0 | 0.8668 | 0.8666 | 0.8666 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.2825 | 14.0 | 98 | 1.2213 | 0.2708 | 0.0932 | 0.219 | 0.2187 | 28.28 | 0.0407 | 0.0907 | 0.8622 | 0.8709 | 1052.0 | 1208.0 | 0.8686 | 0.868 | 0.8683 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+ | 1.2418 | 15.0 | 105 | 1.2179 | 0.2731 | 0.0923 | 0.2191 | 0.2187 | 28.36 | 0.0405 | 0.0907 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.8678 | 0.8682 | 0.868 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
  ### Framework versions
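For context on the metric columns: the Hashcode field pins the BERTScore configuration (roberta-large, layer 17, no IDF weighting). Below is a minimal sketch of how this metric bundle is typically computed with the `evaluate` library; the prediction and reference strings are placeholders, since the eval split is not part of this commit.

```python
# Minimal sketch of the metric bundle behind the columns above, using the
# `evaluate` library. The strings are placeholders: the eval split itself
# is not part of this commit.
import evaluate

predictions = ["a generated summary"]    # hypothetical model outputs
references = ["the reference summary"]   # hypothetical gold summaries

rouge = evaluate.load("rouge")           # Rouge1 / Rouge2 / RougeL / RougeLsum
bleu = evaluate.load("bleu")             # Bleu, Precisions, Brevity Penalty,
                                         # Length Ratio, Translation/Reference Length
bertscore = evaluate.load("bertscore")   # Precision, Recall, F1, Hashcode

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))

# model_type="roberta-large" with idf=False matches the reported hashcode
# family: roberta-large_L17_no-idf_version=...(hug_trans=...)
scores = bertscore.compute(
    predictions=predictions,
    references=references,
    model_type="roberta-large",
    idf=False,
)
print(sum(scores["f1"]) / len(scores["f1"]))  # corpus-average F1, as in the card
```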
adapter_config.json CHANGED
@@ -24,10 +24,10 @@
  "rank_pattern": {},
  "revision": null,
  "target_modules": [
- "k",
  "q",
  "v",
- "o"
+ "o",
+ "k"
  ],
  "task_type": "SEQ_2_SEQ_LM",
  "trainable_token_indices": null,
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:d4f0253b2d4e013756c116b204fb4c660837a8393be04a095446d2578a81cd77
+ oid sha256:1b79b8786a76983c98b74b5ce0a25e3d7460c2be872ac952b44122147b8d1cd2
  size 7119264
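This file is a Git LFS pointer to the ~7 MB LoRA adapter weights. A minimal loading sketch, assuming a hypothetical repository id (substitute the real one):

```python
# Minimal sketch of loading the adapter this pointer refers to. The repo id
# is a placeholder; substitute the actual adapter repository.
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_id = "google/long-t5-tglobal-base"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForSeq2SeqLM.from_pretrained(base_id)

model = PeftModel.from_pretrained(base, "floflodebilbao/your-adapter-repo")  # placeholder id

inputs = tokenizer("A long document to summarize ...", return_tensors="pt")
ids = model.generate(**inputs, max_new_tokens=32)  # Gen Len above is ~28-31 tokens
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```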
runs/Jul29_10-56-25_tardis/events.out.tfevents.1753779387.tardis.17767.0 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f92ac9c57c4ea4fb9889a05628855916105bac5d8346f05f65e7b9f4f0e87417
+ size 25800
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:e6264a400e8936cca885568aaf8291265e2d1264947e3e7aa37025276af03493
+ oid sha256:b61e7652eea5213f6491058900d4ecbe0a01af98d214ba9bd7ec7eee0d1e22bd
  size 5905
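training_args.bin is a pickled Seq2SeqTrainingArguments object, which is why its hash changes along with the hyperparameters (num_epochs 10 → 15). A quick inspection sketch; unpickling requires weights_only=False, so only do this on a trusted checkout:

```python
import torch

# training_args.bin stores a pickled TrainingArguments object;
# weights_only=False is needed to unpickle it (trusted files only).
args = torch.load("training_args.bin", weights_only=False)

print(args.num_train_epochs)  # 15.0 after this commit
print(args.optim)             # OptimizerNames.ADAMW_TORCH, as in the model card
```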