End of training

- README.md +30 -25
- adapter_config.json +2 -2
- adapter_model.safetensors +1 -1
- runs/Jul29_10-56-25_tardis/events.out.tfevents.1753779387.tardis.17767.0 +3 -0
- training_args.bin +1 -1
README.md
CHANGED

@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss:
-- Rouge1: 0.
-- Rouge2: 0.
-- Rougel: 0.
-- Rougelsum: 0.
-- Gen Len:
-- Bleu: 0.
-- Precisions: 0.
-- Brevity Penalty: 0.
-- Length Ratio: 0.
-- Translation Length:
+- Loss: 1.2179
+- Rouge1: 0.2731
+- Rouge2: 0.0923
+- Rougel: 0.2191
+- Rougelsum: 0.2187
+- Gen Len: 28.36
+- Bleu: 0.0405
+- Precisions: 0.0907
+- Brevity Penalty: 0.8678
+- Length Ratio: 0.8758
+- Translation Length: 1058.0
 - Reference Length: 1208.0
-- Precision: 0.
-- Recall: 0.
-- F1: 0.
+- Precision: 0.8678
+- Recall: 0.8682
+- F1: 0.868
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description

@@ -64,22 +64,27 @@ The following hyperparameters were used during training:
 - total_train_batch_size: 16
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
-- num_epochs:
+- num_epochs: 15
 
 ### Training results
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
-| 25.
-|
-| 5.
-| 4.
-| 3.
-| 3.
-| 3.
-|
-| 2.
-|
+| 25.0003 | 1.0 | 7 | 20.5904 | 0.1617 | 0.0323 | 0.1218 | 0.122 | 30.9 | 0.0115 | 0.0407 | 0.9438 | 0.9454 | 1142.0 | 1208.0 | 0.8375 | 0.8486 | 0.843 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 13.5817 | 2.0 | 14 | 6.1579 | 0.0 | 0.0 | 0.0 | 0.0 | 31.0 | 0.0 | 0.0 | 0.0163 | 0.1954 | 236.0 | 1208.0 | 0.685 | 0.8056 | 0.7402 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 5.717 | 3.0 | 21 | 4.7056 | 0.1638 | 0.0259 | 0.1312 | 0.1317 | 30.52 | 0.0 | 0.04 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.836 | 0.8456 | 0.8407 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 4.3672 | 4.0 | 28 | 3.9200 | 0.1752 | 0.0441 | 0.1429 | 0.1431 | 30.58 | 0.0227 | 0.0455 | 1.0 | 1.0157 | 1227.0 | 1208.0 | 0.8309 | 0.8494 | 0.8399 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.7524 | 5.0 | 35 | 3.5438 | 0.1996 | 0.0506 | 0.1568 | 0.1571 | 29.36 | 0.0242 | 0.0593 | 0.8762 | 0.8833 | 1067.0 | 1208.0 | 0.8468 | 0.8547 | 0.8506 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.3815 | 6.0 | 42 | 3.2724 | 0.2124 | 0.0482 | 0.1594 | 0.1596 | 29.9 | 0.0181 | 0.0585 | 0.9092 | 0.9131 | 1103.0 | 1208.0 | 0.8468 | 0.8541 | 0.8504 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.0636 | 7.0 | 49 | 2.8208 | 0.2145 | 0.0486 | 0.163 | 0.1635 | 29.86 | 0.0182 | 0.0588 | 0.9191 | 0.9222 | 1114.0 | 1208.0 | 0.8457 | 0.8527 | 0.8491 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 2.615 | 8.0 | 56 | 2.1060 | 0.2136 | 0.0477 | 0.1639 | 0.1647 | 29.22 | 0.0204 | 0.0621 | 0.8744 | 0.8816 | 1065.0 | 1208.0 | 0.8512 | 0.8556 | 0.8533 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 2.1516 | 9.0 | 63 | 1.6387 | 0.2264 | 0.059 | 0.1764 | 0.1773 | 29.3 | 0.0286 | 0.0681 | 0.8983 | 0.9031 | 1091.0 | 1208.0 | 0.854 | 0.8574 | 0.8556 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.8434 | 10.0 | 70 | 1.3553 | 0.2393 | 0.0651 | 0.1883 | 0.188 | 28.76 | 0.0258 | 0.0715 | 0.8781 | 0.8849 | 1069.0 | 1208.0 | 0.8572 | 0.86 | 0.8585 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.5226 | 11.0 | 77 | 1.3014 | 0.2528 | 0.0743 | 0.1974 | 0.1977 | 28.8 | 0.0309 | 0.0776 | 0.8836 | 0.8899 | 1075.0 | 1208.0 | 0.8637 | 0.8638 | 0.8637 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.3897 | 12.0 | 84 | 1.2445 | 0.2544 | 0.0788 | 0.1998 | 0.2008 | 28.62 | 0.0312 | 0.0788 | 0.8716 | 0.8791 | 1062.0 | 1208.0 | 0.8637 | 0.8638 | 0.8636 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.3352 | 13.0 | 91 | 1.2288 | 0.2643 | 0.0898 | 0.2118 | 0.211 | 28.4 | 0.0392 | 0.0873 | 0.865 | 0.8733 | 1055.0 | 1208.0 | 0.8668 | 0.8666 | 0.8666 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.2825 | 14.0 | 98 | 1.2213 | 0.2708 | 0.0932 | 0.219 | 0.2187 | 28.28 | 0.0407 | 0.0907 | 0.8622 | 0.8709 | 1052.0 | 1208.0 | 0.8686 | 0.868 | 0.8683 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.2418 | 15.0 | 105 | 1.2179 | 0.2731 | 0.0923 | 0.2191 | 0.2187 | 28.36 | 0.0405 | 0.0907 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.8678 | 0.8682 | 0.868 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
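As a sanity check, the BLEU length statistics in the final evaluation row (epoch 15: translation length 1058.0, reference length 1208.0) are internally consistent. A small stdlib sketch, assuming the standard BLEU brevity-penalty formula (the numbers are taken from the table above):

```python
import math

# Length statistics from the epoch-15 eval row above.
trans_len, ref_len = 1058, 1208

length_ratio = trans_len / ref_len
# Standard BLEU brevity penalty: 1 if the candidate is longer than the
# reference, otherwise exp(1 - ref_len / trans_len).
brevity_penalty = 1.0 if trans_len > ref_len else math.exp(1 - ref_len / trans_len)

print(round(length_ratio, 4))     # 0.8758, matching "Length Ratio"
print(round(brevity_penalty, 4))  # 0.8678, matching "Brevity Penalty"
```

Both values reproduce the reported "Length Ratio" and "Brevity Penalty" to four decimal places.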
adapter_config.json
CHANGED

@@ -24,10 +24,10 @@
 "rank_pattern": {},
 "revision": null,
 "target_modules": [
-  "k",
   "q",
   "v",
-  "o"
+  "o",
+  "k"
 ],
 "task_type": "SEQ_2_SEQ_LM",
 "trainable_token_indices": null,
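The `adapter_config.json` change only reorders the `target_modules` list (hence the +2/−2 in the file summary): the set of adapted attention projections is unchanged. A minimal stdlib sketch with the values copied from the diff:

```python
import json

# target_modules before and after the commit, copied from the diff.
before = '["k", "q", "v", "o"]'
after = '["q", "v", "o", "k"]'

old = json.loads(before)
new = json.loads(after)

# The commit only reorders the list; the set of LoRA-adapted
# projections (q, k, v, o) is identical.
assert set(old) == set(new) == {"q", "k", "v", "o"}
print("same modules, different order:", old != new)  # True
```

A reordering like this still changes the serialized file byte-for-byte, which is why the file shows up in the commit even though the adapter targets the same modules.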
adapter_model.safetensors
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:1b79b8786a76983c98b74b5ce0a25e3d7460c2be872ac952b44122147b8d1cd2
 size 7119264
runs/Jul29_10-56-25_tardis/events.out.tfevents.1753779387.tardis.17767.0
ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:f92ac9c57c4ea4fb9889a05628855916105bac5d8346f05f65e7b9f4f0e87417
+size 25800
training_args.bin
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b61e7652eea5213f6491058900d4ecbe0a01af98d214ba9bd7ec7eee0d1e22bd
 size 5905
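The binary files in this commit are stored as Git LFS pointers: the three-line `version`/`oid`/`size` format shown above. A hypothetical stdlib sketch for parsing such a pointer and checking a downloaded blob against it (the pointer text is copied from the `training_args.bin` diff; `parse_pointer`, `verify`, and the sample blob are illustrative, not part of any library):

```python
import hashlib

def parse_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file (key-value lines separated by a space)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    return {
        "oid": fields["oid"].removeprefix("sha256:"),
        "size": int(fields["size"]),
    }

def verify(blob: bytes, pointer: dict) -> bool:
    """A blob matches its pointer iff both the size and the SHA-256 agree."""
    return (len(blob) == pointer["size"]
            and hashlib.sha256(blob).hexdigest() == pointer["oid"])

# Pointer contents from the training_args.bin diff above.
pointer = parse_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:b61e7652eea5213f6491058900d4ecbe0a01af98d214ba9bd7ec7eee0d1e22bd\n"
    "size 5905\n"
)

print(pointer["size"])                        # 5905
print(verify(b"not the real file", pointer))  # False
```

This is the same check `git lfs` performs when materializing files: the repository itself only versions the small pointer, while the 7 MB adapter weights live in LFS storage.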