End of training

Files changed:
- README.md +30 -30
- adapter_config.json +2 -2
- adapter_model.safetensors +1 -1
- runs/Jul29_11-23-34_tardis/events.out.tfevents.1753781016.tardis.17980.0 +3 -0
- training_args.bin +1 -1
README.md
CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
-- Rouge1: 0.
-- Rouge2: 0.
-- Rougel: 0.
-- Rougelsum: 0.
-- Gen Len:
-- Bleu: 0.
-- Precisions: 0.
-- Brevity Penalty: 0.
-- Length Ratio: 0.
-- Translation Length:
+- Loss: 1.1713
+- Rouge1: 0.32
+- Rouge2: 0.1176
+- Rougel: 0.2492
+- Rougelsum: 0.247
+- Gen Len: 29.02
+- Bleu: 0.0549
+- Precisions: 0.1112
+- Brevity Penalty: 0.8809
+- Length Ratio: 0.8874
+- Translation Length: 1072.0
 - Reference Length: 1208.0
-- Precision: 0.
-- Recall: 0.
-- F1: 0.
+- Precision: 0.8793
+- Recall: 0.8771
+- F1: 0.8781
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description

@@ -56,7 +56,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 0.
+- learning_rate: 0.002
 - train_batch_size: 1
 - eval_batch_size: 1
 - seed: 42

@@ -70,21 +70,21 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
-
-
-
-
-
-
-
-
-
-| 1.
-| 1.
-| 1.
-
-
-
+| 22.0825       | 1.0   | 7    | 5.4800          | 0.0291 | 0.0123 | 0.0277 | 0.0269    | 31.0    | 0.0052 | 0.0246     | 0.3095          | 0.4603       | 556.0              | 1208.0           | 0.7219    | 0.8126 | 0.7641 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 5.6725        | 2.0   | 14   | 4.3850          | 0.1866 | 0.0375 | 0.1436 | 0.144     | 30.42   | 0.0    | 0.0501     | 0.8771          | 0.8841       | 1068.0             | 1208.0           | 0.8379    | 0.8489 | 0.8433 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.984         | 3.0   | 21   | 3.5872          | 0.2261 | 0.0642 | 0.1787 | 0.178     | 29.42   | 0.0293 | 0.0697     | 0.879           | 0.8858       | 1070.0             | 1208.0           | 0.8525    | 0.8587 | 0.8554 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.4293        | 4.0   | 28   | 3.1619          | 0.2643 | 0.0782 | 0.2047 | 0.2023    | 27.82   | 0.0343 | 0.0845     | 0.8451          | 0.856        | 1034.0             | 1208.0           | 0.8639    | 0.864  | 0.8639 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 2.9605        | 5.0   | 35   | 2.3087          | 0.2747 | 0.0788 | 0.2047 | 0.2033    | 28.2    | 0.0346 | 0.088      | 0.8518          | 0.8618       | 1041.0             | 1208.0           | 0.8645    | 0.8649 | 0.8646 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 2.4803        | 6.0   | 42   | 1.4041          | 0.3034 | 0.108  | 0.2334 | 0.2313    | 27.9    | 0.0481 | 0.1029     | 0.8584          | 0.8675       | 1048.0             | 1208.0           | 0.8735    | 0.8714 | 0.8724 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.4665        | 7.0   | 49   | 1.2358          | 0.3036 | 0.1109 | 0.2331 | 0.2316    | 28.1    | 0.047  | 0.1054     | 0.8706          | 0.8783       | 1061.0             | 1208.0           | 0.8761    | 0.8734 | 0.8747 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.2094        | 8.0   | 56   | 1.1991          | 0.3016 | 0.1141 | 0.2378 | 0.236     | 28.1    | 0.0469 | 0.1055     | 0.8631          | 0.8717       | 1053.0             | 1208.0           | 0.8777    | 0.8744 | 0.876  | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.1101        | 9.0   | 63   | 1.1823          | 0.3041 | 0.1146 | 0.2466 | 0.2436    | 27.92   | 0.049  | 0.108      | 0.8365          | 0.8485       | 1025.0             | 1208.0           | 0.8793    | 0.8756 | 0.8774 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.2526        | 10.0  | 70   | 1.1754          | 0.3425 | 0.1338 | 0.2713 | 0.2702    | 27.94   | 0.0569 | 0.1207     | 0.8461          | 0.8568       | 1035.0             | 1208.0           | 0.8851    | 0.88   | 0.8825 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.0465        | 11.0  | 77   | 1.1696          | 0.3047 | 0.1143 | 0.2414 | 0.239     | 28.38   | 0.0527 | 0.1071     | 0.8659          | 0.8742       | 1056.0             | 1208.0           | 0.8787    | 0.875  | 0.8768 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 1.0647        | 12.0  | 84   | 1.1707          | 0.3154 | 0.1175 | 0.2473 | 0.2446    | 28.7    | 0.0525 | 0.109      | 0.8631          | 0.8717       | 1053.0             | 1208.0           | 0.8798    | 0.8756 | 0.8776 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 0.9992        | 13.0  | 91   | 1.1726          | 0.3177 | 0.1171 | 0.2468 | 0.2448    | 28.44   | 0.0538 | 0.11       | 0.8716          | 0.8791       | 1062.0             | 1208.0           | 0.879     | 0.8747 | 0.8768 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 0.9949        | 14.0  | 98   | 1.1712          | 0.3198 | 0.1172 | 0.2488 | 0.2464    | 28.98   | 0.0545 | 0.1103     | 0.8781          | 0.8849       | 1069.0             | 1208.0           | 0.8792    | 0.8768 | 0.8779 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 0.9798        | 15.0  | 105  | 1.1713          | 0.32   | 0.1176 | 0.2492 | 0.247     | 29.02   | 0.0549 | 0.1112     | 0.8809          | 0.8874       | 1072.0             | 1208.0           | 0.8793    | 0.8771 | 0.8781 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
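As a sanity check (not part of the model card): the BLEU length ratio and brevity penalty follow directly from the reported translation and reference lengths, and BERTScore's F1 is close to the harmonic mean of the reported precision and recall. A minimal sketch using only values from the updated card above:

```python
import math

# Values reported in the updated model card
trans_len, ref_len = 1072.0, 1208.0   # Translation Length, Reference Length
p, r = 0.8793, 0.8771                 # BERTScore Precision, Recall

# BLEU length ratio: candidate length over reference length
length_ratio = trans_len / ref_len    # -> 0.8874

# BLEU brevity penalty: exp(1 - ref/cand) when the candidate is shorter
brevity_penalty = math.exp(1 - ref_len / trans_len) if trans_len < ref_len else 1.0
                                      # -> 0.8809

# Harmonic mean of P and R; BERTScore averages F1 per example,
# so the card's 0.8781 agrees only approximately
f1 = 2 * p * r / (p + r)              # ~ 0.8782

print(round(length_ratio, 4), round(brevity_penalty, 4), round(f1, 4))
```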
adapter_config.json
CHANGED
@@ -24,9 +24,9 @@
 "rank_pattern": {},
 "revision": null,
 "target_modules": [
-  "q",
-  "v",
   "o",
+  "v",
+  "q",
   "k"
 ],
 "task_type": "SEQ_2_SEQ_LM",
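The only change to adapter_config.json is the serialization order of `target_modules`; as a set, the LoRA adapter still targets the same four attention projections. A minimal stdlib check (list contents taken from the diff above):

```python
import json

# target_modules before and after this commit, per the diff
before = json.loads('["q", "v", "o", "k"]')
after = json.loads('["o", "v", "q", "k"]')

# Same modules, different order: the adapter configuration is
# semantically unchanged
assert set(before) == set(after) == {"q", "k", "v", "o"}
print(sorted(after))  # ['k', 'o', 'q', 'v']
```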
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:46d2a723141ef72f620aed7f02e62132244e735b1f78ad7b55b5299d344c700a
 size 7119264
runs/Jul29_11-23-34_tardis/events.out.tfevents.1753781016.tardis.17980.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:6e5e9d9088d8fbd29b2ba1e79b04aca1970b64b0b26050feab15079da2482664
+size 25800
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:5eb55aabc28197e14653437bd59dfdee16be0bf8b58a09630f45589591ea5296
 size 5905
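The binary artifacts above (adapter_model.safetensors, the tfevents file, training_args.bin) are stored as Git LFS pointer files, which is why each diff is just a version/oid/size triple. A minimal sketch of a parser for that pointer layout, using the training_args.bin pointer from the diff (`parse_lfs_pointer` is an illustrative helper, not part of any library):

```python
# A Git LFS pointer file is a short text file of "key value" lines;
# this example is the training_args.bin pointer from the commit above.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:5eb55aabc28197e14653437bd59dfdee16be0bf8b58a09630f45589591ea5296
size 5905
"""

def parse_lfs_pointer(text: str) -> dict:
    # Split each non-empty line on the first space into key/value pairs
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "hash_algo": algo,            # "sha256"
        "digest": digest,             # hash of the real file content
        "size": int(fields["size"]),  # size of the real file in bytes
    }

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"], len(info["digest"]))  # sha256 5905 64
```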