End of training
Browse files

- README.md +25 -25
- adapter_config.json +3 -3
- adapter_model.safetensors +1 -1
- runs/Jul25_10-38-21_tardis/events.out.tfevents.1753432702.tardis.586390.0 +3 -0
- training_args.bin +1 -1
README.md
CHANGED

```diff
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss:
-- Rouge1: 0.
-- Rouge2: 0.
-- Rougel: 0.
-- Rougelsum: 0.
-- Gen Len:
-- Bleu: 0.
-- Precisions: 0.
-- Brevity Penalty:
-- Length Ratio:
-- Translation Length:
+- Loss: 3.5772
+- Rouge1: 0.477
+- Rouge2: 0.2582
+- Rougel: 0.4063
+- Rougelsum: 0.4058
+- Gen Len: 29.72
+- Bleu: 0.1684
+- Precisions: 0.2244
+- Brevity Penalty: 0.9147
+- Length Ratio: 0.9181
+- Translation Length: 1121.0
 - Reference Length: 1221.0
-- Precision: 0.
-- Recall: 0.
-- F1: 0.
+- Precision: 0.906
+- Recall: 0.9034
+- F1: 0.9046
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -56,7 +56,7 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate:
+- learning_rate: 0.001
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
@@ -69,16 +69,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
-
-
-
-
-
-
-
-
-
-
+| 8.2369 | 1.0 | 13 | 6.2010 | 0.3878 | 0.1759 | 0.3252 | 0.3253 | 31.78 | 0.111 | 0.1493 | 1.0 | 1.0737 | 1311.0 | 1221.0 | 0.8831 | 0.8831 | 0.883 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 5.5627 | 2.0 | 26 | 5.2052 | 0.4251 | 0.2192 | 0.3738 | 0.3736 | 26.24 | 0.1227 | 0.2064 | 0.772 | 0.7944 | 970.0 | 1221.0 | 0.9067 | 0.8935 | 0.8999 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 4.4273 | 3.0 | 39 | 3.7823 | 0.4604 | 0.2497 | 0.3967 | 0.3971 | 27.26 | 0.1501 | 0.2249 | 0.8192 | 0.8337 | 1018.0 | 1221.0 | 0.9063 | 0.8994 | 0.9027 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.9367 | 4.0 | 52 | 3.6272 | 0.4554 | 0.2512 | 0.3954 | 0.3955 | 26.46 | 0.1504 | 0.2382 | 0.77 | 0.7928 | 968.0 | 1221.0 | 0.908 | 0.8965 | 0.9021 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.7676 | 5.0 | 65 | 3.5810 | 0.4683 | 0.2639 | 0.4067 | 0.4087 | 26.1 | 0.1551 | 0.249 | 0.7518 | 0.7781 | 950.0 | 1221.0 | 0.9154 | 0.9021 | 0.9086 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.6775 | 6.0 | 78 | 3.5931 | 0.4613 | 0.2477 | 0.3953 | 0.3952 | 29.62 | 0.1551 | 0.2141 | 0.8985 | 0.9034 | 1103.0 | 1221.0 | 0.9042 | 0.9016 | 0.9028 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.5802 | 7.0 | 91 | 3.5738 | 0.4599 | 0.2447 | 0.3889 | 0.3901 | 29.5 | 0.156 | 0.2092 | 0.92 | 0.923 | 1127.0 | 1221.0 | 0.904 | 0.9005 | 0.9022 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.5271 | 8.0 | 104 | 3.5739 | 0.4665 | 0.2559 | 0.3987 | 0.3986 | 28.38 | 0.1583 | 0.2278 | 0.8553 | 0.8649 | 1056.0 | 1221.0 | 0.9089 | 0.9027 | 0.9057 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.4856 | 9.0 | 117 | 3.5726 | 0.4653 | 0.2426 | 0.4004 | 0.3997 | 30.3 | 0.1569 | 0.2081 | 0.9401 | 0.9419 | 1150.0 | 1221.0 | 0.9012 | 0.9009 | 0.901 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.419 | 10.0 | 130 | 3.5772 | 0.477 | 0.2582 | 0.4063 | 0.4058 | 29.72 | 0.1684 | 0.2244 | 0.9147 | 0.9181 | 1121.0 | 1221.0 | 0.906 | 0.9034 | 0.9046 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
```
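The BLEU length statistics and the BERTScore F1 reported for the final epoch can be cross-checked by hand: the length ratio is translation length over reference length, the brevity penalty is exp(1 − reference/translation) for a too-short candidate, and F1 is the harmonic mean of precision and recall. A small sanity-check sketch (values copied from the table above; the formulas are the standard BLEU and BERTScore definitions, not code from this repository):

```python
import math

# Final-epoch values reported in the evaluation table.
translation_length = 1121.0
reference_length = 1221.0
precision, recall = 0.906, 0.9034

# BLEU's length ratio: candidate length over reference length.
length_ratio = translation_length / reference_length

# BLEU's brevity penalty: 1.0 when the candidate is at least as long as
# the reference, exp(1 - reference/candidate) when it is shorter.
if translation_length >= reference_length:
    brevity_penalty = 1.0
else:
    brevity_penalty = math.exp(1 - reference_length / translation_length)

# BERTScore F1 is the harmonic mean of its precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(length_ratio, 4))     # matches the reported 0.9181
print(round(brevity_penalty, 4))  # matches the reported 0.9147
print(f1)                         # close to the reported 0.9046
```

The F1 check agrees with the reported 0.9046 only to about three decimal places, because the precision and recall inputs are themselves rounded.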
adapter_config.json
CHANGED

```diff
@@ -24,10 +24,10 @@
 "rank_pattern": {},
 "revision": null,
 "target_modules": [
-"v_proj",
-"q_proj",
 "k_proj",
-"
+"q_proj",
+"out_proj",
+"v_proj"
 ],
 "task_type": "SEQ_2_SEQ_LM",
 "trainable_token_indices": null,
```
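The adapter_config.json change replaces the target_modules list, so the LoRA updates are now attached to the k_proj, q_proj, out_proj and v_proj attention projections. As a rough, framework-free illustration of what such an adapter does to one of those projections (a toy sketch, not PEFT's implementation; the shapes and the alpha/r values here are invented): a LoRA layer computes y = W·x + (alpha/r)·B·A·x, with B zero-initialized so training starts exactly at the frozen base layer's behavior.

```python
def matvec(m, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def lora_forward(W, A, B, x, alpha=16, r=2):
    # y = W x + (alpha / r) * B (A x); W stays frozen, only A and B train.
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]

# Toy 2x2 base weight and a rank-2 adapter.
W = [[1.0, 0.0], [0.0, 1.0]]   # frozen base projection
A = [[0.1, 0.2], [0.3, 0.4]]   # r x in_features, randomly initialized
B = [[0.0, 0.0], [0.0, 0.0]]   # out_features x r, zero-initialized
x = [1.0, 2.0]

# With B at its zero init, the adapted layer matches the base layer.
print(lora_forward(W, A, B, x))  # [1.0, 2.0]
```

Only A and B are saved in adapter_model.safetensors, which is why the checkpoint above is ~2.4 MB rather than the size of the full base model.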
adapter_model.safetensors
CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:b4d010e93d82ab3964b3dc0ea462df6135d89950bb3bb20bd85ff5462f3339ec
 size 2372496
```
runs/Jul25_10-38-21_tardis/events.out.tfevents.1753432702.tardis.586390.0
ADDED

```diff
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:b53a1bfd3c574e39699a0056f94ff8bd626bafcfb82ce5afa21deb21b420ce3d
+size 19364
```
training_args.bin
CHANGED

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8d432f14048b72dc8fbc931660b79bd081b082c0ae64de9c7ef7ce3f9092c9f5
 size 5905
```
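adapter_model.safetensors, training_args.bin and the TensorBoard event file above are all stored as Git LFS pointers: three-line text stubs recording the spec version, the SHA-256 of the real blob, and its size in bytes, while the blob itself lives in LFS storage. A minimal sketch of how such a pointer is derived from a file's bytes:

```python
import hashlib

def lfs_pointer(data: bytes) -> str:
    # A Git LFS pointer records the hash and size of the actual blob,
    # which is kept outside the git object database.
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )

# Hypothetical payload, just to show the pointer shape.
print(lfs_pointer(b"example adapter weights"))
```

Changing a single byte of the tracked file changes the oid line, which is why each updated checkpoint shows up in the diff as a one-line `oid sha256:` change.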