Instructions to use floflodebilbao/Lora_LED_sum_challenge with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- PEFT
How to use floflodebilbao/Lora_LED_sum_challenge with PEFT:
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM

base_model = AutoModelForSeq2SeqLM.from_pretrained("allenai/led-base-16384")
model = PeftModel.from_pretrained(base_model, "floflodebilbao/Lora_LED_sum_challenge")

- Notebooks
- Google Colab
- Kaggle
End of training
- README.md +25 -25
- adapter_config.json +3 -3
- adapter_model.safetensors +1 -1
- runs/Jul24_11-53-52_tardis/events.out.tfevents.1753350834.tardis.447846.0 +3 -0
- runs/Jul24_12-04-52_tardis/events.out.tfevents.1753351494.tardis.450378.0 +3 -0
- runs/Jul24_12-05-23_tardis/events.out.tfevents.1753351525.tardis.450578.0 +3 -0
- runs/Jul24_12-11-08_tardis/events.out.tfevents.1753351870.tardis.451748.0 +3 -0
- training_args.bin +1 -1
README.md
CHANGED
@@ -22,21 +22,21 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 4.
-- Rouge1: 0.
-- Rouge2: 0.
-- Rougel: 0.
-- Rougelsum: 0.
-- Gen Len:
-- Bleu: 0.
-- Precisions: 0.
-- Brevity Penalty: 0.
-- Length Ratio: 0.
-- Translation Length:
+- Loss: 4.1201
+- Rouge1: 0.2826
+- Rouge2: 0.1016
+- Rougel: 0.2235
+- Rougelsum: 0.2227
+- Gen Len: 27.48
+- Bleu: 0.0515
+- Precisions: 0.1044
+- Brevity Penalty: 0.8659
+- Length Ratio: 0.8742
+- Translation Length: 1056.0
 - Reference Length: 1208.0
-- Precision: 0.
-- Recall: 0.
-- F1: 0.
+- Precision: 0.8808
+- Recall: 0.8739
+- F1: 0.8773
 - Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
 
 ## Model description
@@ -57,7 +57,7 @@ More information needed
 
 The following hyperparameters were used during training:
 - learning_rate: 0.001
-- train_batch_size:
+- train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
@@ -69,16 +69,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
 |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
-|
-|
-|
-|
-| 4.
-|
-|
-|
-| 3.
-| 3.
+| 8.708 | 1.0 | 13 | 6.7992 | 0.2058 | 0.0456 | 0.1594 | 0.159 | 31.68 | 0.0216 | 0.0515 | 1.0 | 1.0737 | 1297.0 | 1208.0 | 0.8535 | 0.8564 | 0.8549 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 5.8473 | 2.0 | 26 | 4.8979 | 0.2553 | 0.0817 | 0.1969 | 0.1972 | 27.54 | 0.035 | 0.0853 | 0.8901 | 0.8957 | 1082.0 | 1208.0 | 0.8761 | 0.8691 | 0.8725 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 4.6072 | 3.0 | 39 | 4.2460 | 0.269 | 0.0781 | 0.2078 | 0.2084 | 28.32 | 0.0414 | 0.0898 | 0.865 | 0.8733 | 1055.0 | 1208.0 | 0.8742 | 0.8722 | 0.8731 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 4.2016 | 4.0 | 52 | 4.1384 | 0.2709 | 0.0894 | 0.2139 | 0.2134 | 27.4 | 0.0495 | 0.0998 | 0.8753 | 0.8825 | 1066.0 | 1208.0 | 0.8792 | 0.8721 | 0.8756 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 4.0062 | 5.0 | 65 | 4.0907 | 0.2755 | 0.0825 | 0.2128 | 0.2125 | 28.64 | 0.0437 | 0.0921 | 0.901 | 0.9056 | 1094.0 | 1208.0 | 0.8733 | 0.8725 | 0.8729 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.892 | 6.0 | 78 | 4.0992 | 0.2806 | 0.0934 | 0.2199 | 0.2191 | 28.22 | 0.0388 | 0.0952 | 0.891 | 0.8965 | 1083.0 | 1208.0 | 0.8797 | 0.8754 | 0.8775 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.8119 | 7.0 | 91 | 4.0950 | 0.2985 | 0.0916 | 0.2268 | 0.2264 | 28.16 | 0.0284 | 0.0947 | 0.891 | 0.8965 | 1083.0 | 1208.0 | 0.8812 | 0.8763 | 0.8787 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.7427 | 8.0 | 104 | 4.1031 | 0.2942 | 0.1025 | 0.2356 | 0.2344 | 27.2 | 0.0526 | 0.1111 | 0.8394 | 0.851 | 1028.0 | 1208.0 | 0.8819 | 0.8758 | 0.8788 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.6902 | 9.0 | 117 | 4.1120 | 0.2981 | 0.1028 | 0.2323 | 0.232 | 28.08 | 0.0487 | 0.1036 | 0.8836 | 0.8899 | 1075.0 | 1208.0 | 0.8782 | 0.8755 | 0.8768 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
+| 3.6548 | 10.0 | 130 | 4.1201 | 0.2826 | 0.1016 | 0.2235 | 0.2227 | 27.48 | 0.0515 | 0.1044 | 0.8659 | 0.8742 | 1056.0 | 1208.0 | 0.8808 | 0.8739 | 0.8773 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
 
 
 ### Framework versions
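A quick consistency check on the README table, as a minimal sketch. It assumes the Bleu-related columns follow the standard BLEU definitions: length ratio = translation length / reference length, and brevity penalty = exp(1 - reference length / translation length) when the output is shorter than the reference, else 1. The names `brevity_penalty`, `ROWS`, `mismatches` and `best_epoch` are illustrative and not part of the repository; the numbers are copied from the table rows.

```python
import math


def brevity_penalty(translation_len: float, reference_len: float) -> float:
    """Standard BLEU brevity penalty (assumed to be what the table reports)."""
    if translation_len >= reference_len:
        return 1.0
    return math.exp(1.0 - reference_len / translation_len)


REFERENCE_LENGTH = 1208.0

# (epoch, validation_loss, translation_length, brevity_penalty, length_ratio)
ROWS = [
    (1, 6.7992, 1297.0, 1.0, 1.0737),
    (2, 4.8979, 1082.0, 0.8901, 0.8957),
    (3, 4.2460, 1055.0, 0.865, 0.8733),
    (4, 4.1384, 1066.0, 0.8753, 0.8825),
    (5, 4.0907, 1094.0, 0.901, 0.9056),
    (6, 4.0992, 1083.0, 0.891, 0.8965),
    (7, 4.0950, 1083.0, 0.891, 0.8965),
    (8, 4.1031, 1028.0, 0.8394, 0.851),
    (9, 4.1120, 1075.0, 0.8836, 0.8899),
    (10, 4.1201, 1056.0, 0.8659, 0.8742),
]

# Rows whose reported values differ from the recomputed ones by more than
# the rounding tolerance of a four-decimal table.
mismatches = [
    epoch
    for epoch, _loss, trans_len, bp, ratio in ROWS
    if abs(brevity_penalty(trans_len, REFERENCE_LENGTH) - bp) > 1e-4
    or abs(trans_len / REFERENCE_LENGTH - ratio) > 1e-4
]

# Epoch with the lowest validation loss.
best_epoch = min(ROWS, key=lambda row: row[1])[0]

print("rows that do not reproduce:", mismatches)
print("lowest validation loss at epoch:", best_epoch)
```

Under these assumptions every row reproduces to four decimals. The lowest validation loss is at epoch 5 (4.0907), whereas the summary list at the top of the README reports the epoch 10 values.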
adapter_config.json
CHANGED
@@ -24,10 +24,10 @@
 "rank_pattern": {},
 "revision": null,
 "target_modules": [
-"
-"q_proj",
+"v_proj",
 "out_proj",
-"
+"k_proj",
+"q_proj"
 ],
 "task_type": "SEQ_2_SEQ_LM",
 "trainable_token_indices": null,
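The `target_modules` hunk may be a reordering only. As far as I know, PEFT holds `target_modules` as a set internally, so the list order written to `adapter_config.json` can change between saves without changing the adapter. The removed entries are truncated in this view, so that cannot be confirmed here. The sketch below shows an order-insensitive comparison; `REORDERED_FRAGMENT` is a hypothetical shuffle of the new list, not the old file contents, and the helper name is illustrative.

```python
import json


def target_module_set(adapter_config_text: str) -> frozenset:
    """Return target_modules as a set so that list order is ignored."""
    return frozenset(json.loads(adapter_config_text)["target_modules"])


# Fragment with the values from the new side of the diff.
NEW_FRAGMENT = """
{
  "target_modules": ["v_proj", "out_proj", "k_proj", "q_proj"],
  "task_type": "SEQ_2_SEQ_LM"
}
"""

# Hypothetical reordering of the same four names (NOT the old file).
REORDERED_FRAGMENT = """
{
  "target_modules": ["q_proj", "k_proj", "v_proj", "out_proj"],
  "task_type": "SEQ_2_SEQ_LM"
}
"""

same_modules = target_module_set(NEW_FRAGMENT) == target_module_set(REORDERED_FRAGMENT)
print("same target modules regardless of order:", same_modules)
```

Comparing the two real revisions of `adapter_config.json` this way would show whether the set of adapted projection layers actually changed.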
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:534465ae1ca27c2a522d25577df098e585384cd46e787988aafc9c6fba8a771d
 size 2372496
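The binary files in this commit are stored as Git LFS pointers: three `key value` lines giving the spec version, the SHA-256 object id, and the size in bytes. The following is a minimal sketch of reading such a pointer and checking a downloaded file against it. The function names and the idea of a local copy of the file are assumptions for illustration; only the pointer text comes from the diff.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into version, hash algorithm, digest, size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    sha = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            sha.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and sha.hexdigest() == pointer["digest"]


# New-side pointer for adapter_model.safetensors, copied from the diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:534465ae1ca27c2a522d25577df098e585384cd46e787988aafc9c6fba8a771d
size 2372496
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["hash_algo"], pointer["size"])
```

The size is unchanged at 2372496 bytes while the object id differs, which is what one would expect when the adapter weights are retrained with the same shapes.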
runs/Jul24_11-53-52_tardis/events.out.tfevents.1753350834.tardis.447846.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:3da05fff0636f87e4cb85c4734e52cca61e0a801a102e849c9331ff7512e0cf2
+size 5600
runs/Jul24_12-04-52_tardis/events.out.tfevents.1753351494.tardis.450378.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:d5f532ef59eda31729207441b79674d782d07e8b6023405e2ec4100e7bad3903
+size 5601
runs/Jul24_12-05-23_tardis/events.out.tfevents.1753351525.tardis.450578.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:232122de5a20d15dad1203571627e340a256bc1035f9c6a6f0e5fa2488b83b8f
+size 7146
runs/Jul24_12-11-08_tardis/events.out.tfevents.1753351870.tardis.451748.0
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:bed1f73fd107931887cd4e183384dc3af43cd8d954fdf55534cc8af92823e763
+size 19367
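The four added event files appear to follow the usual TensorBoard naming scheme, `events.out.tfevents.<unix_time>.<hostname>.<pid>.<suffix>`. Assuming that layout, and a hostname without dots, the start time of each run can be recovered from the file name alone. The helper below is an illustrative sketch, not something shipped with the repository.

```python
from datetime import datetime, timezone


def parse_event_filename(path: str) -> dict:
    """Split a TensorBoard event file name into start time, host, and pid.

    Assumes events.out.tfevents.<unix_time>.<hostname>.<pid>.<suffix>
    and a hostname that contains no dots.
    """
    name = path.rsplit("/", 1)[-1]
    prefix = "events.out.tfevents."
    if not name.startswith(prefix):
        raise ValueError(f"not a TensorBoard event file name: {name!r}")
    timestamp, hostname, pid, _suffix = name[len(prefix):].split(".", 3)
    return {
        "started_utc": datetime.fromtimestamp(int(timestamp), tz=timezone.utc),
        "hostname": hostname,
        "pid": int(pid),
    }


run = parse_event_filename(
    "runs/Jul24_11-53-52_tardis/events.out.tfevents.1753350834.tardis.447846.0"
)
print(run["started_utc"].isoformat(), run["hostname"], run["pid"])
```

For the first file this gives 2025-07-24 09:53:54 UTC on host `tardis`. The run directory name reads 11-53-52, which would be consistent with a machine clock two hours ahead of UTC.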
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:993a07156b4546a639efcb5894ce14068d90da7fce3e8d6da188663ca08c17d4
 size 5905