	---
	library_name: peft
	license: apache-2.0
	base_model: allenai/led-base-16384
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	- bleu
	- precision
	- recall
	- f1
	model-index:
	- name: Lora_LED_sum_approach
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Lora_LED_sum_approach

	This model is a PEFT adapter fine-tuned from [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384) on an unknown dataset.
	It achieves the following results on the evaluation set:
	- Loss: 3.5646
	- Rouge1: 0.4521
	- Rouge2: 0.2422
	- Rougel: 0.3904
	- Rougelsum: 0.3905
	- Gen Len: 29.4
	- Bleu: 0.1533
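	- Precisions: 0.2152 (BLEU component)
	- Brevity Penalty: 0.8831 (BLEU component)
	- Length Ratio: 0.8894 (BLEU component)
	- Translation Length: 1086.0 (BLEU component)
	- Reference Length: 1221.0 (BLEU component)
	- Precision: 0.9043 (BERTScore)
	- Recall: 0.9002 (BERTScore)
	- F1: 0.9021 (BERTScore)
	- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) (BERTScore configuration)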
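The length-related figures above are consistent with one another. As a minimal sketch, assuming they follow the standard BLEU definition (length ratio = translation length / reference length, brevity penalty = exp(1 - 1/ratio) when the output is shorter than the reference), they can be recomputed from the two reported lengths. The function name `brevity_penalty` is illustrative and not taken from the training code.

```python
import math


def brevity_penalty(translation_length: float, reference_length: float) -> float:
    """Standard BLEU brevity penalty (assumed definition, not read from the training script)."""
    if translation_length > reference_length:
        return 1.0
    # Outputs shorter than the reference are penalised exponentially.
    return math.exp(1.0 - reference_length / translation_length)


# Values reported for the final evaluation in this card.
translation_length = 1086.0
reference_length = 1221.0

length_ratio = translation_length / reference_length
bp = brevity_penalty(translation_length, reference_length)

print(round(length_ratio, 4))  # 0.8894
print(round(bp, 4))            # 0.8831
```

Both values match the reported Length Ratio and Brevity Penalty, which suggests the generated summaries are on average about 11% shorter than the references.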

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.001
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 16
	- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
	- lr_scheduler_type: linear
	- num_epochs: 10
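To make the batch size and schedule concrete, the sketch below recomputes the effective batch size and the learning rate implied by a linear schedule. It assumes a single device, zero warmup steps (no warmup is listed above), and 70 total optimizer steps, the final step count in the training results table. The helper `linear_lr` is a hypothetical plain-Python stand-in, not the Trainer's scheduler.

```python
def linear_lr(step: int, base_lr: float = 1e-3, total_steps: int = 70,
              warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps=0 and total_steps=70 are assumptions, not values stated
    in the hyperparameter list.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


train_batch_size = 1
gradient_accumulation_steps = 16
# Assuming one device: samples contributing to each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps

print(total_train_batch_size)  # 16
print(linear_lr(0))            # 0.001
print(linear_lr(35))           # 0.0005
print(linear_lr(70))           # 0.0
```

Under these assumptions the learning rate has halved by the end of epoch 5 and reaches zero at the final step.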

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
	| 8.0757 | 1.0 | 7 | 7.6798 | 0.3128 | 0.1085 | 0.253 | 0.2533 | 32.0 | 0.0733 | 0.1062 | 1.0 | 1.0663 | 1302.0 | 1221.0 | 0.8685 | 0.8728 | 0.8706 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 6.5609 | 2.0 | 14 | 5.7642 | 0.4165 | 0.2088 | 0.3627 | 0.3626 | 30.64 | 0.1358 | 0.1742 | 1.0 | 1.036 | 1265.0 | 1221.0 | 0.8922 | 0.8861 | 0.889 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 4.9145 | 3.0 | 21 | 4.4340 | 0.4234 | 0.2265 | 0.3669 | 0.3685 | 25.84 | 0.1246 | 0.2092 | 0.765 | 0.7887 | 963.0 | 1221.0 | 0.9057 | 0.894 | 0.8996 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 4.0682 | 4.0 | 28 | 3.9241 | 0.4454 | 0.2452 | 0.3952 | 0.3971 | 27.26 | 0.1446 | 0.2209 | 0.8115 | 0.8272 | 1010.0 | 1221.0 | 0.9059 | 0.8983 | 0.9019 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 3.6834 | 5.0 | 35 | 3.7361 | 0.4521 | 0.237 | 0.3828 | 0.3837 | 27.58 | 0.1433 | 0.2137 | 0.8327 | 0.8452 | 1032.0 | 1221.0 | 0.9031 | 0.8973 | 0.9 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 3.5042 | 6.0 | 42 | 3.6285 | 0.4567 | 0.247 | 0.3901 | 0.3908 | 27.86 | 0.1451 | 0.2184 | 0.8336 | 0.846 | 1033.0 | 1221.0 | 0.9067 | 0.9003 | 0.9033 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 3.4173 | 7.0 | 49 | 3.5881 | 0.4458 | 0.2389 | 0.3839 | 0.3852 | 27.16 | 0.1439 | 0.2226 | 0.7929 | 0.8116 | 991.0 | 1221.0 | 0.9056 | 0.8973 | 0.9013 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 3.3572 | 8.0 | 56 | 3.5698 | 0.4514 | 0.2331 | 0.3836 | 0.3862 | 29.12 | 0.147 | 0.2081 | 0.884 | 0.8903 | 1087.0 | 1221.0 | 0.9026 | 0.8994 | 0.9009 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 3.3165 | 9.0 | 63 | 3.5700 | 0.4592 | 0.2422 | 0.3954 | 0.3957 | 29.28 | 0.1502 | 0.2113 | 0.8922 | 0.8976 | 1096.0 | 1221.0 | 0.9056 | 0.9012 | 0.9033 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
	| 3.3094 | 10.0 | 70 | 3.5646 | 0.4521 | 0.2422 | 0.3904 | 0.3905 | 29.4 | 0.1533 | 0.2152 | 0.8831 | 0.8894 | 1086.0 | 1221.0 | 0.9043 | 0.9002 | 0.9021 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
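The table shows that the best epoch depends on which metric is used for selection. The sketch below copies the Epoch, Validation Loss, and Rouge1 columns into a plain list and picks the best row under each criterion. The variable names are illustrative, and the card does not say which criterion, if any, was used to choose the released checkpoint.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
rows = [
    (1, 7.6798, 0.3128),
    (2, 5.7642, 0.4165),
    (3, 4.4340, 0.4234),
    (4, 3.9241, 0.4454),
    (5, 3.7361, 0.4521),
    (6, 3.6285, 0.4567),
    (7, 3.5881, 0.4458),
    (8, 3.5698, 0.4514),
    (9, 3.5700, 0.4592),
    (10, 3.5646, 0.4521),
]

# Lower validation loss is better; higher ROUGE-1 is better.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print(best_by_loss)    # (10, 3.5646, 0.4521)
print(best_by_rouge1)  # (9, 3.57, 0.4592)
```

Validation loss is lowest at epoch 10, the checkpoint whose metrics are reported at the top of this card, while ROUGE-1 peaks one epoch earlier. The differences across epochs 8 to 10 are small.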


	### Framework versions

	- PEFT 0.15.2
	- Transformers 4.53.1
	- Pytorch 2.7.0+cu126
	- Datasets 3.6.0
	- Tokenizers 0.21.1