---
library_name: peft
license: apache-2.0
base_model: allenai/led-base-16384
tags:
- generated_from_trainer
metrics:
- rouge
- bleu
- precision
- recall
- f1
model-index:
- name: LoRA_LED_all_aspects
  results: []
---
# LoRA_LED_all_aspects

This model is a LoRA adapter (trained with PEFT) for [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384), fine-tuned on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 3.2585
- Rouge1: 0.2961
- Rouge2: 0.1042
- Rougel: 0.234
- Rougelsum: 0.2333
- Gen Len: 29.3933
- Bleu: 0.0577
- Precisions: 0.1023
- Brevity Penalty: 0.9031
- Length Ratio: 0.9075
- Translation Length: 3268.0
- Reference Length: 3601.0
- Precision: 0.8752 (BERTScore)
- Recall: 0.8737 (BERTScore)
- F1: 0.8744 (BERTScore)
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
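
The metric names above match the output fields of the Hugging Face `evaluate` implementations: Bleu, Precisions, Brevity Penalty, Length Ratio, Translation Length, and Reference Length come from the `bleu` metric, while Precision, Recall, F1, and Hashcode come from `bertscore`. The sketch below shows how such numbers can be computed; the predictions and references are placeholders, since the evaluation dataset is not documented in this card.

```python
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

# Placeholders: the actual evaluation dataset is not recorded in this card.
predictions = ["a model-generated summary"]
references = ["the reference summary"]

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))
# lang="en" selects roberta-large, consistent with the BERTScore hashcode above.
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```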
## Model description

More information needed

## Intended uses & limitations

More information needed
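
Pending a fuller description, the sketch below shows one plausible way to run this adapter for long-input generation. The adapter id is a placeholder for wherever these weights are stored, and the global-attention setup follows the usual LED convention of global attention on the first token.

```python
import torch
from peft import PeftModel
from transformers import AutoTokenizer, LEDForConditionalGeneration

base_id = "allenai/led-base-16384"
adapter_id = "LoRA_LED_all_aspects"  # placeholder: local path or Hub id of this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = LEDForConditionalGeneration.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

text = "A long document to summarize ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)

# LED convention: put global attention on the first (BOS) token.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

with torch.no_grad():
    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        global_attention_mask=global_attention_mask,
        max_new_tokens=64,
    )
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```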
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
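
For orientation, here is a minimal training sketch consistent with the hyperparameters above. The dummy dataset, the preprocessing lengths, and all LoRA settings (`r`, `lora_alpha`, `target_modules`, dropout) are assumptions; the card does not record them.

```python
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    LEDForConditionalGeneration,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_id = "allenai/led-base-16384"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = LEDForConditionalGeneration.from_pretrained(base_id)

# Assumed LoRA settings; the card does not record r, alpha, or target modules.
lora_config = LoraConfig(
    task_type="SEQ_2_SEQ_LM",
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
)
model = get_peft_model(model, lora_config)

def preprocess(batch):
    model_inputs = tokenizer(batch["document"], truncation=True, max_length=4096)
    labels = tokenizer(text_target=batch["summary"], truncation=True, max_length=64)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# Tiny dummy data; the real training/evaluation dataset is not documented here.
raw = Dataset.from_dict({
    "document": ["A long document ..."] * 8,
    "summary": ["A short summary."] * 8,
})
tokenized = raw.map(preprocess, batched=True, remove_columns=raw.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="LoRA_LED_all_aspects",
    learning_rate=2e-3,             # 0.002, as above
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=5,
    fp16=True,                      # Native AMP mixed precision (requires a GPU)
    eval_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,        # placeholder for the undocumented dataset
    eval_dataset=tokenized,         # placeholder
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```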
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:---------------------------------------------------------:|
| 5.0003 | 1.0 | 38 | 3.3632 | 0.2751 | 0.1031 | 0.2271 | 0.2265 | 26.0267 | 0.0601 | 0.113 | 0.7896 | 0.8089 | 2913.0 | 3601.0 | 0.8795 | 0.8705 | 0.8749 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.4616 | 2.0 | 76 | 3.2445 | 0.3023 | 0.104 | 0.237 | 0.2372 | 30.3467 | 0.065 | 0.1086 | 0.9019 | 0.9064 | 3264.0 | 3601.0 | 0.872 | 0.874 | 0.8729 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.1933 | 3.0 | 114 | 3.2125 | 0.282 | 0.0998 | 0.2276 | 0.2264 | 29.2733 | 0.0576 | 0.1018 | 0.8859 | 0.892 | 3212.0 | 3601.0 | 0.8732 | 0.8729 | 0.873 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.0156 | 4.0 | 152 | 3.2279 | 0.2848 | 0.1012 | 0.2294 | 0.2287 | 28.92 | 0.0611 | 0.1059 | 0.8862 | 0.8923 | 3213.0 | 3601.0 | 0.8769 | 0.8742 | 0.8755 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.874 | 5.0 | 190 | 3.2585 | 0.2961 | 0.1042 | 0.234 | 0.2333 | 29.3933 | 0.0577 | 0.1023 | 0.9031 | 0.9075 | 3268.0 | 3601.0 | 0.8752 | 0.8737 | 0.8744 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
### Framework versions

- PEFT 0.15.2
- Transformers 4.53.1
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1