emilstabil/DanSumT5-largeV_26719

text2text-generation

Generated from Trainer

	---
	license: apache-2.0
	base_model: Danish-summarisation/DanSumT5-large
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: DanSumT5-largeV_26719
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# DanSumT5-largeV_26719

	This model is a fine-tuned version of [Danish-summarisation/DanSumT5-large](https://huggingface.co/Danish-summarisation/DanSumT5-large). The training dataset is not specified: the Trainer recorded its name as `None`.
	After the final epoch (step 4000), it achieves the following results on the evaluation set:
	- Loss: 2.2976
	- Rouge1: 32.2799
	- Rouge2: 8.6728
	- RougeL: 18.8723
	- RougeLsum: 29.7852
	- Gen Len: 126.28
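
The card does not include usage code, so the following is only a minimal sketch of how a checkpoint like this is typically loaded for summarisation with the `transformers` library. Several things here are assumptions rather than facts from the card: that the repository id `emilstabil/DanSumT5-largeV_26719` is publicly downloadable and ships tokenizer files, that no task prefix is required, and that the generation settings (`MAX_INPUT_TOKENS`, `MAX_NEW_TOKENS`, `NUM_BEAMS`) are reasonable. The 128-token output cap is a guess based on the reported generation length of about 126 tokens. The helper name `summarize` is hypothetical.

```python
# Assumption: repository id taken from the page header, not verified as public.
MODEL_ID = "emilstabil/DanSumT5-largeV_26719"

# Assumed generation settings; the card does not state the values used in evaluation.
MAX_INPUT_TOKENS = 512
MAX_NEW_TOKENS = 128
NUM_BEAMS = 4


def summarize(text: str) -> str:
    """Return a summary of a Danish text using the fine-tuned checkpoint."""
    # Imported lazily so that defining this helper does not require the library.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long articles to the assumed encoder budget.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(
        **inputs,
        max_new_tokens=MAX_NEW_TOKENS,
        num_beams=NUM_BEAMS,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` downloads the weights on first use. For repeated calls it would be better to load the tokenizer and model once and reuse them.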

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 1
	- eval_batch_size: 1
	- seed: 42
	- gradient_accumulation_steps: 4
	- total_train_batch_size: 4
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len |
	|:-------------:|:-----:|:----:|:---------------:|:-------:|:------:|:-------:|:---------:|:-------:|
	| No log | 1.0 | 200 | 2.5620 | 31.648 | 7.4069 | 17.9711 | 28.7951 | 126.32 |
	| No log | 2.0 | 400 | 2.4824 | 31.8545 | 8.094 | 18.6072 | 29.1646 | 126.77 |
	| 2.7655 | 3.0 | 600 | 2.4305 | 32.1209 | 8.5372 | 18.744 | 29.8788 | 125.03 |
	| 2.7655 | 4.0 | 800 | 2.3945 | 31.8225 | 8.739 | 18.5656 | 29.7696 | 125.63 |
	| 2.4368 | 5.0 | 1000 | 2.3685 | 31.9779 | 8.322 | 18.766 | 29.4834 | 125.32 |
	| 2.4368 | 6.0 | 1200 | 2.3522 | 31.4296 | 8.3578 | 18.9591 | 29.2204 | 125.11 |
	| 2.4368 | 7.0 | 1400 | 2.3364 | 31.5372 | 8.2997 | 18.9915 | 29.0248 | 123.38 |
	| 2.2645 | 8.0 | 1600 | 2.3250 | 31.9344 | 8.596 | 19.0022 | 29.4647 | 125.18 |
	| 2.2645 | 9.0 | 1800 | 2.3212 | 31.515 | 8.2166 | 18.7697 | 29.06 | 126.01 |
	| 2.134 | 10.0 | 2000 | 2.3117 | 32.0188 | 8.6934 | 19.1051 | 29.6682 | 125.4 |
	| 2.134 | 11.0 | 2200 | 2.3064 | 31.8417 | 8.7247 | 18.9249 | 29.5626 | 125.86 |
	| 2.134 | 12.0 | 2400 | 2.3062 | 32.2302 | 9.1081 | 19.3087 | 29.9162 | 126.24 |
	| 2.0467 | 13.0 | 2600 | 2.3032 | 31.6755 | 8.5093 | 18.8486 | 29.365 | 125.02 |
	| 2.0467 | 14.0 | 2800 | 2.3008 | 31.9478 | 8.8669 | 18.9299 | 29.504 | 126.2 |
	| 1.9931 | 15.0 | 3000 | 2.2980 | 31.8088 | 8.7506 | 19.1051 | 29.2949 | 126.0 |
	| 1.9931 | 16.0 | 3200 | 2.2982 | 32.175 | 8.8114 | 18.7002 | 29.6088 | 126.0 |
	| 1.9931 | 17.0 | 3400 | 2.2987 | 32.0016 | 8.7223 | 18.7814 | 29.6822 | 125.66 |
	| 1.949 | 18.0 | 3600 | 2.2974 | 32.0515 | 8.6141 | 18.7833 | 29.6024 | 126.31 |
	| 1.949 | 19.0 | 3800 | 2.2970 | 32.0716 | 8.6257 | 18.7301 | 29.4506 | 126.15 |
	| 1.9257 | 20.0 | 4000 | 2.2976 | 32.2799 | 8.6728 | 18.8723 | 29.7852 | 126.28 |
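
The headline results at the top of the card come from the final epoch, which is not the best epoch on every metric. The sketch below copies the validation columns from the table above and finds the best epoch per metric. The names `ROWS`, `COLUMNS`, and `best_epoch` are illustrative only, and nothing here says which checkpoint was actually saved.

```python
# (epoch, validation loss, ROUGE-1, ROUGE-2, ROUGE-L, ROUGE-Lsum), copied from the table.
ROWS = [
    (1, 2.5620, 31.648, 7.4069, 17.9711, 28.7951),
    (2, 2.4824, 31.8545, 8.094, 18.6072, 29.1646),
    (3, 2.4305, 32.1209, 8.5372, 18.744, 29.8788),
    (4, 2.3945, 31.8225, 8.739, 18.5656, 29.7696),
    (5, 2.3685, 31.9779, 8.322, 18.766, 29.4834),
    (6, 2.3522, 31.4296, 8.3578, 18.9591, 29.2204),
    (7, 2.3364, 31.5372, 8.2997, 18.9915, 29.0248),
    (8, 2.3250, 31.9344, 8.596, 19.0022, 29.4647),
    (9, 2.3212, 31.515, 8.2166, 18.7697, 29.06),
    (10, 2.3117, 32.0188, 8.6934, 19.1051, 29.6682),
    (11, 2.3064, 31.8417, 8.7247, 18.9249, 29.5626),
    (12, 2.3062, 32.2302, 9.1081, 19.3087, 29.9162),
    (13, 2.3032, 31.6755, 8.5093, 18.8486, 29.365),
    (14, 2.3008, 31.9478, 8.8669, 18.9299, 29.504),
    (15, 2.2980, 31.8088, 8.7506, 19.1051, 29.2949),
    (16, 2.2982, 32.175, 8.8114, 18.7002, 29.6088),
    (17, 2.2987, 32.0016, 8.7223, 18.7814, 29.6822),
    (18, 2.2974, 32.0515, 8.6141, 18.7833, 29.6024),
    (19, 2.2970, 32.0716, 8.6257, 18.7301, 29.4506),
    (20, 2.2976, 32.2799, 8.6728, 18.8723, 29.7852),
]

# Column index and whether a higher value is better.
COLUMNS = {
    "val_loss": (1, False),
    "rouge1": (2, True),
    "rouge2": (3, True),
    "rougeL": (4, True),
    "rougeLsum": (5, True),
}


def best_epoch(metric):
    """Return (epoch, value) of the best row for the given metric."""
    index, higher_is_better = COLUMNS[metric]
    pick = max if higher_is_better else min
    row = pick(ROWS, key=lambda r: r[index])
    return row[0], row[index]


for name in COLUMNS:
    epoch, value = best_epoch(name)
    print(f"{name}: best at epoch {epoch} ({value})")
```

Validation loss bottoms out at epoch 19 and ROUGE-1 peaks at epoch 20, but ROUGE-2, ROUGE-L, and ROUGE-Lsum all peak at epoch 12. The ROUGE differences between late epochs are small.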


	### Framework versions

	- Transformers 4.32.1
	- PyTorch 2.1.0
	- Datasets 2.12.0
	- Tokenizers 0.13.3
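
To reproduce results it can help to compare a local environment against the versions listed above. The following is a small standard-library sketch. `EXPECTED` and `version_mismatches` are hypothetical names, and it assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`.

```python
from importlib import metadata

# Versions listed in the card; "torch" is the PyPI distribution name for PyTorch.
EXPECTED = {
    "transformers": "4.32.1",
    "torch": "2.1.0",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def version_mismatches(expected=EXPECTED):
    """Return {package: (expected, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed at all
        # Ignore local build suffixes such as "+cu118" when comparing.
        if have is None or have.split("+")[0] != want:
            mismatches[name] = (want, have)
    return mismatches


if __name__ == "__main__":
    for name, (want, have) in version_mismatches().items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

An empty result means the installed versions match the card. Newer versions may well work, but that is not something the card confirms.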