raquelclemente/mt5-summarize-sum

text2text-generation

Generated from Trainer


	---
	license: apache-2.0
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: mt5-summarize-sum
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# mt5-summarize-sum

	This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on an unspecified dataset (the Trainer did not record a dataset name).
	It achieves the following results on the evaluation set:
	- Loss: 0.3984
	- Rouge1: 0.5736
	- Rouge2: 0.3783
	- Rougel: 0.4855
	- Rougelsum: 0.4844
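
For readers unfamiliar with the metric, the sketch below illustrates what a ROUGE-1 score measures: the unigram overlap between a reference summary and a generated one, reported here as an F1 value. It is a simplified illustration, not the implementation used to produce the numbers above. It assumes whitespace tokenization and lowercasing only, whereas common ROUGE packages also apply their own tokenization and optional stemming. It also assumes the reported scores are F-measures, which this card does not state. The function name `rouge1_f1` and the example sentences are hypothetical.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1: unigram-overlap F1 between two texts."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a token counts at most as often as it occurs in both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical sentences, used only to exercise the function.
score = rouge1_f1("the cat sat on the mat", "the cat lay on the mat")
print(round(score, 4))
```

ROUGE-2 follows the same pattern with bigrams, and ROUGE-L replaces the overlap count with the length of the longest common subsequence.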

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed
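
Until this section is filled in, the following is a minimal loading sketch rather than documented usage. It assumes the checkpoint is available on the Hugging Face Hub under the id `raquelclemente/mt5-summarize-sum` (taken from the repository name) and that `transformers`, `torch`, and `sentencepiece` are installed. The helper name `summarize`, the 512-token input limit, the beam count, and the output length are all illustrative assumptions. The card does not say which input prefix, languages, or length limits were used during fine-tuning.

```python
MODEL_ID = "raquelclemente/mt5-summarize-sum"  # assumed Hub id, from the repository name


def summarize(text: str, max_new_tokens: int = 64) -> str:
    """Generate a summary for `text`; all generation settings are assumptions."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate long inputs; 512 tokens is an assumed limit, not one stated in this card.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize("...")` downloads the weights on first use. For repeated calls, loading the tokenizer and model once outside the function would avoid reloading them each time.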

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0005
	- train_batch_size: 2
	- eval_batch_size: 2
	- seed: 42
	- gradient_accumulation_steps: 16
	- total_train_batch_size: 32
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 90
	- num_epochs: 3
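
Two of the values above are derived from the others, and the sketch below works through them. The total batch size is the per-device batch size times the accumulation steps (times the device count, assumed to be 1 here because 2 × 16 already equals 32). The learning rate under a linear schedule ramps from 0 to 0.0005 over the 90 warmup steps and then decays linearly to 0. The total of 1830 optimizer steps is an estimate inferred from the training log (step 1800 at epoch 2.95 of 3), not a value stated in this card, and `linear_schedule_lr` is a hypothetical helper name.

```python
train_batch_size = 2
gradient_accumulation_steps = 16
num_devices = 1  # assumption: single device, consistent with 2 * 16 = 32

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_schedule_lr(step, base_lr=5e-4, warmup_steps=90, total_steps=1830):
    """Learning rate at an optimizer step under linear warmup and linear decay.

    total_steps=1830 is an estimate, not a documented value.
    """
    if step < warmup_steps:
        # Warmup: scale linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay: scale linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


print(total_train_batch_size)
for step in (0, 45, 90, 960, 1830):
    print(step, linear_schedule_lr(step))
```

Gradient accumulation keeps per-step memory at the level of batch size 2 while each parameter update still averages gradients over 32 examples.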

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum |
	|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|
	| 13.8551 | 0.16 | 100 | 5.4672 | 0.2389 | 0.0546 | 0.2119 | 0.2110 |
	| 1.0762 | 0.33 | 200 | 0.5982 | 0.3774 | 0.2199 | 0.3493 | 0.3470 |
	| 0.8077 | 0.49 | 300 | 0.4999 | 0.4929 | 0.3195 | 0.4349 | 0.4312 |
	| 0.7772 | 0.65 | 400 | 0.4652 | 0.4715 | 0.3296 | 0.4431 | 0.4409 |
	| 0.7771 | 0.82 | 500 | 0.4402 | 0.4881 | 0.3356 | 0.4433 | 0.4412 |
	| 0.713 | 0.98 | 600 | 0.4500 | 0.4990 | 0.3291 | 0.4550 | 0.4525 |
	| 0.65 | 1.15 | 700 | 0.4335 | 0.5522 | 0.3633 | 0.4930 | 0.4909 |
	| 0.7035 | 1.31 | 800 | 0.4278 | 0.5227 | 0.3470 | 0.4781 | 0.4772 |
	| 0.6818 | 1.47 | 900 | 0.4202 | 0.5325 | 0.3585 | 0.4759 | 0.4744 |
	| 0.6643 | 1.64 | 1000 | 0.4113 | 0.5326 | 0.3486 | 0.4678 | 0.4641 |
	| 0.6007 | 1.8 | 1100 | 0.4122 | 0.5152 | 0.3260 | 0.4572 | 0.4547 |
	| 0.5866 | 1.96 | 1200 | 0.4158 | 0.5538 | 0.3680 | 0.4910 | 0.4903 |
	| 0.5563 | 2.13 | 1300 | 0.4051 | 0.5433 | 0.3371 | 0.4685 | 0.4672 |
	| 0.5727 | 2.29 | 1400 | 0.4089 | 0.5447 | 0.3619 | 0.4711 | 0.4695 |
	| 0.5859 | 2.45 | 1500 | 0.4033 | 0.5464 | 0.3411 | 0.4688 | 0.4662 |
	| 0.5783 | 2.62 | 1600 | 0.3997 | 0.5667 | 0.3595 | 0.4825 | 0.4787 |
	| 0.5673 | 2.78 | 1700 | 0.3992 | 0.5759 | 0.3882 | 0.4911 | 0.4891 |
	| 0.57 | 2.95 | 1800 | 0.3984 | 0.5736 | 0.3783 | 0.4855 | 0.4844 |
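
The log above can be read in more than one way, and the short sketch below makes two readings explicit. It copies the step, epoch, validation loss, and Rouge1 columns from the table, then picks the checkpoint with the lowest validation loss and the one with the highest Rouge1. It also estimates the number of optimizer steps per epoch from the last row. Because epochs are rounded to two decimals, that figure is approximate, and the derived training-set size (steps per epoch times the total batch size of 32) is only a rough estimate, not a documented number.

```python
# (step, epoch, validation_loss, rouge1), copied from the training results table.
rows = [
    (100, 0.16, 5.4672, 0.2389), (200, 0.33, 0.5982, 0.3774),
    (300, 0.49, 0.4999, 0.4929), (400, 0.65, 0.4652, 0.4715),
    (500, 0.82, 0.4402, 0.4881), (600, 0.98, 0.4500, 0.4990),
    (700, 1.15, 0.4335, 0.5522), (800, 1.31, 0.4278, 0.5227),
    (900, 1.47, 0.4202, 0.5325), (1000, 1.64, 0.4113, 0.5326),
    (1100, 1.80, 0.4122, 0.5152), (1200, 1.96, 0.4158, 0.5538),
    (1300, 2.13, 0.4051, 0.5433), (1400, 2.29, 0.4089, 0.5447),
    (1500, 2.45, 0.4033, 0.5464), (1600, 2.62, 0.3997, 0.5667),
    (1700, 2.78, 0.3992, 0.5759), (1800, 2.95, 0.3984, 0.5736),
]

best_by_loss = min(rows, key=lambda r: r[2])    # lowest validation loss
best_by_rouge1 = max(rows, key=lambda r: r[3])  # highest Rouge1

# Rough estimates only: the epoch column is rounded to two decimals.
last_step, last_epoch = rows[-1][0], rows[-1][1]
steps_per_epoch = last_step / last_epoch
approx_train_examples = steps_per_epoch * 32  # 32 = total_train_batch_size

print("lowest validation loss at step", best_by_loss[0])
print("highest Rouge1 at step", best_by_rouge1[0])
print("approx. steps per epoch:", round(steps_per_epoch))
print("approx. training examples:", round(approx_train_examples))
```

The two criteria disagree slightly: validation loss is lowest at the final logged step (1800), which matches the headline results, while Rouge1 peaks one evaluation earlier at step 1700.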


	### Framework versions

	- Transformers 4.27.4
	- PyTorch 1.13.0
	- Datasets 2.1.0
	- Tokenizers 0.13.2