ChayanM/Swin-Bert_Mimic

vision-encoder-decoder

Generated from Trainer

	---
	tags:
	- generated_from_trainer
	metrics:
	- rouge
	model-index:
	- name: Swin-Bert_Mimic
	results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# Swin-Bert_Mimic

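	This vision-encoder-decoder model was fine-tuned with the Hugging Face Trainer. The auto-generated card names neither the base checkpoint nor the training dataset.
	After the final epoch (epoch 20), it achieves the following results on the evaluation set: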
	- Loss: 0.1025
	- Rouge1: 35.8104
	- Rouge2: 22.5915
	- Rougel: 34.3056
	- Rougelsum: 35.1416
	- Gen Len: 21.289

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 20
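
The hyperparameters above, together with the step counts in the training results table (7500 steps per epoch), are enough to sketch the learning-rate schedule and the approximate training-set size. The sketch below is a minimal illustration, not the training script: it assumes zero warmup steps, no gradient accumulation, and a single device, none of which the card states. The function names `linear_lr` and `approx_train_examples` are hypothetical.

```python
# Sketch of the linear learning-rate decay implied by the listed hyperparameters.
# Assumptions (not stated in the card): zero warmup steps, one optimizer step
# per batch (no gradient accumulation), and a single device.

BASE_LR = 5e-05          # learning_rate from the card
TRAIN_BATCH_SIZE = 8     # train_batch_size from the card
NUM_EPOCHS = 20          # num_epochs from the card
STEPS_PER_EPOCH = 7500   # from the training results table (step 7500 at epoch 1.0)

TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate after `step` optimizer steps under linear decay to zero."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


def approx_train_examples(steps_per_epoch: int = STEPS_PER_EPOCH,
                          batch_size: int = TRAIN_BATCH_SIZE) -> int:
    """Upper bound on training examples per epoch (the last batch may be partial)."""
    return steps_per_epoch * batch_size


if __name__ == "__main__":
    for epoch in (0, 5, 10, 20):
        print(f"epoch {epoch:2d}: lr = {linear_lr(epoch * STEPS_PER_EPOCH):.2e}")
    print("approx. training examples per epoch:", approx_train_examples())
```

Under these assumptions the run takes 150000 optimizer steps in total and sees at most 60000 training examples per epoch; a different warmup or accumulation setting would change both figures.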

	### Training results

	| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
	|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
	| 0.0677 | 1.0 | 7500 | 0.0742 | 34.0952 | 25.4639 | 34.0546 | 34.0407 | 14.412 |
	| 0.0621 | 2.0 | 15000 | 0.0686 | 37.767 | 26.9356 | 37.0596 | 37.4647 | 18.921 |
	| 0.0595 | 3.0 | 22500 | 0.0670 | 38.07 | 26.9203 | 37.1384 | 37.7633 | 22.422 |
	| 0.0536 | 4.0 | 30000 | 0.0655 | 38.064 | 27.0799 | 37.3483 | 37.7981 | 18.476 |
	| 0.0484 | 5.0 | 37500 | 0.0655 | 38.8419 | 27.551 | 37.992 | 38.573 | 19.552 |
	| 0.0436 | 6.0 | 45000 | 0.0672 | 39.2556 | 27.3445 | 38.1583 | 38.9199 | 19.699 |
	| 0.0394 | 7.0 | 52500 | 0.0680 | 38.6881 | 27.1077 | 37.6518 | 38.3678 | 19.322 |
	| 0.0355 | 8.0 | 60000 | 0.0697 | 39.2775 | 27.1638 | 38.1169 | 38.786 | 20.125 |
	| 0.0318 | 9.0 | 67500 | 0.0719 | 38.8973 | 27.0819 | 37.8138 | 38.4725 | 20.237 |
	| 0.0265 | 10.0 | 75000 | 0.0746 | 38.2854 | 26.3015 | 37.0627 | 37.8955 | 20.799 |
	| 0.0241 | 11.0 | 82500 | 0.0769 | 37.7814 | 25.9821 | 36.6626 | 37.3682 | 20.437 |
	| 0.0204 | 12.0 | 90000 | 0.0810 | 37.7945 | 26.012 | 36.5089 | 37.3188 | 20.945 |
	| 0.0172 | 13.0 | 97500 | 0.0846 | 37.5296 | 25.3082 | 36.2752 | 36.9433 | 20.397 |
	| 0.0147 | 14.0 | 105000 | 0.0876 | 36.6675 | 24.5001 | 35.264 | 36.034 | 22.044 |
	| 0.012 | 15.0 | 112500 | 0.0907 | 35.8928 | 23.4706 | 34.3812 | 35.2234 | 21.344 |
	| 0.0103 | 16.0 | 120000 | 0.0947 | 35.6648 | 22.8131 | 34.1013 | 35.0637 | 22.095 |
	| 0.0084 | 17.0 | 127500 | 0.0971 | 35.7702 | 22.9984 | 34.2882 | 35.1362 | 21.501 |
	| 0.0068 | 18.0 | 135000 | 0.0996 | 35.4212 | 22.3513 | 33.9646 | 34.8255 | 22.152 |
	| 0.0058 | 19.0 | 142500 | 0.1019 | 35.9704 | 23.1195 | 34.4672 | 35.3553 | 21.404 |
	| 0.0048 | 20.0 | 150000 | 0.1025 | 35.8104 | 22.5915 | 34.3056 | 35.1416 | 21.289 |
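
In the table, validation loss reaches its minimum of 0.0655 at epochs 4 and 5 and then climbs to 0.1025 by epoch 20, while training loss keeps falling from 0.0677 to 0.0048. That pattern is consistent with overfitting, so the final-epoch numbers reported at the top of the card are probably not the best this run produced. The sketch below picks the best epoch from a subset of the table's columns; the helper names are hypothetical, and whether a checkpoint from those epochs was actually saved is not stated in the card.

```python
# Rows copied from the training results table:
# (epoch, training_loss, validation_loss, rouge1, rouge2)
RESULTS = [
    (1, 0.0677, 0.0742, 34.0952, 25.4639),
    (2, 0.0621, 0.0686, 37.767, 26.9356),
    (3, 0.0595, 0.0670, 38.07, 26.9203),
    (4, 0.0536, 0.0655, 38.064, 27.0799),
    (5, 0.0484, 0.0655, 38.8419, 27.551),
    (6, 0.0436, 0.0672, 39.2556, 27.3445),
    (7, 0.0394, 0.0680, 38.6881, 27.1077),
    (8, 0.0355, 0.0697, 39.2775, 27.1638),
    (9, 0.0318, 0.0719, 38.8973, 27.0819),
    (10, 0.0265, 0.0746, 38.2854, 26.3015),
    (11, 0.0241, 0.0769, 37.7814, 25.9821),
    (12, 0.0204, 0.0810, 37.7945, 26.012),
    (13, 0.0172, 0.0846, 37.5296, 25.3082),
    (14, 0.0147, 0.0876, 36.6675, 24.5001),
    (15, 0.012, 0.0907, 35.8928, 23.4706),
    (16, 0.0103, 0.0947, 35.6648, 22.8131),
    (17, 0.0084, 0.0971, 35.7702, 22.9984),
    (18, 0.0068, 0.0996, 35.4212, 22.3513),
    (19, 0.0058, 0.1019, 35.9704, 23.1195),
    (20, 0.0048, 0.1025, 35.8104, 22.5915),
]


def best_by_validation_loss(rows=RESULTS):
    """Row with the lowest validation loss (earliest epoch wins a tie)."""
    return min(rows, key=lambda row: row[2])


def best_by_rouge1(rows=RESULTS):
    """Row with the highest Rouge1 score (earliest epoch wins a tie)."""
    return max(rows, key=lambda row: row[3])


if __name__ == "__main__":
    loss_row = best_by_validation_loss()
    rouge_row = best_by_rouge1()
    final_row = RESULTS[-1]
    print(f"lowest validation loss: {loss_row[2]} at epoch {loss_row[0]}")
    print(f"highest Rouge1: {rouge_row[3]} at epoch {rouge_row[0]}")
    print(f"Rouge1 lost by training to epoch {final_row[0]}: "
          f"{rouge_row[3] - final_row[3]:.4f}")
```

If the run were repeated, options such as early stopping or loading the best checkpoint at the end of training would be a natural way to act on this; the card does not say whether either was used.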


	### Framework versions

	- Transformers 4.37.1
	- Pytorch 1.13.1+cu117
	- Datasets 2.15.0
	- Tokenizers 0.15.1
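
To reproduce results it can help to check a local environment against the versions listed above. The following is a minimal sketch using only the standard library; it assumes the usual pip distribution names (`transformers`, `torch`, `datasets`, `tokenizers`), and `check_versions` is a hypothetical helper, not part of any of these libraries.

```python
from importlib import metadata

# Versions listed in the card, keyed by the assumed pip distribution names.
PINNED = {
    "transformers": "4.37.1",
    "torch": "1.13.1+cu117",
    "datasets": "2.15.0",
    "tokenizers": "0.15.1",
}


def check_versions(pinned=PINNED):
    """Map each package to (installed_version_or_None, matches_pinned_version)."""
    report = {}
    for package, wanted in pinned.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (installed, matches) in check_versions().items():
        status = "ok" if matches else "differs"
        print(f"{package}: wanted {PINNED[package]}, found {installed} ({status})")
```

A mismatch does not necessarily mean the model will fail to load; it only flags a difference from the environment recorded in the card.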