	---
	tags:
	- generated_from_trainer
	datasets:
	- generator
	model-index:
	- name: bert-concat-2
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bert-concat-2

	This model was fine-tuned with the Hugging Face `Trainer` on the generator dataset. The base checkpoint was not recorded when the card was generated.
	It achieves the following results on the evaluation set:
	- Loss: 5.7060
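For a rough sense of scale, a cross-entropy loss can be converted to a perplexity by exponentiating it. The sketch below assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state; the function name `loss_to_perplexity` is illustrative only.

```python
import math


def loss_to_perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


eval_loss = 5.7060  # evaluation loss reported in this card
perplexity = loss_to_perplexity(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")
```

If the objective was masked language modeling, this figure would cover only the masked positions, so it is not directly comparable to the perplexity of a left-to-right language model.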

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 0.0005
	- train_batch_size: 64
	- eval_batch_size: 64
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- lr_scheduler_warmup_steps: 1000
	- num_epochs: 20
	- mixed_precision_training: Native AMP
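The learning-rate settings above (peak 0.0005, 1000 warmup steps, cosine decay) can be visualized with a small standalone function. This is a minimal sketch of linear warmup followed by half-cosine decay, not code taken from the training run. The `total_steps` default of 38,400 is an estimate derived from the results table (about 1,920 steps per epoch over 20 epochs) and is not stated in the card.

```python
import math


def cosine_lr_with_warmup(step, base_lr=5e-4, warmup_steps=1000, total_steps=38_400):
    """Learning rate at `step`: linear warmup, then half-cosine decay to zero.

    total_steps is an assumed value (estimated, not given in the card).
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup period.
        return base_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup schedule that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


for step in (0, 500, 1000, 19_700, 38_400):
    print(step, f"{cosine_lr_with_warmup(step):.6f}")
```

Under these assumptions the rate peaks at the end of warmup, falls to half its peak midway through the decay phase, and reaches zero at the last step.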

	### Training results

	| Training Loss | Epoch | Step  | Validation Loss |
	|:-------------:|:-----:|:-----:|:---------------:|
	| 6.6866 | 0.52 | 1000 | 6.2709 |
	| 6.2315 | 1.04 | 2000 | 6.2177 |
	| 6.1818 | 1.56 | 3000 | 6.1895 |
	| 6.1511 | 2.08 | 4000 | 6.1559 |
	| 6.0984 | 2.6 | 5000 | 6.1185 |
	| 6.0611 | 3.12 | 6000 | 6.0668 |
	| 6.0114 | 3.65 | 7000 | 6.0361 |
	| 5.9679 | 4.17 | 8000 | 6.0160 |
	| 5.9272 | 4.69 | 9000 | 5.9731 |
	| 5.8904 | 5.21 | 10000 | 5.9424 |
	| 5.8557 | 5.73 | 11000 | 5.9190 |
	| 5.8237 | 6.25 | 12000 | 5.9002 |
	| 5.8008 | 6.77 | 13000 | 5.8787 |
	| 5.7785 | 7.29 | 14000 | 5.8644 |
	| 5.7569 | 7.81 | 15000 | 5.8534 |
	| 5.7305 | 8.33 | 16000 | 5.8429 |
	| 5.7187 | 8.85 | 17000 | 5.8283 |
	| 5.699 | 9.38 | 18000 | 5.8124 |
	| 5.6737 | 9.9 | 19000 | 5.8055 |
	| 5.648 | 10.42 | 20000 | 5.7945 |
	| 5.641 | 10.94 | 21000 | 5.7869 |
	| 5.613 | 11.46 | 22000 | 5.7700 |
	| 5.6078 | 11.98 | 23000 | 5.7659 |
	| 5.5759 | 12.5 | 24000 | 5.7555 |
	| 5.5682 | 13.02 | 25000 | 5.7522 |
	| 5.5461 | 13.54 | 26000 | 5.7397 |
	| 5.5414 | 14.06 | 27000 | 5.7349 |
	| 5.5195 | 14.58 | 28000 | 5.7310 |
	| 5.5081 | 15.1 | 29000 | 5.7214 |
	| 5.4922 | 15.62 | 30000 | 5.7188 |
	| 5.4858 | 16.15 | 31000 | 5.7127 |
	| 5.4786 | 16.67 | 32000 | 5.7092 |
	| 5.4685 | 17.19 | 33000 | 5.7075 |
	| 5.4571 | 17.71 | 34000 | 5.7060 |
	| 5.4592 | 18.23 | 35000 | 5.7018 |
	| 5.4555 | 18.75 | 36000 | 5.7043 |
	| 5.4512 | 19.27 | 37000 | 5.7028 |
	| 5.4522 | 19.79 | 38000 | 5.7060 |
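The validation loss flattens over the last few thousand steps, and the lowest logged value is not the final one. The sketch below reads the last six rows of the table above to locate the lowest validation loss and to estimate the number of optimizer steps per epoch. The helper names are hypothetical, and the dataset-size estimate assumes a single device with no gradient accumulation, neither of which the card confirms.

```python
# (step, epoch, validation_loss) copied from the last six rows of the results table.
rows = [
    (33000, 17.19, 5.7075),
    (34000, 17.71, 5.7060),
    (35000, 18.23, 5.7018),
    (36000, 18.75, 5.7043),
    (37000, 19.27, 5.7028),
    (38000, 19.79, 5.7060),
]


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[2])


def steps_per_epoch(rows):
    """Estimate steps per epoch from the last logged (step, epoch) pair."""
    step, epoch, _ = rows[-1]
    return step / epoch


best_step, best_epoch, best_loss = best_checkpoint(rows)
spe = steps_per_epoch(rows)
print(f"lowest validation loss {best_loss} at step {best_step}")
print(f"about {spe:.0f} steps per epoch")
# Assumption: one device, no gradient accumulation, train_batch_size = 64.
print(f"roughly {spe * 64:,.0f} training examples per epoch")
```

The rounded epoch values in the table limit the precision of the steps-per-epoch figure, so treat the result as an order-of-magnitude check.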


	### Framework versions

	- Transformers 4.26.1
	- PyTorch 1.11.0+cu113
	- Datasets 2.13.0
	- Tokenizers 0.13.3
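To check whether a local environment matches the versions listed above, something like the following could be used. It is a sketch that relies only on the standard library; the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are assumed to be the usual PyPI names, and `compare_versions` is an illustrative helper, not part of any library.

```python
from importlib import metadata

# Versions listed in this card; the distribution names are an assumption.
CARD_VERSIONS = {
    "transformers": "4.26.1",
    "torch": "1.11.0+cu113",
    "datasets": "2.13.0",
    "tokenizers": "0.13.3",
}


def compare_versions(expected):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = metadata.version(name)
        except metadata.PackageNotFoundError:
            have = None  # package is not installed in this environment
        if have != want:
            mismatches[name] = (want, have)
    return mismatches


for name, (want, have) in compare_versions(CARD_VERSIONS).items():
    print(f"{name}: card lists {want}, found {have or 'not installed'}")
```

A mismatch does not necessarily prevent the model from loading; it only flags where the environment differs from the one recorded in the card.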
