	---
	license: cc-by-4.0
	base_model: deepset/bert-base-cased-squad2
	tags:
	- generated_from_trainer
	model-index:
	- name: bert-10
	  results: []
	---


	# bert-10

	This model is a fine-tuned version of [deepset/bert-base-cased-squad2](https://huggingface.co/deepset/bert-base-cased-squad2) on an unknown dataset.
	At the last logged evaluation (step 325, epoch 2.98), it achieves the following result on the evaluation set:
	- Loss: 9.5797

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-07
	- train_batch_size: 8
	- eval_batch_size: 8
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3
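
To make the `linear` scheduler setting concrete, here is a minimal sketch of how the learning rate would evolve under the hyperparameters listed above. It rests on two assumptions that the card does not state: a total of about 327 optimizer steps (roughly 109 per epoch, estimated from the step and epoch columns of the training results) and no warmup. The names `HPARAMS`, `TOTAL_STEPS`, `WARMUP_STEPS`, and `linear_lr` are illustrative and not part of any library.

```python
# Hyperparameters copied from the list above.
HPARAMS = {
    "learning_rate": 2e-07,
    "train_batch_size": 8,
    "eval_batch_size": 8,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 3,
}

# Assumption: about 327 optimizer steps in total (109 per epoch x 3 epochs),
# estimated from the logged steps and epochs; the card does not state it.
TOTAL_STEPS = 327
# Assumption: no warmup; the card lists no warmup setting.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=HPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        scale = step / max(1, warmup_steps)
    else:
        scale = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * scale


if __name__ == "__main__":
    # Print the schedule at the start and near each epoch boundary.
    for step in (0, 109, 218, 325):
        print(f"step {step:3d}: lr = {linear_lr(step):.3e}")
```

Under these assumptions the rate starts at 2e-07 and shrinks toward zero by the end of the third epoch, so the updates late in training are very small.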

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 10.8556 | 0.05 | 5 | 12.3235 |
	| 10.8413 | 0.09 | 10 | 12.2591 |
	| 11.0649 | 0.14 | 15 | 12.1778 |
	| 11.6408 | 0.18 | 20 | 12.0989 |
	| 11.3732 | 0.23 | 25 | 12.0213 |
	| 10.5122 | 0.28 | 30 | 11.9458 |
	| 10.6594 | 0.32 | 35 | 11.8691 |
	| 10.745 | 0.37 | 40 | 11.7928 |
	| 10.8256 | 0.41 | 45 | 11.7163 |
	| 10.1627 | 0.46 | 50 | 11.6430 |
	| 10.9907 | 0.5 | 55 | 11.5703 |
	| 10.1394 | 0.55 | 60 | 11.4997 |
	| 9.6059 | 0.6 | 65 | 11.4287 |
	| 9.4972 | 0.64 | 70 | 11.3621 |
	| 10.2252 | 0.69 | 75 | 11.2949 |
	| 10.4887 | 0.73 | 80 | 11.2288 |
	| 9.9616 | 0.78 | 85 | 11.1638 |
	| 9.5775 | 0.83 | 90 | 11.1003 |
	| 9.5971 | 0.87 | 95 | 11.0381 |
	| 9.5745 | 0.92 | 100 | 10.9773 |
	| 9.3218 | 0.96 | 105 | 10.9178 |
	| 9.4906 | 1.01 | 110 | 10.8597 |
	| 9.1168 | 1.06 | 115 | 10.8030 |
	| 9.8009 | 1.1 | 120 | 10.7465 |
	| 9.3632 | 1.15 | 125 | 10.6915 |
	| 8.9858 | 1.19 | 130 | 10.6399 |
	| 9.2904 | 1.24 | 135 | 10.5874 |
	| 9.5344 | 1.28 | 140 | 10.5370 |
	| 9.0034 | 1.33 | 145 | 10.4871 |
	| 9.3024 | 1.38 | 150 | 10.4384 |
	| 8.7905 | 1.42 | 155 | 10.3920 |
	| 8.9329 | 1.47 | 160 | 10.3465 |
	| 8.9834 | 1.51 | 165 | 10.3027 |
	| 8.7307 | 1.56 | 170 | 10.2607 |
	| 8.6729 | 1.61 | 175 | 10.2200 |
	| 9.1849 | 1.65 | 180 | 10.1794 |
	| 9.1618 | 1.7 | 185 | 10.1400 |
	| 8.9048 | 1.74 | 190 | 10.1023 |
	| 8.9427 | 1.79 | 195 | 10.0655 |
	| 9.1052 | 1.83 | 200 | 10.0294 |
	| 9.1123 | 1.88 | 205 | 9.9938 |
	| 9.0476 | 1.93 | 210 | 9.9604 |
	| 8.5532 | 1.97 | 215 | 9.9285 |
	| 8.7871 | 2.02 | 220 | 9.8977 |
	| 8.5984 | 2.06 | 225 | 9.8690 |
	| 8.7009 | 2.11 | 230 | 9.8414 |
	| 8.9376 | 2.16 | 235 | 9.8146 |
	| 8.3535 | 2.2 | 240 | 9.7906 |
	| 8.5805 | 2.25 | 245 | 9.7675 |
	| 8.4641 | 2.29 | 250 | 9.7463 |
	| 8.3975 | 2.34 | 255 | 9.7263 |
	| 8.7698 | 2.39 | 260 | 9.7070 |
	| 8.3541 | 2.43 | 265 | 9.6901 |
	| 8.5443 | 2.48 | 270 | 9.6743 |
	| 8.1539 | 2.52 | 275 | 9.6595 |
	| 7.9856 | 2.57 | 280 | 9.6459 |
	| 8.2532 | 2.61 | 285 | 9.6333 |
	| 8.2116 | 2.66 | 290 | 9.6221 |
	| 8.9557 | 2.71 | 295 | 9.6119 |
	| 8.0754 | 2.75 | 300 | 9.6032 |
	| 7.9534 | 2.8 | 305 | 9.5956 |
	| 8.5578 | 2.84 | 310 | 9.5899 |
	| 8.6403 | 2.89 | 315 | 9.5848 |
	| 8.1103 | 2.94 | 320 | 9.5817 |
	| 8.3785 | 2.98 | 325 | 9.5797 |
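
As a quick way to read the table, the sketch below computes two summary figures from its first and last rows: the relative drop in validation loss over training, and an estimate of the number of steps per epoch. The helper names `relative_drop` and `steps_per_epoch` are illustrative. The epoch column is rounded to two decimals, so the steps-per-epoch figure is only approximate.

```python
# (step, epoch, validation_loss) copied from the first and last table rows.
FIRST_ROW = (5, 0.05, 12.3235)
LAST_ROW = (325, 2.98, 9.5797)


def relative_drop(start, end):
    """Fractional decrease from `start` to `end`."""
    return (start - end) / start


def steps_per_epoch(step, epoch):
    """Estimate steps per epoch from one logged (step, epoch) pair."""
    return step / epoch


val_drop = relative_drop(FIRST_ROW[2], LAST_ROW[2])
est_steps = steps_per_epoch(LAST_ROW[0], LAST_ROW[1])

if __name__ == "__main__":
    print(f"validation loss drop: {val_drop:.1%}")
    print(f"estimated steps per epoch: {est_steps:.1f}")
```

The validation loss falls by roughly 22% between step 5 and step 325, and the last row implies about 109 steps per epoch.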


	### Framework versions

	- Transformers 4.34.0
	- PyTorch 2.0.1+cu118
	- Datasets 2.14.5
	- Tokenizers 0.14.1
