hung200504

bert-squad

	---
	license: mit
	base_model: microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext
	tags:
	- generated_from_trainer
	datasets:
	- covid_qa_deepset
	model-index:
	- name: bert-covid
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# bert-covid

	This model is a fine-tuned version of [microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext](https://huggingface.co/microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext) on the covid_qa_deepset dataset.
	It achieves the following results on the evaluation set:
	- Loss: 0.6900
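
A minimal usage sketch follows. It rests on assumptions the card does not confirm: that the checkpoint carries an extractive question-answering head (suggested by the covid_qa_deepset dataset), and that it is published under the repo id `hung200504/bert-squad` shown in the page header. The helper name `answer_question` is hypothetical.

```python
# Assumption: repo id taken from the page header, not stated in the card body.
MODEL_ID = "hung200504/bert-squad"


def answer_question(question, context, model_id=MODEL_ID):
    """Run extractive QA and return the pipeline's result dict.

    Assumes the checkpoint has a question-answering head. The import is
    local so this module loads even where transformers is not installed.
    """
    from transformers import pipeline

    qa = pipeline("question-answering", model=model_id, tokenizer=model_id)
    # The result dict has the keys "answer", "score", "start" and "end".
    return qa(question=question, context=context)
```

Calling `answer_question("What causes COVID-19?", "COVID-19 is caused by the SARS-CoV-2 virus.")` downloads the weights on first use and returns the extracted span with its score.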

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 3e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 3
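
To make the `linear` scheduler concrete, here is a small sketch of the learning rate it would produce at a given step. Two values are assumptions rather than facts from the card: `total_steps=342` is inferred from the results table (step 340 falls at epoch 2.98, which implies about 114 steps per epoch), and `warmup_steps=0` is assumed because no warmup is listed. The function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=3e-5, total_steps=342, warmup_steps=0):
    """Learning rate after `step` optimizer updates under linear decay.

    total_steps and warmup_steps are assumptions (see the text above);
    base_lr is the learning_rate listed in the card.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 171, 342):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at 3e-05, is halved by the midpoint of training, and reaches zero at the last step.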

	### Training results

	| Training Loss | Epoch | Step | Validation Loss |
	|:-------------:|:-----:|:----:|:---------------:|
	| 5.474 | 0.04 | 5 | 4.3730 |
	| 3.9933 | 0.09 | 10 | 3.2783 |
	| 3.0206 | 0.13 | 15 | 2.0289 |
	| 1.9741 | 0.18 | 20 | 1.3879 |
	| 1.4351 | 0.22 | 25 | 1.1733 |
	| 1.5916 | 0.26 | 30 | 1.1623 |
	| 0.5383 | 0.31 | 35 | 1.1952 |
	| 0.7776 | 0.35 | 40 | 1.1920 |
	| 1.1785 | 0.39 | 45 | 1.1216 |
	| 1.1334 | 0.44 | 50 | 1.0412 |
	| 0.7445 | 0.48 | 55 | 1.0829 |
	| 0.6512 | 0.53 | 60 | 1.0443 |
	| 0.7516 | 0.57 | 65 | 1.0089 |
	| 0.5953 | 0.61 | 70 | 0.9273 |
	| 0.8589 | 0.66 | 75 | 0.8947 |
	| 0.7561 | 0.7 | 80 | 0.9009 |
	| 0.9561 | 0.75 | 85 | 0.9006 |
	| 0.7731 | 0.79 | 90 | 0.8482 |
	| 0.8269 | 0.83 | 95 | 0.8380 |
	| 0.9884 | 0.88 | 100 | 0.8200 |
	| 0.9187 | 0.92 | 105 | 0.8775 |
	| 0.585 | 0.96 | 110 | 0.8499 |
	| 0.6835 | 1.01 | 115 | 0.8314 |
	| 0.6668 | 1.05 | 120 | 0.7491 |
	| 0.5558 | 1.1 | 125 | 0.7154 |
	| 0.4491 | 1.14 | 130 | 0.8212 |
	| 1.0667 | 1.18 | 135 | 0.8477 |
	| 0.4472 | 1.23 | 140 | 0.7636 |
	| 0.6892 | 1.27 | 145 | 0.7493 |
	| 0.66 | 1.32 | 150 | 0.6932 |
	| 0.5044 | 1.36 | 155 | 0.7675 |
	| 0.5329 | 1.4 | 160 | 0.7406 |
	| 0.2223 | 1.45 | 165 | 0.8099 |
	| 0.5495 | 1.49 | 170 | 0.8758 |
	| 0.5534 | 1.54 | 175 | 0.8476 |
	| 0.4962 | 1.58 | 180 | 0.7953 |
	| 0.7477 | 1.62 | 185 | 0.7610 |
	| 0.7293 | 1.67 | 190 | 0.8357 |
	| 0.6205 | 1.71 | 195 | 0.7339 |
	| 0.5687 | 1.75 | 200 | 0.6908 |
	| 0.884 | 1.8 | 205 | 0.6706 |
	| 0.5928 | 1.84 | 210 | 0.6546 |
	| 0.3209 | 1.89 | 215 | 0.6505 |
	| 0.7585 | 1.93 | 220 | 0.6486 |
	| 0.8501 | 1.97 | 225 | 0.6272 |
	| 0.1664 | 2.02 | 230 | 0.6211 |
	| 0.4483 | 2.06 | 235 | 0.6550 |
	| 0.3361 | 2.11 | 240 | 0.6604 |
	| 0.3085 | 2.15 | 245 | 0.6520 |
	| 0.2407 | 2.19 | 250 | 0.6695 |
	| 0.3418 | 2.24 | 255 | 0.6687 |
	| 0.3165 | 2.28 | 260 | 0.6730 |
	| 0.5811 | 2.32 | 265 | 0.6546 |
	| 0.3516 | 2.37 | 270 | 0.6579 |
	| 0.3136 | 2.41 | 275 | 0.6688 |
	| 0.2508 | 2.46 | 280 | 0.6921 |
	| 0.3463 | 2.5 | 285 | 0.7124 |
	| 0.3603 | 2.54 | 290 | 0.7160 |
	| 0.4455 | 2.59 | 295 | 0.6995 |
	| 0.5433 | 2.63 | 300 | 0.6919 |
	| 0.3411 | 2.68 | 305 | 0.6898 |
	| 0.6065 | 2.72 | 310 | 0.6922 |
	| 0.6258 | 2.76 | 315 | 0.6955 |
	| 0.283 | 2.81 | 320 | 0.7008 |
	| 0.6233 | 2.85 | 325 | 0.6988 |
	| 0.3899 | 2.89 | 330 | 0.6949 |
	| 0.238 | 2.94 | 335 | 0.6916 |
	| 0.3166 | 2.98 | 340 | 0.6900 |
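
The lowest validation loss in the table is 0.6211 at step 230, while the reported final value is 0.6900 at step 340. The sketch below shows one way to pick the best evaluation step from such a log. The `eval_log` list holds a handful of rows copied from the table, not the full run, and the variable names are hypothetical.

```python
# (step, validation_loss) pairs copied from selected rows of the table above.
eval_log = [
    (100, 0.8200),
    (150, 0.6932),
    (200, 0.6908),
    (225, 0.6272),
    (230, 0.6211),
    (235, 0.6550),
    (300, 0.6919),
    (340, 0.6900),
]

# Best evaluation step: the row with the smallest validation loss.
best_step, best_loss = min(eval_log, key=lambda row: row[1])

# Final evaluation step: the last row logged.
final_step, final_loss = eval_log[-1]

# How much validation loss the final checkpoint gives up versus the best one.
gap = final_loss - best_loss

print(f"best: step {best_step}, loss {best_loss:.4f}")
print(f"final: step {final_step}, loss {final_loss:.4f}, gap {gap:.4f}")
```

If intermediate checkpoints were saved, the one nearest step 230 may be worth evaluating alongside the final one. The card does not say which checkpoint was published.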


	### Framework versions

	- Transformers 4.34.1
	- PyTorch 2.1.0+cu118
	- Datasets 2.14.6
	- Tokenizers 0.14.1
