---
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: model_v1_complete_training_wt_init_48_tiny
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# model_v1_complete_training_wt_init_48_tiny

This model is a fine-tuned version of an unspecified base model on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 3.6497
- Accuracy: 0.3896

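Assuming the reported loss is a mean cross-entropy in nats (the usual convention for causal language-model training with the HF Trainer), the evaluation loss corresponds to a perplexity of roughly 38.5. A minimal sketch of that conversion:

```python
import math

eval_loss = 3.6497  # evaluation loss reported above
# Perplexity = exp(cross-entropy); valid only if the loss is mean
# token-level cross-entropy in nats, which this card does not state explicitly.
perplexity = math.exp(eval_loss)  # ≈ 38.5
```
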
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 10
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10000
- num_epochs: 50

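The schedule above (linear, 10,000 warmup steps) can be sketched as a plain function. The `total_steps` default below is an assumption for illustration only — the card does not state the planned step count:

```python
def linear_lr(step: int,
              base_lr: float = 1e-5,
              warmup_steps: int = 10_000,
              total_steps: int = 4_600_000) -> float:
    """Linear warmup to base_lr, then linear decay to zero.

    total_steps here is hypothetical; the card does not report it.
    """
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay from base_lr down to 0 between warmup_steps and total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

This mirrors what `transformers` does internally for `lr_scheduler_type: linear` with warmup, but as a standalone sketch rather than the library's own implementation.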
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:-------:|:---------------:|:--------:|
| 6.0224 | 0.33 | 30000 | 5.9447 | 0.1517 |
| 5.1853 | 0.66 | 60000 | 4.9635 | 0.2615 |
| 4.9483 | 0.98 | 90000 | 4.7016 | 0.2830 |
| 4.7679 | 1.31 | 120000 | 4.5154 | 0.2992 |
| 4.6448 | 1.64 | 150000 | 4.3884 | 0.3100 |
| 4.5688 | 1.97 | 180000 | 4.3095 | 0.3175 |
| 4.5102 | 2.29 | 210000 | 4.2511 | 0.3236 |
| 4.4662 | 2.62 | 240000 | 4.2038 | 0.3294 |
| 4.4269 | 2.95 | 270000 | 4.1677 | 0.3336 |
| 4.3982 | 3.28 | 300000 | 4.1367 | 0.3370 |
| 4.3714 | 3.6 | 330000 | 4.1103 | 0.3399 |
| 4.3493 | 3.93 | 360000 | 4.0869 | 0.3423 |
| 4.3303 | 4.26 | 390000 | 4.0680 | 0.3439 |
| 4.3131 | 4.59 | 420000 | 4.0467 | 0.3461 |
| 4.2875 | 4.92 | 450000 | 4.0292 | 0.3477 |
| 4.2629 | 5.24 | 480000 | 4.0109 | 0.3497 |
| 4.2413 | 5.57 | 510000 | 3.9931 | 0.3515 |
| 4.2282 | 5.9 | 540000 | 3.9759 | 0.3536 |
| 4.2003 | 6.23 | 570000 | 3.9608 | 0.3551 |
| 4.1867 | 6.55 | 600000 | 3.9445 | 0.3571 |
| 4.1607 | 6.88 | 630000 | 3.9273 | 0.3590 |
| 4.1511 | 7.21 | 660000 | 3.9130 | 0.3606 |
| 4.1335 | 7.54 | 690000 | 3.8971 | 0.3622 |
| 4.1158 | 7.87 | 720000 | 3.8798 | 0.3642 |
| 4.097 | 8.19 | 750000 | 3.8635 | 0.3663 |
| 4.0831 | 8.52 | 780000 | 3.8494 | 0.3679 |
| 4.0756 | 8.85 | 810000 | 3.8334 | 0.3696 |
| 4.0533 | 9.18 | 840000 | 3.8201 | 0.3712 |
| 4.0517 | 9.5 | 870000 | 3.8080 | 0.3724 |
| 4.0325 | 9.83 | 900000 | 3.7975 | 0.3734 |
| 4.0142 | 10.16 | 930000 | 3.7872 | 0.3748 |
| 4.0124 | 10.49 | 960000 | 3.7788 | 0.3759 |
| 4.0076 | 10.81 | 990000 | 3.7679 | 0.3767 |
| 3.9919 | 11.14 | 1020000 | 3.7609 | 0.3775 |
| 3.9888 | 11.47 | 1050000 | 3.7550 | 0.3783 |
| 3.9796 | 11.8 | 1080000 | 3.7481 | 0.3789 |
| 3.9742 | 12.13 | 1110000 | 3.7414 | 0.3796 |
| 3.9667 | 12.45 | 1140000 | 3.7370 | 0.3802 |
| 3.9652 | 12.78 | 1170000 | 3.7289 | 0.3810 |
| 3.9548 | 13.11 | 1200000 | 3.7278 | 0.3812 |
| 3.9556 | 13.44 | 1230000 | 3.7213 | 0.3817 |
| 3.9444 | 13.76 | 1260000 | 3.7152 | 0.3825 |
| 3.9428 | 14.09 | 1290000 | 3.7120 | 0.3827 |
| 3.9424 | 14.42 | 1320000 | 3.7072 | 0.3834 |
| 3.9389 | 14.75 | 1350000 | 3.7047 | 0.3836 |
| 3.936 | 15.07 | 1380000 | 3.6998 | 0.3844 |
| 3.9246 | 15.4 | 1410000 | 3.6968 | 0.3847 |
| 3.9281 | 15.73 | 1440000 | 3.6925 | 0.3851 |
| 3.9177 | 16.06 | 1470000 | 3.6916 | 0.3849 |
| 3.9216 | 16.39 | 1500000 | 3.6870 | 0.3855 |
| 3.9141 | 16.71 | 1530000 | 3.6822 | 0.3863 |
| 3.9154 | 17.04 | 1560000 | 3.6804 | 0.3864 |
| 3.9145 | 17.37 | 1590000 | 3.6795 | 0.3863 |
| 3.9103 | 17.7 | 1620000 | 3.6734 | 0.3869 |
| 3.9079 | 18.02 | 1650000 | 3.6724 | 0.3873 |
| 3.901 | 18.35 | 1680000 | 3.6707 | 0.3872 |
| 3.9015 | 18.68 | 1710000 | 3.6695 | 0.3873 |
| 3.8987 | 19.01 | 1740000 | 3.6672 | 0.3877 |
| 3.8929 | 19.33 | 1770000 | 3.6647 | 0.3878 |
| 3.892 | 19.66 | 1800000 | 3.6609 | 0.3884 |
| 3.8906 | 19.99 | 1830000 | 3.6595 | 0.3886 |
| 3.8923 | 20.32 | 1860000 | 3.6594 | 0.3885 |
| 3.8901 | 20.65 | 1890000 | 3.6541 | 0.3893 |
| 3.8853 | 20.97 | 1920000 | 3.6539 | 0.3891 |
| 3.8808 | 21.3 | 1950000 | 3.6527 | 0.3894 |
| 3.8835 | 21.63 | 1980000 | 3.6497 | 0.3896 |
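A rough consistency check on the table: the final logged row (step 1,980,000 at epoch 21.63) implies about 91,500 optimizer steps per epoch, so the configured 50 epochs would correspond to roughly 4.6M steps — the log above stops well short of that. Epoch values in the table are rounded, so this arithmetic is approximate:

```python
# Final row of the training-results table above.
last_step, last_epoch = 1_980_000, 21.63

steps_per_epoch = last_step / last_epoch   # ≈ 91,500 (epoch value is rounded)
projected_total = 50 * steps_per_epoch     # ≈ 4.6M steps if num_epochs=50 ran to completion
```
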
### Framework versions

- Transformers 4.30.2
- PyTorch 1.14.0a0+410ce96
- Datasets 2.13.0
- Tokenizers 0.13.3