---
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: model_v1_complete_training_wt_init_48_tiny
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# model_v1_complete_training_wt_init_48_tiny

This model is a fine-tuned version of an unspecified base model on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 3.6497
- Accuracy: 0.3896

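Assuming the reported loss is a mean cross-entropy in nats (the usual convention for causal language-model training with the HF Trainer), the evaluation loss corresponds to a perplexity of roughly 38.5. A minimal sketch of that conversion:

```python
import math

eval_loss = 3.6497  # evaluation loss reported above
# Perplexity = exp(cross-entropy); valid only if the loss is mean
# token-level cross-entropy in nats, which this card does not state explicitly.
perplexity = math.exp(eval_loss)  # ≈ 38.5
```
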
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 10
- distributed_type: multi-GPU
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10000
- num_epochs: 50

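The schedule above (linear, 10,000 warmup steps) can be sketched as a plain function. The `total_steps` default below is an assumption for illustration only — the card does not state the planned step count:

```python
def linear_lr(step: int,
              base_lr: float = 1e-5,
              warmup_steps: int = 10_000,
              total_steps: int = 4_600_000) -> float:
    """Linear warmup to base_lr, then linear decay to zero.

    total_steps here is hypothetical; the card does not report it.
    """
    if step < warmup_steps:
        # Ramp from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Decay from base_lr down to 0 between warmup_steps and total_steps.
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))
```

This mirrors what `transformers` does internally for `lr_scheduler_type: linear` with warmup, but as a standalone sketch rather than the library's own implementation.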
### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:-------:|:---------------:|:--------:|
| 6.0224 | 0.33 | 30000 | 5.9447 | 0.1517 |
| 5.1853 | 0.66 | 60000 | 4.9635 | 0.2615 |
| 4.9483 | 0.98 | 90000 | 4.7016 | 0.2830 |
| 4.7679 | 1.31 | 120000 | 4.5154 | 0.2992 |
| 4.6448 | 1.64 | 150000 | 4.3884 | 0.3100 |
| 4.5688 | 1.97 | 180000 | 4.3095 | 0.3175 |
| 4.5102 | 2.29 | 210000 | 4.2511 | 0.3236 |
| 4.4662 | 2.62 | 240000 | 4.2038 | 0.3294 |
| 4.4269 | 2.95 | 270000 | 4.1677 | 0.3336 |
| 4.3982 | 3.28 | 300000 | 4.1367 | 0.3370 |
| 4.3714 | 3.6 | 330000 | 4.1103 | 0.3399 |
| 4.3493 | 3.93 | 360000 | 4.0869 | 0.3423 |
| 4.3303 | 4.26 | 390000 | 4.0680 | 0.3439 |
| 4.3131 | 4.59 | 420000 | 4.0467 | 0.3461 |
| 4.2875 | 4.92 | 450000 | 4.0292 | 0.3477 |
| 4.2629 | 5.24 | 480000 | 4.0109 | 0.3497 |
| 4.2413 | 5.57 | 510000 | 3.9931 | 0.3515 |
| 4.2282 | 5.9 | 540000 | 3.9759 | 0.3536 |
| 4.2003 | 6.23 | 570000 | 3.9608 | 0.3551 |
| 4.1867 | 6.55 | 600000 | 3.9445 | 0.3571 |
| 4.1607 | 6.88 | 630000 | 3.9273 | 0.3590 |
| 4.1511 | 7.21 | 660000 | 3.9130 | 0.3606 |
| 4.1335 | 7.54 | 690000 | 3.8971 | 0.3622 |
| 4.1158 | 7.87 | 720000 | 3.8798 | 0.3642 |
| 4.097 | 8.19 | 750000 | 3.8635 | 0.3663 |
| 4.0831 | 8.52 | 780000 | 3.8494 | 0.3679 |
| 4.0756 | 8.85 | 810000 | 3.8334 | 0.3696 |
| 4.0533 | 9.18 | 840000 | 3.8201 | 0.3712 |
| 4.0517 | 9.5 | 870000 | 3.8080 | 0.3724 |
| 4.0325 | 9.83 | 900000 | 3.7975 | 0.3734 |
| 4.0142 | 10.16 | 930000 | 3.7872 | 0.3748 |
| 4.0124 | 10.49 | 960000 | 3.7788 | 0.3759 |
| 4.0076 | 10.81 | 990000 | 3.7679 | 0.3767 |
| 3.9919 | 11.14 | 1020000 | 3.7609 | 0.3775 |
| 3.9888 | 11.47 | 1050000 | 3.7550 | 0.3783 |
| 3.9796 | 11.8 | 1080000 | 3.7481 | 0.3789 |
| 3.9742 | 12.13 | 1110000 | 3.7414 | 0.3796 |
| 3.9667 | 12.45 | 1140000 | 3.7370 | 0.3802 |
| 3.9652 | 12.78 | 1170000 | 3.7289 | 0.3810 |
| 3.9548 | 13.11 | 1200000 | 3.7278 | 0.3812 |
| 3.9556 | 13.44 | 1230000 | 3.7213 | 0.3817 |
| 3.9444 | 13.76 | 1260000 | 3.7152 | 0.3825 |
| 3.9428 | 14.09 | 1290000 | 3.7120 | 0.3827 |
| 3.9424 | 14.42 | 1320000 | 3.7072 | 0.3834 |
| 3.9389 | 14.75 | 1350000 | 3.7047 | 0.3836 |
| 3.936 | 15.07 | 1380000 | 3.6998 | 0.3844 |
| 3.9246 | 15.4 | 1410000 | 3.6968 | 0.3847 |
| 3.9281 | 15.73 | 1440000 | 3.6925 | 0.3851 |
| 3.9177 | 16.06 | 1470000 | 3.6916 | 0.3849 |
| 3.9216 | 16.39 | 1500000 | 3.6870 | 0.3855 |
| 3.9141 | 16.71 | 1530000 | 3.6822 | 0.3863 |
| 3.9154 | 17.04 | 1560000 | 3.6804 | 0.3864 |
| 3.9145 | 17.37 | 1590000 | 3.6795 | 0.3863 |
| 3.9103 | 17.7 | 1620000 | 3.6734 | 0.3869 |
| 3.9079 | 18.02 | 1650000 | 3.6724 | 0.3873 |
| 3.901 | 18.35 | 1680000 | 3.6707 | 0.3872 |
| 3.9015 | 18.68 | 1710000 | 3.6695 | 0.3873 |
| 3.8987 | 19.01 | 1740000 | 3.6672 | 0.3877 |
| 3.8929 | 19.33 | 1770000 | 3.6647 | 0.3878 |
| 3.892 | 19.66 | 1800000 | 3.6609 | 0.3884 |
| 3.8906 | 19.99 | 1830000 | 3.6595 | 0.3886 |
| 3.8923 | 20.32 | 1860000 | 3.6594 | 0.3885 |
| 3.8901 | 20.65 | 1890000 | 3.6541 | 0.3893 |
| 3.8853 | 20.97 | 1920000 | 3.6539 | 0.3891 |
| 3.8808 | 21.3 | 1950000 | 3.6527 | 0.3894 |
| 3.8835 | 21.63 | 1980000 | 3.6497 | 0.3896 |
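A rough consistency check on the table: the final logged row (step 1,980,000 at epoch 21.63) implies about 91,500 optimizer steps per epoch, so the configured 50 epochs would correspond to roughly 4.6M steps — the log above stops well short of that. Epoch values in the table are rounded, so this arithmetic is approximate:

```python
# Final row of the training-results table above.
last_step, last_epoch = 1_980_000, 21.63

steps_per_epoch = last_step / last_epoch   # ≈ 91,500 (epoch value is rounded)
projected_total = 50 * steps_per_epoch     # ≈ 4.6M steps if num_epochs=50 ran to completion
```
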
### Framework versions

- Transformers 4.30.2
- PyTorch 1.14.0a0+410ce96
- Datasets 2.13.0
- Tokenizers 0.13.3