---
tags:
- generated_from_trainer
datasets:
- generator
model-index:
- name: bert-dp-second
  results: []
---

# bert-dp-second

This model is a fine-tuned version of [](https://huggingface.co/) on the generator dataset.
It achieves the following results on the evaluation set:
- Loss: 3.2321

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 64
- eval_batch_size: 64
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 1000
- num_epochs: 19
- mixed_precision_training: Native AMP

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 7.3416        | 0.23  | 500   | 6.6532          |
| 6.5752        | 0.47  | 1000  | 6.5275          |
| 6.4866        | 0.7   | 1500  | 6.4720          |
| 6.4273        | 0.93  | 2000  | 6.4540          |
| 6.4036        | 1.17  | 2500  | 6.4236          |
| 6.3779        | 1.4   | 3000  | 6.4018          |
| 6.3528        | 1.63  | 3500  | 6.3768          |
| 6.3258        | 1.87  | 4000  | 6.3679          |
| 6.3009        | 2.1   | 4500  | 6.3305          |
| 6.2646        | 2.33  | 5000  | 6.3142          |
| 6.2583        | 2.57  | 5500  | 6.3004          |
| 6.2223        | 2.8   | 6000  | 6.2605          |
| 6.1941        | 3.03  | 6500  | 6.2353          |
| 6.1382        | 3.27  | 7000  | 6.2095          |
| 6.1301        | 3.5   | 7500  | 6.1774          |
| 6.09          | 3.73  | 8000  | 6.1480          |
| 6.0624        | 3.97  | 8500  | 6.1061          |
| 6.0056        | 4.2   | 9000  | 6.0655          |
| 5.9444        | 4.43  | 9500  | 5.9461          |
| 5.7101        | 4.67  | 10000 | 5.2594          |
| 5.005         | 4.9   | 10500 | 4.7348          |
| 4.6127        | 5.13  | 11000 | 4.4626          |
| 4.3907        | 5.37  | 11500 | 4.2862          |
| 4.241         | 5.6   | 12000 | 4.1701          |
| 4.1286        | 5.83  | 12500 | 4.0673          |
| 4.0151        | 6.07  | 13000 | 3.9967          |
| 3.934         | 6.3   | 13500 | 3.9292          |
| 3.8789        | 6.53  | 14000 | 3.8707          |
| 3.8231        | 6.77  | 14500 | 3.8222          |
| 3.7696        | 7.0   | 15000 | 3.7800          |
| 3.7078        | 7.23  | 15500 | 3.7424          |
| 3.6671        | 7.47  | 16000 | 3.7093          |
| 3.6446        | 7.7   | 16500 | 3.6780          |
| 3.6069        | 7.93  | 17000 | 3.6476          |
| 3.5782        | 8.17  | 17500 | 3.6283          |
| 3.5384        | 8.4   | 18000 | 3.6098          |
| 3.5245        | 8.63  | 18500 | 3.5942          |
| 3.5209        | 8.87  | 19000 | 3.5841          |
| 3.4948        | 9.1   | 19500 | 3.5728          |
| 3.4877        | 9.33  | 20000 | 3.5692          |
| 3.4818        | 9.57  | 20500 | 3.5641          |
| 3.4844        | 9.8   | 21000 | 3.5640          |
| 3.5323        | 10.03 | 21500 | 3.6026          |
| 3.5123        | 10.27 | 22000 | 3.5877          |
| 3.5046        | 10.5  | 22500 | 3.5595          |
| 3.4787        | 10.73 | 23000 | 3.5403          |
| 3.4568        | 10.97 | 23500 | 3.5125          |
| 3.4154        | 11.2  | 24000 | 3.4916          |
| 3.3998        | 11.43 | 24500 | 3.4749          |
| 3.3986        | 11.67 | 25000 | 3.4578          |
| 3.372         | 11.9  | 25500 | 3.4405          |
| 3.3402        | 12.13 | 26000 | 3.4317          |
| 3.3281        | 12.37 | 26500 | 3.4215          |
| 3.322         | 12.6  | 27000 | 3.4093          |
| 3.3198        | 12.83 | 27500 | 3.4026          |
| 3.3039        | 13.07 | 28000 | 3.3971          |
| 3.296         | 13.3  | 28500 | 3.3954          |
| 3.3015        | 13.53 | 29000 | 3.3954          |
| 3.2939        | 13.77 | 29500 | 3.3927          |
| 3.3013        | 14.0  | 30000 | 3.3918          |
| 3.343         | 14.23 | 30500 | 3.4265          |
| 3.3438        | 14.47 | 31000 | 3.4133          |
| 3.3397        | 14.7  | 31500 | 3.3951          |
| 3.3156        | 14.93 | 32000 | 3.3681          |
| 3.2815        | 15.17 | 32500 | 3.3503          |
| 3.2654        | 15.4  | 33000 | 3.3313          |
| 3.2492        | 15.63 | 33500 | 3.3184          |
| 3.2399        | 15.87 | 34000 | 3.2995          |
| 3.2222        | 16.1  | 34500 | 3.2922          |
| 3.2026        | 16.33 | 35000 | 3.2818          |
| 3.191         | 16.57 | 35500 | 3.2723          |
| 3.1825        | 16.8  | 36000 | 3.2640          |
| 3.1691        | 17.03 | 36500 | 3.2530          |
| 3.1656        | 17.27 | 37000 | 3.2487          |
| 3.1487        | 17.5  | 37500 | 3.2419          |
| 3.1635        | 17.73 | 38000 | 3.2411          |
| 3.1675        | 17.97 | 38500 | 3.2330          |
| 3.1422        | 18.2  | 39000 | 3.2344          |
| 3.1443        | 18.43 | 39500 | 3.2331          |
| 3.1425        | 18.67 | 40000 | 3.2348          |
| 3.139         | 18.9  | 40500 | 3.2321          |

### Framework versions

- Transformers 4.26.1
- Pytorch 1.11.0+cu113
- Datasets 2.13.0
- Tokenizers 0.13.3
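
The learning-rate schedule above (cosine decay with 1000 linear warmup steps, peak `learning_rate` 0.0005) can be sketched as a plain function. This is an illustrative approximation of the shape produced by Transformers' `get_cosine_schedule_with_warmup`, not the exact training code; `total_steps=40500` is inferred from the last row of the results table and is an assumption.

```python
import math

def lr_at_step(step, peak_lr=5e-4, warmup_steps=1000, total_steps=40500):
    """Approximate the cosine-with-warmup schedule used during training.

    total_steps is assumed from the final logged step (40500); the true
    value for 19 full epochs may differ slightly.
    """
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return peak_lr * step / warmup_steps
    # Cosine decay from peak_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

For example, the schedule starts at 0, reaches the peak of 5e-4 at step 1000, and decays back toward 0 by the final step.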