update model card README.md
README.md
CHANGED
|
@@ -15,7 +15,7 @@ should probably proofread and complete it, then remove this comment. -->
|
|
| 15 |
|
| 16 |
This model is a fine-tuned version of [](https://huggingface.co/) on the generator dataset.
|
| 17 |
It achieves the following results on the evaluation set:
|
| 18 |
-
- Loss: 3.
|
| 19 |
|
| 20 |
## Model description
|
| 21 |
|
|
@@ -41,7 +41,7 @@ The following hyperparameters were used during training:
|
|
| 41 |
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
| 42 |
- lr_scheduler_type: cosine
|
| 43 |
- lr_scheduler_warmup_steps: 1000
|
| 44 |
-
- num_epochs:
|
| 45 |
- mixed_precision_training: Native AMP
|
| 46 |
|
| 47 |
### Training results
|
|
@@ -90,6 +90,45 @@ The following hyperparameters were used during training:
|
|
| 90 |
| 3.4877 | 9.33 | 20000 | 3.5692 |
|
| 91 |
| 3.4818 | 9.57 | 20500 | 3.5641 |
|
| 92 |
| 3.4844 | 9.8 | 21000 | 3.5640 |
|
| 93 |
|
| 94 |
|
| 95 |
### Framework versions
|
|
|
|
| 15 |
|
| 16 |
This model is a fine-tuned version of [](https://huggingface.co/) on the generator dataset.
|
| 17 |
It achieves the following results on the evaluation set:
|
| 18 |
+
- Loss: 3.2321
|
| 19 |
|
| 20 |
## Model description
|
| 21 |
|
|
|
|
| 41 |
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
|
| 42 |
- lr_scheduler_type: cosine
|
| 43 |
- lr_scheduler_warmup_steps: 1000
|
| 44 |
+
- num_epochs: 19
|
| 45 |
- mixed_precision_training: Native AMP
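The hyperparameters above name a cosine scheduler with 1000 warmup steps. As a rough illustration of what that combination usually means, here is a minimal sketch of the common "linear warmup, then cosine decay" learning-rate multiplier. The function name `cosine_warmup_multiplier` and the value of `TOTAL_STEPS` are assumptions made for this example (the total is taken from the last step logged in the results table, not from the training configuration), and the peak learning rate is left out because it does not appear in this diff.

```python
import math


def cosine_warmup_multiplier(step: int, warmup_steps: int, total_steps: int) -> float:
    """Return a learning-rate multiplier in [0, 1].

    Linear ramp from 0 to 1 over the warmup steps, then a half-cosine
    decay from 1 to 0 over the remaining steps.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


WARMUP_STEPS = 1000    # from the card: lr_scheduler_warmup_steps
TOTAL_STEPS = 40_500   # ASSUMPTION: last logged step in the results table

if __name__ == "__main__":
    # Print the multiplier at a few points along the schedule.
    for step in (0, 500, 1000, 20_750, 40_500):
        value = cosine_warmup_multiplier(step, WARMUP_STEPS, TOTAL_STEPS)
        print(f"step {step:>6}: multiplier {value:.4f}")
```

Multiplying the configured peak learning rate by this value gives the effective learning rate at each step under these assumptions; the actual run may have used a different total step count.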
|
| 46 |
|
| 47 |
### Training results
|
|
|
|
| 90 |
| 3.4877 | 9.33 | 20000 | 3.5692 |
|
| 91 |
| 3.4818 | 9.57 | 20500 | 3.5641 |
|
| 92 |
| 3.4844 | 9.8 | 21000 | 3.5640 |
|
| 93 |
+
| 3.5323 | 10.03 | 21500 | 3.6026 |
|
| 94 |
+
| 3.5123 | 10.27 | 22000 | 3.5877 |
|
| 95 |
+
| 3.5046 | 10.5 | 22500 | 3.5595 |
|
| 96 |
+
| 3.4787 | 10.73 | 23000 | 3.5403 |
|
| 97 |
+
| 3.4568 | 10.97 | 23500 | 3.5125 |
|
| 98 |
+
| 3.4154 | 11.2 | 24000 | 3.4916 |
|
| 99 |
+
| 3.3998 | 11.43 | 24500 | 3.4749 |
|
| 100 |
+
| 3.3986 | 11.67 | 25000 | 3.4578 |
|
| 101 |
+
| 3.372 | 11.9 | 25500 | 3.4405 |
|
| 102 |
+
| 3.3402 | 12.13 | 26000 | 3.4317 |
|
| 103 |
+
| 3.3281 | 12.37 | 26500 | 3.4215 |
|
| 104 |
+
| 3.322 | 12.6 | 27000 | 3.4093 |
|
| 105 |
+
| 3.3198 | 12.83 | 27500 | 3.4026 |
|
| 106 |
+
| 3.3039 | 13.07 | 28000 | 3.3971 |
|
| 107 |
+
| 3.296 | 13.3 | 28500 | 3.3954 |
|
| 108 |
+
| 3.3015 | 13.53 | 29000 | 3.3954 |
|
| 109 |
+
| 3.2939 | 13.77 | 29500 | 3.3927 |
|
| 110 |
+
| 3.3013 | 14.0 | 30000 | 3.3918 |
|
| 111 |
+
| 3.343 | 14.23 | 30500 | 3.4265 |
|
| 112 |
+
| 3.3438 | 14.47 | 31000 | 3.4133 |
|
| 113 |
+
| 3.3397 | 14.7 | 31500 | 3.3951 |
|
| 114 |
+
| 3.3156 | 14.93 | 32000 | 3.3681 |
|
| 115 |
+
| 3.2815 | 15.17 | 32500 | 3.3503 |
|
| 116 |
+
| 3.2654 | 15.4 | 33000 | 3.3313 |
|
| 117 |
+
| 3.2492 | 15.63 | 33500 | 3.3184 |
|
| 118 |
+
| 3.2399 | 15.87 | 34000 | 3.2995 |
|
| 119 |
+
| 3.2222 | 16.1 | 34500 | 3.2922 |
|
| 120 |
+
| 3.2026 | 16.33 | 35000 | 3.2818 |
|
| 121 |
+
| 3.191 | 16.57 | 35500 | 3.2723 |
|
| 122 |
+
| 3.1825 | 16.8 | 36000 | 3.2640 |
|
| 123 |
+
| 3.1691 | 17.03 | 36500 | 3.2530 |
|
| 124 |
+
| 3.1656 | 17.27 | 37000 | 3.2487 |
|
| 125 |
+
| 3.1487 | 17.5 | 37500 | 3.2419 |
|
| 126 |
+
| 3.1635 | 17.73 | 38000 | 3.2411 |
|
| 127 |
+
| 3.1675 | 17.97 | 38500 | 3.2330 |
|
| 128 |
+
| 3.1422 | 18.2 | 39000 | 3.2344 |
|
| 129 |
+
| 3.1443 | 18.43 | 39500 | 3.2331 |
|
| 130 |
+
| 3.1425 | 18.67 | 40000 | 3.2348 |
|
| 131 |
+
| 3.139 | 18.9 | 40500 | 3.2321 |
|
| 132 |
|
| 133 |
|
| 134 |
### Framework versions
|
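The updated card reports an evaluation loss of 3.2321. If that figure is a mean token-level cross-entropy in nats (an assumption; the card does not state the loss definition), it can be turned into a perplexity by exponentiating it. The sketch below does that for two validation losses copied from the results table, assuming the last column of each row is the validation loss; the helper name `perplexity` is hypothetical.

```python
import math


def perplexity(mean_loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (in nats) to perplexity."""
    return math.exp(mean_loss_nats)


# Validation losses copied from the results table in this diff.
# ASSUMPTION: the last column of each row is the validation loss.
VAL_LOSS_STEP_21000 = 3.5640   # last row before this update
VAL_LOSS_STEP_40500 = 3.2321   # final row added by this update

if __name__ == "__main__":
    for step, loss in ((21000, VAL_LOSS_STEP_21000), (40500, VAL_LOSS_STEP_40500)):
        print(f"step {step}: loss {loss:.4f} -> perplexity {perplexity(loss):.2f}")
```

Perplexity is only comparable between runs that share a tokenizer and evaluation set, so these numbers are best read as a before/after comparison within this model card.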