End of training

Files changed:
- README.md (+54 −14)
- model.safetensors (+1 −1)

README.md
CHANGED
This model is a fine-tuned version of [deepset/bert-base-uncased-squad2](https://huggingface.co/deepset/bert-base-uncased-squad2) on the None dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3456
- Exact: 78.0068
- F1: 82.8115
- Total: 15041
- Hasans Exact: 78.0068
- Hasans F1: 82.8115
- Hasans Total: 15041
- Best Exact: 78.0068
- Best Exact Thresh: 0.0
- Best F1: 82.8115
- Best F1 Thresh: 0.0
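The Exact and F1 figures above follow the SQuAD v2 evaluation convention: answers are normalized (lowercased, punctuation and articles stripped) and compared by string equality and token overlap. A minimal pure-Python sketch of that per-answer scoring, with hypothetical helper names rather than the card's actual evaluation code:

```python
import re
import string
from collections import Counter

def normalize_answer(s: str) -> str:
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in set(string.punctuation))
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(pred: str, gold: str) -> int:
    """1 if the normalized strings are identical, else 0."""
    return int(normalize_answer(pred) == normalize_answer(gold))

def f1_score(pred: str, gold: str) -> float:
    """Token-level F1 between prediction and gold answer."""
    pred_toks = normalize_answer(pred).split()
    gold_toks = normalize_answer(gold).split()
    common = Counter(pred_toks) & Counter(gold_toks)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)
```

The corpus-level Exact and F1 reported in the card are averages of these per-answer scores (scaled to percentages); the "Best … Thresh" entries come from sweeping a no-answer threshold, which at 0.0 leaves the scores unchanged here.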
## Model description
### Training hyperparameters

The following hyperparameters were used during training:
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
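With gradient_accumulation_steps of 2 and a total train batch size of 32, each optimizer update averages gradients from two micro-batches (presumably 16 examples each on a single device; the card does not show the per-device batch size). A framework-free sketch of the accumulation pattern:

```python
def train_steps(micro_batch_grads, accumulation_steps=2):
    """Count optimizer updates for a stream of micro-batch gradients.

    Gradients are scaled by 1/accumulation_steps so each update
    averages over the accumulated micro-batches.
    """
    updates = 0
    grad_buffer = 0.0
    for i, grads in enumerate(micro_batch_grads, start=1):
        grad_buffer += grads / accumulation_steps
        if i % accumulation_steps == 0:
            updates += 1       # optimizer.step() would fire here
            grad_buffer = 0.0  # optimizer.zero_grad()
    return updates

# 4 micro-batches with accumulation of 2 -> 2 optimizer updates
```

The effect is a larger effective batch (32) than fits in one forward pass, at the cost of one update per two micro-batches.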
### Training results

| Training Loss | Epoch | Step | Validation Loss | Exact | F1 | Total | Hasans Exact | Hasans F1 | Hasans Total | Best Exact | Best Exact Thresh | Best F1 | Best F1 Thresh |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-----:|:------------:|:---------:|:------------:|:----------:|:-----------------:|:-------:|:--------------:|
| 0.68 | 0.0265 | 100 | 0.5825 | 75.1479 | 81.0196 | 15041 | 75.1479 | 81.0196 | 15041 | 75.1479 | 0.0 | 81.0196 | 0.0 |
| 0.6035 | 0.0530 | 200 | 0.5453 | 75.5601 | 81.4615 | 15041 | 75.5601 | 81.4615 | 15041 | 75.5601 | 0.0 | 81.4615 | 0.0 |
| 0.5286 | 0.0795 | 300 | 0.5200 | 75.8527 | 81.4575 | 15041 | 75.8527 | 81.4575 | 15041 | 75.8527 | 0.0 | 81.4575 | 0.0 |
| 0.5957 | 0.1060 | 400 | 0.4978 | 76.3978 | 81.9120 | 15041 | 76.3978 | 81.9120 | 15041 | 76.3978 | 0.0 | 81.9120 | 0.0 |
| 0.5173 | 0.1325 | 500 | 0.4919 | 75.9457 | 81.4362 | 15041 | 75.9457 | 81.4362 | 15041 | 75.9457 | 0.0 | 81.4362 | 0.0 |
| 0.4847 | 0.1589 | 600 | 0.4845 | 76.4244 | 81.9856 | 15041 | 76.4244 | 81.9856 | 15041 | 76.4244 | 0.0 | 81.9856 | 0.0 |
| 0.4764 | 0.1854 | 700 | 0.4717 | 76.5441 | 82.1418 | 15041 | 76.5441 | 82.1418 | 15041 | 76.5441 | 0.0 | 82.1418 | 0.0 |
| 0.4707 | 0.2119 | 800 | 0.4601 | 76.8632 | 82.2239 | 15041 | 76.8632 | 82.2239 | 15041 | 76.8632 | 0.0 | 82.2239 | 0.0 |
| 0.4725 | 0.2384 | 900 | 0.4577 | 76.9829 | 82.3913 | 15041 | 76.9829 | 82.3913 | 15041 | 76.9829 | 0.0 | 82.3913 | 0.0 |
| 0.4905 | 0.2649 | 1000 | 0.4448 | 77.1092 | 82.3780 | 15041 | 77.1092 | 82.3780 | 15041 | 77.1092 | 0.0 | 82.3780 | 0.0 |
| 0.5156 | 0.2914 | 1100 | 0.4387 | 77.0826 | 82.3692 | 15041 | 77.0826 | 82.3692 | 15041 | 77.0826 | 0.0 | 82.3692 | 0.0 |
| 0.4482 | 0.3179 | 1200 | 0.4300 | 77.0162 | 82.2991 | 15041 | 77.0162 | 82.2991 | 15041 | 77.0162 | 0.0 | 82.2991 | 0.0 |
| 0.4548 | 0.3444 | 1300 | 0.4350 | 76.9497 | 82.1747 | 15041 | 76.9497 | 82.1747 | 15041 | 76.9497 | 0.0 | 82.1747 | 0.0 |
| 0.4344 | 0.3709 | 1400 | 0.4237 | 77.1957 | 82.5521 | 15041 | 77.1957 | 82.5521 | 15041 | 77.1957 | 0.0 | 82.5521 | 0.0 |
| 0.4431 | 0.3974 | 1500 | 0.4237 | 77.1092 | 82.4167 | 15041 | 77.1092 | 82.4167 | 15041 | 77.1092 | 0.0 | 82.4167 | 0.0 |
| 0.4462 | 0.4238 | 1600 | 0.4120 | 77.2754 | 82.4544 | 15041 | 77.2754 | 82.4544 | 15041 | 77.2754 | 0.0 | 82.4544 | 0.0 |
| 0.4353 | 0.4503 | 1700 | 0.4079 | 77.6411 | 82.8295 | 15041 | 77.6411 | 82.8295 | 15041 | 77.6411 | 0.0 | 82.8295 | 0.0 |
| 0.4344 | 0.4768 | 1800 | 0.4006 | 77.5347 | 82.8334 | 15041 | 77.5347 | 82.8334 | 15041 | 77.5347 | 0.0 | 82.8334 | 0.0 |
| 0.3835 | 0.5033 | 1900 | 0.4012 | 77.8140 | 82.9213 | 15041 | 77.8140 | 82.9213 | 15041 | 77.8140 | 0.0 | 82.9213 | 0.0 |
| 0.4618 | 0.5298 | 2000 | 0.3891 | 77.6345 | 82.8467 | 15041 | 77.6345 | 82.8467 | 15041 | 77.6345 | 0.0 | 82.8467 | 0.0 |
| 0.4156 | 0.5563 | 2100 | 0.3844 | 77.5613 | 82.7081 | 15041 | 77.5613 | 82.7081 | 15041 | 77.5613 | 0.0 | 82.7081 | 0.0 |
| 0.4051 | 0.5828 | 2200 | 0.3852 | 77.8871 | 82.8588 | 15041 | 77.8871 | 82.8588 | 15041 | 77.8871 | 0.0 | 82.8588 | 0.0 |
| 0.4071 | 0.6093 | 2300 | 0.3833 | 77.6810 | 82.8300 | 15041 | 77.6810 | 82.8300 | 15041 | 77.6810 | 0.0 | 82.8300 | 0.0 |
| 0.3738 | 0.6358 | 2400 | 0.3814 | 77.8938 | 83.0841 | 15041 | 77.8938 | 83.0841 | 15041 | 77.8938 | 0.0 | 83.0841 | 0.0 |
| 0.4027 | 0.6623 | 2500 | 0.3717 | 77.6012 | 82.6103 | 15041 | 77.6012 | 82.6103 | 15041 | 77.6012 | 0.0 | 82.6103 | 0.0 |
| 0.4326 | 0.6887 | 2600 | 0.3652 | 77.9469 | 82.9494 | 15041 | 77.9469 | 82.9494 | 15041 | 77.9469 | 0.0 | 82.9494 | 0.0 |
| 0.3293 | 0.7152 | 2700 | 0.3660 | 78.1265 | 83.0404 | 15041 | 78.1265 | 83.0404 | 15041 | 78.1265 | 0.0 | 83.0404 | 0.0 |
| 0.4206 | 0.7417 | 2800 | 0.3569 | 77.6544 | 82.5986 | 15041 | 77.6544 | 82.5986 | 15041 | 77.6544 | 0.0 | 82.5986 | 0.0 |
| 0.3474 | 0.7682 | 2900 | 0.3634 | 77.8339 | 82.7735 | 15041 | 77.8339 | 82.7735 | 15041 | 77.8339 | 0.0 | 82.7735 | 0.0 |
| 0.3742 | 0.7947 | 3000 | 0.3526 | 78.3326 | 83.1619 | 15041 | 78.3326 | 83.1619 | 15041 | 78.3326 | 0.0 | 83.1619 | 0.0 |
| 0.3992 | 0.8212 | 3100 | 0.3491 | 77.9735 | 82.7812 | 15041 | 77.9735 | 82.7812 | 15041 | 77.9735 | 0.0 | 82.7812 | 0.0 |
| 0.4146 | 0.8477 | 3200 | 0.3492 | 78.3459 | 83.1638 | 15041 | 78.3459 | 83.1638 | 15041 | 78.3459 | 0.0 | 83.1638 | 0.0 |
| 0.3934 | 0.8742 | 3300 | 0.3444 | 77.8605 | 82.6116 | 15041 | 77.8605 | 82.6116 | 15041 | 77.8605 | 0.0 | 82.6116 | 0.0 |
| 0.3673 | 0.9007 | 3400 | 0.3465 | 77.9403 | 82.7155 | 15041 | 77.9403 | 82.7155 | 15041 | 77.9403 | 0.0 | 82.7155 | 0.0 |
| 0.4128 | 0.9272 | 3500 | 0.3406 | 77.8738 | 82.7600 | 15041 | 77.8738 | 82.7600 | 15041 | 77.8738 | 0.0 | 82.7600 | 0.0 |
| 0.3976 | 0.9536 | 3600 | 0.3368 | 78.0533 | 82.8822 | 15041 | 78.0533 | 82.8822 | 15041 | 78.0533 | 0.0 | 82.8822 | 0.0 |
| 0.4077 | 0.9801 | 3700 | 0.3392 | 77.7408 | 82.5339 | 15041 | 77.7408 | 82.5339 | 15041 | 77.7408 | 0.0 | 82.5339 | 0.0 |
| 0.3512 | 1.0066 | 3800 | 0.3395 | 78.1331 | 82.8586 | 15041 | 78.1331 | 82.8586 | 15041 | 78.1331 | 0.0 | 82.8586 | 0.0 |
| 0.2996 | 1.0331 | 3900 | 0.3442 | 77.7010 | 82.5547 | 15041 | 77.7010 | 82.5547 | 15041 | 77.7010 | 0.0 | 82.5547 | 0.0 |
| 0.2646 | 1.0596 | 4000 | 0.3471 | 78.0999 | 82.8657 | 15041 | 78.0999 | 82.8657 | 15041 | 78.0999 | 0.0 | 82.8657 | 0.0 |
| 0.2925 | 1.0861 | 4100 | 0.3470 | 77.9270 | 82.5864 | 15041 | 77.9270 | 82.5864 | 15041 | 77.9270 | 0.0 | 82.5864 | 0.0 |
| 0.2995 | 1.1126 | 4200 | 0.3456 | 78.0068 | 82.8115 | 15041 | 78.0068 | 82.8115 | 15041 | 78.0068 | 0.0 | 82.8115 | 0.0 |

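The evaluation steps above use lr_scheduler_type: linear, under which the learning rate decays linearly toward zero over the planned training steps, typically after a linear warmup. The card does not show the base learning rate or warmup length, so the numbers below are illustrative only; a framework-free sketch:

```python
def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linearly warm up to base_lr, then decay linearly to zero.

    Mirrors the usual shape of a 'linear' schedule; exact endpoints
    may differ from any particular trainer implementation.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# halfway through training with no warmup, the LR is half the base value
```

Note the table stops around epoch 1.1 of the configured 100 epochs, so only the early, nearly-flat part of such a schedule was traversed before training ended.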
### Framework versions

model.safetensors
CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:c50a565915ef163cb8b39237e23c92e89676d43d36f175d525ddda208a601a07
 size 435596088
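The model.safetensors entry is a Git LFS pointer: the repository stores only the version line, the sha256 of the weights, and their size, while the actual file lives in LFS storage. A stdlib-only sketch (hypothetical helper names) of parsing such a pointer and verifying a downloaded file against its digest:

```python
import hashlib

def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its key/value fields."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file in chunks so large weights need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:c50a565915ef163cb8b39237e23c92e89676d43d36f175d525ddda208a601a07\n"
    "size 435596088\n"
)
expected = pointer["oid"].split(":", 1)[1]
# sha256_of_file("model.safetensors") == expected would verify a download
```

Comparing the streamed digest to the pointer's oid confirms the downloaded weights match the commit shown here.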