makhataei committed on
Commit
1e383ab
1 Parent(s): 9bdf782

End of training

README.md CHANGED
@@ -1,6 +1,6 @@
  ---
  license: mit
- base_model: microsoft/mdeberta-v3-base
+ base_model: makhataei/qa-fa-mdeberta-v3-base
  tags:
  - generated_from_trainer
  datasets:
@@ -15,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->

  # qa-fa-mdeberta-v3-base

- This model is a fine-tuned version of [microsoft/mdeberta-v3-base](https://huggingface.co/microsoft/mdeberta-v3-base) on the pquad dataset.
+ This model is a fine-tuned version of [makhataei/qa-fa-mdeberta-v3-base](https://huggingface.co/makhataei/qa-fa-mdeberta-v3-base) on the pquad dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.8308
+ - Loss: 0.8913

  ## Model description

@@ -36,7 +36,7 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 0.0001
+ - learning_rate: 5e-05
  - train_batch_size: 5
  - eval_batch_size: 5
  - seed: 42
@@ -48,31 +48,31 @@ The following hyperparameters were used during training:

  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:-----:|:---------------:|
- | 1.962 | 0.04 | 500 | 1.3311 |
- | 1.4591 | 0.08 | 1000 | 1.3656 |
- | 1.3366 | 0.12 | 1500 | 1.3459 |
- | 1.2211 | 0.16 | 2000 | 1.2812 |
- | 1.1793 | 0.19 | 2500 | 1.1125 |
- | 1.1316 | 0.23 | 3000 | 1.1440 |
- | 1.0693 | 0.27 | 3500 | 1.1141 |
- | 1.0541 | 0.31 | 4000 | 1.2000 |
- | 1.0774 | 0.35 | 4500 | 1.1321 |
- | 1.0284 | 0.39 | 5000 | 0.9901 |
- | 0.9806 | 0.43 | 5500 | 0.9894 |
- | 0.9341 | 0.47 | 6000 | 0.9512 |
- | 0.9368 | 0.51 | 6500 | 0.9508 |
- | 0.9111 | 0.55 | 7000 | 1.0454 |
- | 0.8872 | 0.58 | 7500 | 0.9940 |
- | 0.8283 | 0.62 | 8000 | 0.8971 |
- | 0.9249 | 0.66 | 8500 | 0.8796 |
- | 0.8478 | 0.7 | 9000 | 0.9074 |
- | 0.8167 | 0.74 | 9500 | 0.8703 |
- | 0.8102 | 0.78 | 10000 | 0.8432 |
- | 0.7354 | 0.82 | 10500 | 0.8772 |
- | 0.7711 | 0.86 | 11000 | 0.8493 |
- | 0.7443 | 0.9 | 11500 | 0.8409 |
- | 0.7156 | 0.94 | 12000 | 0.8420 |
- | 0.7242 | 0.97 | 12500 | 0.8308 |
+ | 0.7194 | 0.04 | 500 | 0.9858 |
+ | 0.7041 | 0.08 | 1000 | 1.1216 |
+ | 0.717 | 0.12 | 1500 | 0.9554 |
+ | 0.6405 | 0.16 | 2000 | 1.0941 |
+ | 0.6519 | 0.19 | 2500 | 1.0454 |
+ | 0.6657 | 0.23 | 3000 | 1.0125 |
+ | 0.5935 | 0.27 | 3500 | 0.9870 |
+ | 0.6249 | 0.31 | 4000 | 0.9457 |
+ | 0.594 | 0.35 | 4500 | 1.0947 |
+ | 0.5691 | 0.39 | 5000 | 1.0590 |
+ | 0.5472 | 0.43 | 5500 | 1.0174 |
+ | 0.5326 | 0.47 | 6000 | 1.1249 |
+ | 0.5418 | 0.51 | 6500 | 0.9881 |
+ | 0.5449 | 0.55 | 7000 | 1.0413 |
+ | 0.544 | 0.58 | 7500 | 1.1565 |
+ | 0.4944 | 0.62 | 8000 | 1.1200 |
+ | 0.5673 | 0.66 | 8500 | 1.0735 |
+ | 0.5304 | 0.7 | 9000 | 1.0188 |
+ | 0.4968 | 0.74 | 9500 | 0.9884 |
+ | 0.5498 | 0.78 | 10000 | 0.9572 |
+ | 0.4904 | 0.82 | 10500 | 1.0120 |
+ | 0.5796 | 0.86 | 11000 | 0.9639 |
+ | 0.5656 | 0.9 | 11500 | 0.9420 |
+ | 0.6069 | 0.94 | 12000 | 0.9291 |
+ | 0.6555 | 0.97 | 12500 | 0.8913 |


  ### Framework versions
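The README.md diff above replaces one training-results table with another and changes the reported evaluation loss from 0.8308 to 0.8913. As a rough aid for reviewing that change, the sketch below parses rows of such a table and picks the step with the lowest validation loss. The function names `parse_results_table` and `best_checkpoint` are hypothetical helpers written for this illustration, and the table literal is only an excerpt of the new run's rows, not the full table.

```python
def parse_results_table(table: str) -> list[dict]:
    """Parse data rows of a Markdown results table; skip header and separator lines."""
    rows = []
    for line in table.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        try:
            rows.append({
                "train_loss": float(cells[0]),
                "epoch": float(cells[1]),
                "step": int(cells[2]),
                "val_loss": float(cells[3]),
            })
        except (ValueError, IndexError):
            continue  # header, separator, or malformed line
    return rows


def best_checkpoint(rows: list[dict]) -> dict:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row["val_loss"])


# Excerpt of the new run's table (values copied from the diff).
NEW_RUN_EXCERPT = """
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 0.7194 | 0.04 | 500 | 0.9858 |
| 0.717 | 0.12 | 1500 | 0.9554 |
| 0.4944 | 0.62 | 8000 | 1.1200 |
| 0.6069 | 0.94 | 12000 | 0.9291 |
| 0.6555 | 0.97 | 12500 | 0.8913 |
"""

# Final validation loss reported by the previous run (removed README line).
PREVIOUS_FINAL_VAL_LOSS = 0.8308

rows = parse_results_table(NEW_RUN_EXCERPT)
best = best_checkpoint(rows)
improved = best["val_loss"] < PREVIOUS_FINAL_VAL_LOSS
print(best["step"], best["val_loss"], improved)
```

On this excerpt, and on the full new table as well, the lowest validation loss is the final row (step 12500, 0.8913), which is still above the 0.8308 reported by the previous run, even though the training loss is lower throughout. That pattern is worth checking for overfitting before treating this checkpoint as an improvement.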
config.json CHANGED
@@ -1,5 +1,5 @@
  {
-   "_name_or_path": "microsoft/mdeberta-v3-base",
+   "_name_or_path": "makhataei/qa-fa-mdeberta-v3-base",
    "architectures": [
      "DebertaV2ForQuestionAnswering"
    ],
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:a48e75323351afb1452a476fed57d69030460efc0388d1ec3a71a5603fef2d48
+ oid sha256:5eb3c1efcb517fd473af2089195b7fd6432db3c0685f0b03889ef33c61041d53
  size 1112905680
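The model.safetensors change above is a Git LFS pointer update: only the `oid` line differs, while `size` stays the same. A minimal sketch of reading such pointers and comparing two versions follows. `parse_lfs_pointer` is a hypothetical helper for this illustration, not part of any Git LFS tooling, and it assumes the simple `key value` line layout shown in the diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the old and new sides of the diff.
OLD_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:a48e75323351afb1452a476fed57d69030460efc0388d1ec3a71a5603fef2d48
size 1112905680
"""
NEW_POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:5eb3c1efcb517fd473af2089195b7fd6432db3c0685f0b03889ef33c61041d53
size 1112905680
"""

old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)
same_size = old["size"] == new["size"]
same_content = old["digest"] == new["digest"]
print(same_size, same_content)
```

An identical byte size with a different digest is consistent with retrained weights stored in the same architecture and dtype, though the pointer alone cannot confirm that.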
runs/Dec04_09-28-12_Software-AI/events.out.tfevents.1701669492.Software-AI.2960647.1 ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:72716af66d7b272aa34e09334e5ea4deb7daaec62da7ba5230e9d3e8ec083043
+ size 15519
special_tokens_map.json CHANGED
@@ -1,10 +1,46 @@
  {
-   "bos_token": "[CLS]",
-   "cls_token": "[CLS]",
-   "eos_token": "[SEP]",
-   "mask_token": "[MASK]",
-   "pad_token": "[PAD]",
-   "sep_token": "[SEP]",
+   "bos_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "[CLS]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "[MASK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "[PAD]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "[SEP]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
    "unk_token": {
      "content": "[UNK]",
      "lstrip": false,
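The special_tokens_map.json hunk above rewrites six entries from plain strings into objects with `content`, `lstrip`, `normalized`, `rstrip`, and `single_word` fields. The sketch below checks that both forms name the same token strings. `token_content` is a hypothetical helper for this illustration and is not how any tokenizer library reads the file; the two JSON literals are abbreviated to two keys each.

```python
import json


def token_content(entry) -> str:
    """Return the token string for either the plain-string or the object form."""
    if isinstance(entry, str):
        return entry
    return entry["content"]


# Abbreviated old and new forms, using values from the diff.
OLD_MAP = json.loads('{"bos_token": "[CLS]", "sep_token": "[SEP]"}')
NEW_MAP = json.loads("""
{
  "bos_token": {"content": "[CLS]", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "sep_token": {"content": "[SEP]", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
""")

old_tokens = {name: token_content(entry) for name, entry in OLD_MAP.items()}
new_tokens = {name: token_content(entry) for name, entry in NEW_MAP.items()}
print(old_tokens == new_tokens)
```

Under this comparison the hunk changes only the serialization of the special tokens, not which strings they map to.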
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:3aca3ce69a0a35aeb144a52c4f1d41c4246b8785f8f398315cc8fb6b24057810
- size 16331396
+ oid sha256:6ac908bbb6af2dd36f1c7a0be71f06538859b806793f556154b12786c8452a87
+ size 16316246
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b66819b024f240ea53f002de22c83afa385bfcb7cff678f2f697b5cd7b092f10
+ oid sha256:4e3d4e231795267eb932a9186395ba9e98aad0250bd294f76ce9015d1ee053e9
  size 4155