Commit 59f75f0 by sameearif (verified)
1 parent: ce0855e

Model save

README.md CHANGED
@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4315
-- Accuracy: 0.8725
-- F1: 0.8724
+- Loss: 0.2608
+- Accuracy: 0.8813
+- F1: 0.8805
 
 ## Model description
 
@@ -40,10 +40,12 @@ More information needed
 ### Training hyperparameters
 
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
+- learning_rate: 1e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 10
@@ -52,16 +54,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
-| 0.4204 | 1.0 | 455 | 0.4447 | 0.7978 | 0.7908 |
-| 0.2772 | 2.0 | 910 | 0.3623 | 0.8484 | 0.8484 |
-| 0.3451 | 3.0 | 1365 | 0.4462 | 0.8593 | 0.8587 |
-| 0.2596 | 4.0 | 1820 | 0.4315 | 0.8725 | 0.8724 |
-| 0.1125 | 5.0 | 2275 | 0.6506 | 0.8593 | 0.8587 |
-| 0.1344 | 6.0 | 2730 | 0.6835 | 0.8549 | 0.8541 |
-| 0.108 | 7.0 | 3185 | 0.7018 | 0.8659 | 0.8656 |
-| 0.0229 | 8.0 | 3640 | 0.8865 | 0.8681 | 0.8680 |
-| 0.0459 | 9.0 | 4095 | 0.9492 | 0.8571 | 0.8570 |
-| 0.0043 | 10.0 | 4550 | 0.9753 | 0.8681 | 0.8679 |
+| 0.4418 | 1.0 | 228 | 0.3463 | 0.8396 | 0.8397 |
+| 0.3375 | 2.0 | 456 | 0.2615 | 0.8703 | 0.8705 |
+| 0.2706 | 3.0 | 684 | 0.2608 | 0.8813 | 0.8805 |
+| 0.2298 | 4.0 | 912 | 0.3437 | 0.8791 | 0.8780 |
+| 0.1609 | 5.0 | 1140 | 0.6636 | 0.8132 | 0.8050 |
+| 0.1665 | 6.0 | 1368 | 0.5089 | 0.8791 | 0.8791 |
+| 0.099 | 7.0 | 1596 | 0.6432 | 0.8813 | 0.8804 |
+| 0.075 | 8.0 | 1824 | 0.7101 | 0.8747 | 0.8741 |
+| 0.044 | 9.0 | 2052 | 0.7694 | 0.8681 | 0.8673 |
+| 0.0478 | 10.0 | 2280 | 0.8504 | 0.8593 | 0.8573 |
 
 
 ### Framework versions
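
The numbers in the README diff can be cross-checked against each other. The following is a minimal Python sketch, not the training script. It assumes a single device (so the effective batch size is 8 × 2), assumes that a trailing partial accumulation group still counts as one optimizer step, and assumes that the headline evaluation metrics come from the epoch with the highest F1, which the README does not state. All variable names are illustrative.

```python
import math

# Values copied from the README diff.
per_device_batch_size = 8
gradient_accumulation_steps = 2
old_steps_per_epoch = 455  # optimizer steps per epoch before this commit
num_epochs = 10

# Effective batch size per optimizer step (assumption: one device).
total_train_batch_size = per_device_batch_size * gradient_accumulation_steps

# Each optimizer step now consumes two micro-batches; a leftover
# micro-batch at the end of an epoch is assumed to form its own step.
new_steps_per_epoch = math.ceil(old_steps_per_epoch / gradient_accumulation_steps)
total_steps = new_steps_per_epoch * num_epochs

# (epoch, validation loss, accuracy, F1) rows from the new results table.
new_rows = [
    (1, 0.3463, 0.8396, 0.8397),
    (2, 0.2615, 0.8703, 0.8705),
    (3, 0.2608, 0.8813, 0.8805),
    (4, 0.3437, 0.8791, 0.8780),
    (5, 0.6636, 0.8132, 0.8050),
    (6, 0.5089, 0.8791, 0.8791),
    (7, 0.6432, 0.8813, 0.8804),
    (8, 0.7101, 0.8747, 0.8741),
    (9, 0.7694, 0.8681, 0.8673),
    (10, 0.8504, 0.8593, 0.8573),
]

# Assumption: the best checkpoint is the epoch with the highest F1.
best_row = max(new_rows, key=lambda row: row[3])

print(total_train_batch_size)  # 16
print(new_steps_per_epoch)     # 228
print(total_steps)             # 2280
print(best_row)                # (3, 0.2608, 0.8813, 0.8805)
```

Under these assumptions the Step column (228 per epoch, 2280 in total) and the headline metrics (epoch 3) are consistent with the table. Epoch 3 also has the lowest validation loss, so selecting by loss would give the same row for this run.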
deberta-v3-large_best/model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9a60ee9b8ceeaf8b630b39b8018bd175e98bdd47a690aa27a615cb37383505cc
+oid sha256:4f5a51140e9ad305ae618b22dc56824f8c6d10309013766021d29ec3bf2b8f65
 size 1740304440
deberta-v3-large_best/tokenizer.json CHANGED
@@ -2,13 +2,13 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 512,
+    "max_length": 1024,
     "strategy": "LongestFirst",
     "stride": 0
   },
   "padding": {
     "strategy": {
-      "Fixed": 512
+      "Fixed": 1024
     },
     "direction": "Right",
     "pad_to_multiple_of": null,
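
The tokenizer change raises both the truncation limit and the fixed padding length from 512 to 1024, so every encoded sequence is truncated or padded to exactly 1024 tokens. As a hedged sketch, the snippet below reads those two fields from a JSON fragment that mirrors only the keys visible in this diff (a real `tokenizer.json` contains many more). The helper name `fixed_lengths` is hypothetical.

```python
import json

# Fragment mirroring the fields shown in the diff (new side only).
fragment = json.loads("""
{
  "version": "1.0",
  "truncation": {
    "direction": "Right",
    "max_length": 1024,
    "strategy": "LongestFirst",
    "stride": 0
  },
  "padding": {
    "strategy": {"Fixed": 1024},
    "direction": "Right",
    "pad_to_multiple_of": null
  }
}
""")


def fixed_lengths(config):
    """Return (truncation max_length, fixed padding length or None)."""
    max_length = config["truncation"]["max_length"]
    strategy = config["padding"]["strategy"]
    # A fixed strategy is stored as {"Fixed": n}; anything else yields None.
    pad_length = strategy.get("Fixed") if isinstance(strategy, dict) else None
    return max_length, pad_length


max_length, pad_length = fixed_lengths(fragment)
print(max_length, pad_length)    # 1024 1024
print(max_length == pad_length)  # True
```

Keeping the two values equal means no sequence can exceed the padded length, which is what the diff shows on both the old (512) and new (1024) sides.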
deberta-v3-large_best/training_args.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:41da7a6569c6587f7ee050f8b6c35b6ce57fe847d360f4e974eb9bfaee9440b3
+oid sha256:34f303bf3d5d63c9682b4fb143369657b7c642bf36e8bb56121ff005202ca0a6
 size 5432
model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1df11b411d740d57a1563dd5906903771641b78d0b3f0669d077481dd3d07a28
+oid sha256:4f5a51140e9ad305ae618b22dc56824f8c6d10309013766021d29ec3bf2b8f65
 size 1740304440
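
The `.safetensors` and `.bin` entries in this diff are Git LFS pointer files (three `key value` lines), not the binaries themselves. As an illustrative sketch, the snippet below parses the two new `model.safetensors` pointers and compares them. `parse_lfs_pointer` is a hypothetical helper written for this example, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New-side pointers copied from the diff.
BEST_DIR_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4f5a51140e9ad305ae618b22dc56824f8c6d10309013766021d29ec3bf2b8f65
size 1740304440
"""
ROOT_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4f5a51140e9ad305ae618b22dc56824f8c6d10309013766021d29ec3bf2b8f65
size 1740304440
"""

# Old-side digests, for comparison.
OLD_BEST_DIGEST = "9a60ee9b8ceeaf8b630b39b8018bd175e98bdd47a690aa27a615cb37383505cc"
OLD_ROOT_DIGEST = "1df11b411d740d57a1563dd5906903771641b78d0b3f0669d077481dd3d07a28"

best = parse_lfs_pointer(BEST_DIR_POINTER)
root = parse_lfs_pointer(ROOT_POINTER)

same_after = best["digest"] == root["digest"] and best["size"] == root["size"]
same_before = OLD_BEST_DIGEST == OLD_ROOT_DIGEST

print(same_after)   # True
print(same_before)  # False
```

After this commit, `deberta-v3-large_best/model.safetensors` and the top-level `model.safetensors` reference the same LFS object, whereas before the commit their digests differed.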