Model save

Files changed (5)

README.md +16 -14
deberta-v3-large_best/model.safetensors +1 -1
deberta-v3-large_best/tokenizer.json +2 -2
deberta-v3-large_best/training_args.bin +1 -1
model.safetensors +1 -1

README.md CHANGED

@@ -19,9 +19,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.4315
-- Accuracy: 0.8725
-- F1: 0.8724
 ## Model description
@@ -40,10 +40,12 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 2e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 10
@@ -52,16 +54,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
-| 0.4204        | 1.0   | 455  | 0.4447          | 0.7978   | 0.7908 |
-| 0.2772        | 2.0   | 910  | 0.3623          | 0.8484   | 0.8484 |
-| 0.3451        | 3.0   | 1365 | 0.4462          | 0.8593   | 0.8587 |
-| 0.2596        | 4.0   | 1820 | 0.4315          | 0.8725   | 0.8724 |
-| 0.1125        | 5.0   | 2275 | 0.6506          | 0.8593   | 0.8587 |
-| 0.1344        | 6.0   | 2730 | 0.6835          | 0.8549   | 0.8541 |
-| 0.108         | 7.0   | 3185 | 0.7018          | 0.8659   | 0.8656 |
-| 0.0229        | 8.0   | 3640 | 0.8865          | 0.8681   | 0.8680 |
-| 0.0459        | 9.0   | 4095 | 0.9492          | 0.8571   | 0.8570 |
-| 0.0043        | 10.0  | 4550 | 0.9753          | 0.8681   | 0.8679 |
 ### Framework versions

 This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggingface.co/microsoft/deberta-v3-large) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.2608
+- Accuracy: 0.8813
+- F1: 0.8805
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 1e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
 - optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: linear
 - num_epochs: 10
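
The added `gradient_accumulation_steps` and `total_train_batch_size` entries, together with the step counts in the training results table, are consistent with simple batch arithmetic. The following is a minimal sketch, assuming a single training device and assuming that optimizer steps per epoch equal the number of micro-batches divided by the accumulation factor, rounded up. The helper names are illustrative and not part of any library.

```python
import math


def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Examples contributing to one optimizer update (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def optimizer_steps_per_epoch(micro_batches: int, accumulation_steps: int) -> int:
    """Optimizer updates per epoch, assuming a trailing partial group still triggers an update."""
    return math.ceil(micro_batches / accumulation_steps)


# Values taken from the diff: the previous run logged 455 steps per epoch at batch size 8.
old_steps_per_epoch = 455
new_total_batch = effective_batch_size(per_device=8, accumulation_steps=2)
new_steps_per_epoch = optimizer_steps_per_epoch(old_steps_per_epoch, accumulation_steps=2)

print(new_total_batch)           # 16, matches total_train_batch_size
print(new_steps_per_epoch)       # 228, matches the step count at epoch 1.0
print(new_steps_per_epoch * 10)  # 2280, matches the step count at epoch 10.0
```

Under these assumptions, halving the number of optimizer updates while also halving the learning rate (2e-05 to 1e-05) means the new run takes fewer and smaller updates per epoch than the previous one.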
 | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     |
 |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|
+| 0.4418        | 1.0   | 228  | 0.3463          | 0.8396   | 0.8397 |
+| 0.3375        | 2.0   | 456  | 0.2615          | 0.8703   | 0.8705 |
+| 0.2706        | 3.0   | 684  | 0.2608          | 0.8813   | 0.8805 |
+| 0.2298        | 4.0   | 912  | 0.3437          | 0.8791   | 0.8780 |
+| 0.1609        | 5.0   | 1140 | 0.6636          | 0.8132   | 0.8050 |
+| 0.1665        | 6.0   | 1368 | 0.5089          | 0.8791   | 0.8791 |
+| 0.099         | 7.0   | 1596 | 0.6432          | 0.8813   | 0.8804 |
+| 0.075         | 8.0   | 1824 | 0.7101          | 0.8747   | 0.8741 |
+| 0.044         | 9.0   | 2052 | 0.7694          | 0.8681   | 0.8673 |
+| 0.0478        | 10.0  | 2280 | 0.8504          | 0.8593   | 0.8573 |
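
The headline metrics in the updated README (Loss 0.2608, Accuracy 0.8813, F1 0.8805) match the epoch 3 row of the new results table. The diff does not show which metric was used to pick the saved checkpoint, so the sketch below assumes selection by highest F1 and only checks that this assumption reproduces the reported numbers.

```python
# Rows copied from the new results table: (epoch, validation_loss, accuracy, f1).
rows = [
    (1, 0.3463, 0.8396, 0.8397),
    (2, 0.2615, 0.8703, 0.8705),
    (3, 0.2608, 0.8813, 0.8805),
    (4, 0.3437, 0.8791, 0.8780),
    (5, 0.6636, 0.8132, 0.8050),
    (6, 0.5089, 0.8791, 0.8791),
    (7, 0.6432, 0.8813, 0.8804),
    (8, 0.7101, 0.8747, 0.8741),
    (9, 0.7694, 0.8681, 0.8673),
    (10, 0.8504, 0.8593, 0.8573),
]

# Assumption: the best checkpoint is the one with the highest F1.
best_by_f1 = max(rows, key=lambda row: row[3])
# Cross-check against the row with the lowest validation loss.
best_by_loss = min(rows, key=lambda row: row[1])

print(best_by_f1)    # (3, 0.2608, 0.8813, 0.8805)
print(best_by_loss)  # (3, 0.2608, 0.8813, 0.8805)
```

In this table both criteria select epoch 3. Validation loss rises from epoch 4 onward while training loss keeps falling, which is the usual signature of overfitting in later epochs.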
 ### Framework versions

deberta-v3-large_best/model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9a60ee9b8ceeaf8b630b39b8018bd175e98bdd47a690aa27a615cb37383505cc
 size 1740304440

 version https://git-lfs.github.com/spec/v1
+oid sha256:4f5a51140e9ad305ae618b22dc56824f8c6d10309013766021d29ec3bf2b8f65
 size 1740304440

deberta-v3-large_best/tokenizer.json CHANGED

@@ -2,13 +2,13 @@
   "version": "1.0",
   "truncation": {
     "direction": "Right",
-    "max_length": 512,
     "strategy": "LongestFirst",
     "stride": 0
   },
   "padding": {
     "strategy": {
-      "Fixed": 512
     },
     "direction": "Right",
     "pad_to_multiple_of": null,

   "version": "1.0",
   "truncation": {
     "direction": "Right",
+    "max_length": 1024,
     "strategy": "LongestFirst",
     "stride": 0
   },
   "padding": {
     "strategy": {
+      "Fixed": 1024
     },
     "direction": "Right",
     "pad_to_multiple_of": null,
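
The change above raises both the truncation limit and the fixed padding length from 512 to 1024. One way to confirm that the two settings stay in sync is to read them back from the JSON. The sketch below uses only the standard library and an inline fragment that reproduces just the fields visible in this diff; a real `tokenizer.json` contains many more keys, and the function name is illustrative.

```python
import json

# Fragment limited to the fields shown in the diff (new side).
fragment = """
{
  "version": "1.0",
  "truncation": {
    "direction": "Right",
    "max_length": 1024,
    "strategy": "LongestFirst",
    "stride": 0
  },
  "padding": {
    "strategy": {"Fixed": 1024},
    "direction": "Right",
    "pad_to_multiple_of": null
  }
}
"""


def sequence_limits(config: dict) -> tuple:
    """Return (truncation max_length, fixed padding length); None where a setting is absent."""
    truncation = config.get("truncation") or {}
    padding = config.get("padding") or {}
    strategy = padding.get("strategy")
    # The padding strategy is an object here; guard in case it is stored as a plain string.
    fixed = strategy.get("Fixed") if isinstance(strategy, dict) else None
    return truncation.get("max_length"), fixed


max_length, fixed_pad = sequence_limits(json.loads(fragment))
print(max_length, fixed_pad)  # 1024 1024
```

With fixed padding, every encoded input is padded to the full 1024 tokens, so memory and compute per example grow accordingly compared with the previous 512 setting.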

deberta-v3-large_best/training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:41da7a6569c6587f7ee050f8b6c35b6ce57fe847d360f4e974eb9bfaee9440b3
 size 5432

 version https://git-lfs.github.com/spec/v1
+oid sha256:34f303bf3d5d63c9682b4fb143369657b7c642bf36e8bb56121ff005202ca0a6
 size 5432

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1df11b411d740d57a1563dd5906903771641b78d0b3f0669d077481dd3d07a28
 size 1740304440

 version https://git-lfs.github.com/spec/v1
+oid sha256:4f5a51140e9ad305ae618b22dc56824f8c6d10309013766021d29ec3bf2b8f65
 size 1740304440
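
The two `model.safetensors` entries in this commit are Git LFS pointer files, not the weights themselves. Their new `oid` values are identical, which indicates that the root file and the copy under `deberta-v3-large_best/` now reference the same content, whereas the old pointers differed. A minimal sketch of that comparison, assuming the three-line pointer layout shown in the diff (the parser is illustrative, not part of Git LFS):

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file of 'key value' lines into a small dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


# New-side pointers copied from the diff.
best_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:4f5a51140e9ad305ae618b22dc56824f8c6d10309013766021d29ec3bf2b8f65
size 1740304440
""")
root_pointer = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:4f5a51140e9ad305ae618b22dc56824f8c6d10309013766021d29ec3bf2b8f65
size 1740304440
""")

same_content = best_pointer["digest"] == root_pointer["digest"]
print(same_content)          # True
print(best_pointer["size"])  # 1740304440
```

Because the size is unchanged at 1740304440 bytes while the hash changed, the commit replaces the weights with a different set of values for the same architecture.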