model update

Files changed (8)

README.md +26 -26
config.json +1 -1
eval/metric.json +1 -1
eval/metric_span.json +1 -1
eval/prediction.validation.json +0 -0
pytorch_model.bin +2 -2
tokenizer_config.json +1 -1
trainer_config.json +1 -1

README.md CHANGED

@@ -18,31 +18,31 @@ model-index:
     metrics:
     - name: F1
       type: f1
-      value: 0.6430868167202574
     - name: Precision
       type: precision
-      value: 0.6578947368421053
     - name: Recall
       type: recall
-      value: 0.6289308176100629
     - name: F1 (macro)
       type: f1_macro
-      value: 0.37234464254803534
     - name: Precision (macro)
       type: precision_macro
-      value: 0.3758815642868512
     - name: Recall (macro)
       type: recall_macro
-      value: 0.3836106023606024
     - name: F1 (entity span)
       type: f1_entity_span
-      value: 0.6883116883116883
     - name: Precision (entity span)
       type: precision_entity_span
-      value: 0.7043189368770764
     - name: Recall (entity span)
       type: recall_entity_span
-      value: 0.6730158730158731
 pipeline_tag: token-classification
 widget:
@@ -55,26 +55,26 @@ This model is a fine-tuned version of [microsoft/deberta-v3-large](https://huggi
 [tner/fin](https://huggingface.co/datasets/tner/fin) dataset.
 Model fine-tuning is done via [T-NER](https://github.com/asahi417/tner)'s hyper-parameter search (see the repository
 for more detail). It achieves the following results on the test set:
-- F1 (micro): 0.6430868167202574
-- Precision (micro): 0.6578947368421053
-- Recall (micro): 0.6289308176100629
-- F1 (macro): 0.37234464254803534
-- Precision (macro): 0.3758815642868512
-- Recall (macro): 0.3836106023606024
 The per-entity breakdown of the F1 score on the test set are below:
-- LOC: nan
-- MISC: nan
-- ORG: nan
-- PER: nan
 For F1 scores, the confidence interval is obtained by bootstrap as below:
 - F1 (micro):
-    - 90%: [0.5722111059165758, 0.7112704135498799]
-    - 95%: [0.557944362785127, 0.725353903079494]
 - F1 (macro):
-    - 90%: [0.5722111059165758, 0.7112704135498799]
-    - 95%: [0.557944362785127, 0.725353903079494]
 Full evaluation can be found at [metric file of NER](https://huggingface.co/tner/deberta-v3-large-fin/raw/main/eval/metric.json)
 and [metric file of entity span](https://huggingface.co/tner/deberta-v3-large-fin/raw/main/eval/metric_span.json).
@@ -100,14 +100,14 @@ The following hyperparameters were used during training:
  - dataset_name: None
  - local_dataset: None
  - model: microsoft/deberta-v3-large
- - crf: False
  - max_length: 128
- - epoch: 17
  - batch_size: 16
  - lr: 1e-05
  - random_seed: 42
  - gradient_accumulation_steps: 4
- - weight_decay: 1e-07
  - lr_warmup_step_ratio: 0.1
  - max_grad_norm: 10.0

     metrics:
     - name: F1
       type: f1
+      value: 0.7060755336617406
     - name: Precision
       type: precision
+      value: 0.738831615120275
     - name: Recall
       type: recall
+      value: 0.6761006289308176
     - name: F1 (macro)
       type: f1_macro
+      value: 0.45092058848834204
     - name: Precision (macro)
       type: precision_macro
+      value: 0.45426465258085835
     - name: Recall (macro)
       type: recall_macro
+      value: 0.45582773707773705
     - name: F1 (entity span)
       type: f1_entity_span
+      value: 0.7293729372937293
     - name: Precision (entity span)
       type: precision_entity_span
+      value: 0.7594501718213058
     - name: Recall (entity span)
       type: recall_entity_span
+      value: 0.7015873015873015
 pipeline_tag: token-classification
 widget:
 [tner/fin](https://huggingface.co/datasets/tner/fin) dataset.
 Model fine-tuning is done via [T-NER](https://github.com/asahi417/tner)'s hyper-parameter search (see the repository
 for more detail). It achieves the following results on the test set:
+- F1 (micro): 0.7060755336617406
+- Precision (micro): 0.738831615120275
+- Recall (micro): 0.6761006289308176
+- F1 (macro): 0.45092058848834204
+- Precision (macro): 0.45426465258085835
+- Recall (macro): 0.45582773707773705
 The per-entity breakdown of the F1 score on the test set are below:
+- location: 0.4000000000000001
+- organization: 0.5762711864406779
+- other: 0.0
+- person: 0.8274111675126904
 For F1 scores, the confidence interval is obtained by bootstrap as below:
 - F1 (micro):
+    - 90%: [0.6370316240330781, 0.7718233002182738]
+    - 95%: [0.6236274300363168, 0.7857205513784461]
 - F1 (macro):
+    - 90%: [0.6370316240330781, 0.7718233002182738]
+    - 95%: [0.6236274300363168, 0.7857205513784461]
 Full evaluation can be found at [metric file of NER](https://huggingface.co/tner/deberta-v3-large-fin/raw/main/eval/metric.json)
 and [metric file of entity span](https://huggingface.co/tner/deberta-v3-large-fin/raw/main/eval/metric_span.json).
  - dataset_name: None
  - local_dataset: None
  - model: microsoft/deberta-v3-large
+ - crf: True
  - max_length: 128
+ - epoch: 15
  - batch_size: 16
  - lr: 1e-05
  - random_seed: 42
  - gradient_accumulation_steps: 4
+ - weight_decay: None
  - lr_warmup_step_ratio: 0.1
  - max_grad_norm: 10.0
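The README says the F1 confidence intervals are "obtained by bootstrap" but does not show the procedure. The following is a minimal sketch of a percentile bootstrap over sentences, not T-NER's actual implementation. The function names, the number of resamples, and the toy per-sentence counts are all assumptions made for illustration.

```python
import random

def micro_f1(counts):
    """Micro F1 from per-sentence (true_pos, false_pos, false_neg) counts."""
    tp = sum(c[0] for c in counts)
    fp = sum(c[1] for c in counts)
    fn = sum(c[2] for c in counts)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def bootstrap_ci(counts, n_resamples=1000, seed=42):
    """Percentile bootstrap: resample sentences with replacement, rescore, take quantiles."""
    rng = random.Random(seed)
    scores = sorted(
        micro_f1([rng.choice(counts) for _ in counts])
        for _ in range(n_resamples)
    )

    def pct(q):
        # Nearest-rank style quantile on the sorted bootstrap scores.
        return scores[min(len(scores) - 1, int(q * len(scores)))]

    return {"90": [pct(0.05), pct(0.95)], "95": [pct(0.025), pct(0.975)]}

# Hypothetical per-sentence counts; these are NOT taken from the tner/fin test set.
toy_counts = [(2, 0, 1), (1, 1, 0), (0, 1, 1), (3, 0, 0), (1, 0, 1)] * 20
ci = bootstrap_ci(toy_counts)
print(ci)
```

The returned dictionary mirrors the `"90"` / `"95"` layout of the `f1_ci` entries in eval/metric.json.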

config.json CHANGED

@@ -1,5 +1,5 @@
 {
-  "_name_or_path": "tner_ckpt/fin_deberta_v3_large/best_model",
   "architectures": [
     "DebertaV2ForTokenClassification"
   ],

 {
+  "_name_or_path": "tner_ckpt/fin_deberta_v3_large/model_rcsnba/epoch_5",
   "architectures": [
     "DebertaV2ForTokenClassification"
   ],

eval/metric.json CHANGED

@@ -1 +1 @@

 - {"micro/f1": 0.6430868167202574, "micro/f1_ci": {"90": [0.5722111059165758, 0.7112704135498799], "95": [0.557944362785127, 0.725353903079494]}, "micro/recall": 0.6289308176100629, "micro/precision": 0.6578947368421053, "macro/f1": 0.37234464254803534, "macro/f1_ci": {"90": [0.321037444212583, 0.4174222520031422], "95": [0.3126661472561014, 0.4276527473028317]}, "macro/recall": 0.3836106023606024, "macro/precision": 0.3758815642868512, "per_entity_metric": {"LOC": {"f1": NaN, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}, "MISC": {"f1": NaN, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}, "ORG": {"f1": NaN, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}, "PER": {"f1": NaN, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}}}

+ {"micro/f1": 0.7060755336617406, "micro/f1_ci": {"90": [0.6370316240330781, 0.7718233002182738], "95": [0.6236274300363168, 0.7857205513784461]}, "micro/recall": 0.6761006289308176, "micro/precision": 0.738831615120275, "macro/f1": 0.45092058848834204, "macro/f1_ci": {"90": [0.39899778804703784, 0.5011709891949974], "95": [0.3874931369771246, 0.5136520300021123]}, "macro/recall": 0.45582773707773705, "macro/precision": 0.45426465258085835, "per_entity_metric": {"location": {"f1": 0.4000000000000001, "f1_ci": {"90": [0.2857142857142857, 0.5091682785299806], "95": [0.2608695652173913, 0.5263157894736842]}, "precision": 0.35294117647058826, "recall": 0.46153846153846156}, "organization": {"f1": 0.5762711864406779, "f1_ci": {"90": [0.43634996582365004, 0.7079700983894904], "95": [0.4077472341386317, 0.7342135894078278]}, "precision": 0.5483870967741935, "recall": 0.6071428571428571}, "other": {"f1": 0.0, "f1_ci": {"90": [NaN, NaN], "95": [NaN, NaN]}, "precision": 0.0, "recall": 0.0}, "person": {"f1": 0.8274111675126904, "f1_ci": {"90": [0.7651849599675686, 0.8840794949060123], "95": [0.7459896055540471, 0.8967844202898553]}, "precision": 0.9157303370786517, "recall": 0.7546296296296297}}}
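As a sanity check on the new eval/metric.json, the headline scores can be recomputed from the other reported values. The sketch below assumes the usual definitions: micro F1 is the harmonic mean of micro precision and recall, and macro F1 is the unweighted mean of the per-entity F1 scores. The variable names are illustrative; the numbers are copied from the added line above.

```python
# Values copied from the new eval/metric.json.
precision = 0.738831615120275
recall = 0.6761006289308176
reported_micro_f1 = 0.7060755336617406
reported_macro_f1 = 0.45092058848834204

per_entity_f1 = {
    "location": 0.4000000000000001,
    "organization": 0.5762711864406779,
    "other": 0.0,
    "person": 0.8274111675126904,
}

# Micro F1 as the harmonic mean of micro precision and recall.
micro_f1 = 2 * precision * recall / (precision + recall)

# Macro F1 as the unweighted mean over entity types.
macro_f1 = sum(per_entity_f1.values()) / len(per_entity_f1)

print(abs(micro_f1 - reported_micro_f1) < 1e-9)
print(abs(macro_f1 - reported_macro_f1) < 1e-9)
```

Both recomputed values agree with the reported ones. The macro average is pulled down by the `other` type, whose F1 is 0.0. Note also that the README lists the same interval for micro and macro F1, whereas eval/metric.json reports a separate `macro/f1_ci`.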

eval/metric_span.json CHANGED

@@ -1 +1 @@

 - {"micro/f1": 0.6883116883116883, "micro/f1_ci": {"90": [0.6137984272716044, 0.757765305655086], "95": [0.604156373368873, 0.7718631178707224]}, "micro/recall": 0.6730158730158731, "micro/precision": 0.7043189368770764, "macro/f1": 0.6883116883116883, "macro/f1_ci": {"90": [0.6137984272716044, 0.757765305655086], "95": [0.604156373368873, 0.7718631178707224]}, "macro/recall": 0.6730158730158731, "macro/precision": 0.7043189368770764}

+ {"micro/f1": 0.7293729372937293, "micro/f1_ci": {"90": [0.6546727092010601, 0.7960558252427186], "95": [0.6427420490321417, 0.8090595359078592]}, "micro/recall": 0.7015873015873015, "micro/precision": 0.7594501718213058, "macro/f1": 0.7293729372937293, "macro/f1_ci": {"90": [0.6546727092010601, 0.7960558252427186], "95": [0.6427420490321417, 0.8090595359078592]}, "macro/recall": 0.7015873015873015, "macro/precision": 0.7594501718213058}

eval/prediction.validation.json CHANGED

The diff for this file is too large to render. See raw diff

pytorch_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f45adf48b9766bb5a576a605b96bb0325487f0e5ad3848967fc00fd616c9e8c1
-size 1736217519

 version https://git-lfs.github.com/spec/v1
+oid sha256:bad3729608b27d27e70df820e6cc552dbe034d5ed064cbe4ac5c1f6e5a008727
+size 1736223023
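The pytorch_model.bin diff only touches the Git LFS pointer, so the checkpoint change is visible just as a new hash and size. A small sketch for reading the two pointers and comparing sizes follows; `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }

old = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:f45adf48b9766bb5a576a605b96bb0325487f0e5ad3848967fc00fd616c9e8c1
size 1736217519
""")
new = parse_lfs_pointer("""version https://git-lfs.github.com/spec/v1
oid sha256:bad3729608b27d27e70df820e6cc552dbe034d5ed064cbe4ac5c1f6e5a008727
size 1736223023
""")

size_delta = new["size"] - old["size"]
print(size_delta)  # bytes added by the new checkpoint
```

The new checkpoint is 5504 bytes larger than the old one.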

tokenizer_config.json CHANGED

@@ -4,7 +4,7 @@
   "do_lower_case": false,
   "eos_token": "[SEP]",
   "mask_token": "[MASK]",
-  "name_or_path": "tner_ckpt/fin_deberta_v3_large/best_model",
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "sp_model_kwargs": {},

   "do_lower_case": false,
   "eos_token": "[SEP]",
   "mask_token": "[MASK]",
+  "name_or_path": "tner_ckpt/fin_deberta_v3_large/model_rcsnba/epoch_5",
   "pad_token": "[PAD]",
   "sep_token": "[SEP]",
   "sp_model_kwargs": {},

trainer_config.json CHANGED

@@ -1 +1 @@

 - {"dataset": ["tner/fin"], "dataset_split": "train", "dataset_name": null, "local_dataset": null, "model": "microsoft/deberta-v3-large", "crf": false, "max_length": 128, "epoch": 17, "batch_size": 16, "lr": 1e-05, "random_seed": 42, "gradient_accumulation_steps": 4, "weight_decay": 1e-07, "lr_warmup_step_ratio": 0.1, "max_grad_norm": 10.0}

 + {"dataset": ["tner/fin"], "dataset_split": "train", "dataset_name": null, "local_dataset": null, "model": "microsoft/deberta-v3-large", "crf": true, "max_length": 128, "epoch": 15, "batch_size": 16, "lr": 1e-05, "random_seed": 42, "gradient_accumulation_steps": 4, "weight_decay": null, "lr_warmup_step_ratio": 0.1, "max_grad_norm": 10.0}
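To see at a glance which hyperparameters changed, the two trainer_config.json lines can be compared key by key. The sketch below copies both JSON objects from the diff above. The "effective batch size" at the end assumes the common convention that one optimizer step accumulates gradients over `gradient_accumulation_steps` batches; that interpretation is an assumption, not something stated in the diff.

```python
import json

old_cfg = json.loads(
    '{"dataset": ["tner/fin"], "dataset_split": "train", "dataset_name": null, '
    '"local_dataset": null, "model": "microsoft/deberta-v3-large", "crf": false, '
    '"max_length": 128, "epoch": 17, "batch_size": 16, "lr": 1e-05, "random_seed": 42, '
    '"gradient_accumulation_steps": 4, "weight_decay": 1e-07, '
    '"lr_warmup_step_ratio": 0.1, "max_grad_norm": 10.0}'
)
new_cfg = json.loads(
    '{"dataset": ["tner/fin"], "dataset_split": "train", "dataset_name": null, '
    '"local_dataset": null, "model": "microsoft/deberta-v3-large", "crf": true, '
    '"max_length": 128, "epoch": 15, "batch_size": 16, "lr": 1e-05, "random_seed": 42, '
    '"gradient_accumulation_steps": 4, "weight_decay": null, '
    '"lr_warmup_step_ratio": 0.1, "max_grad_norm": 10.0}'
)

# Keys whose values differ between the old and new config, as (old, new) pairs.
changed = {
    key: (old_cfg[key], new_cfg[key])
    for key in old_cfg
    if old_cfg[key] != new_cfg[key]
}
for key, (before, after) in changed.items():
    print(f"{key}: {before} -> {after}")

# Assumed convention: samples contributing to each optimizer step.
effective_batch = new_cfg["batch_size"] * new_cfg["gradient_accumulation_steps"]
print(effective_batch)
```

Only `crf`, `epoch`, and `weight_decay` differ between the two configurations; every other setting, including the batch size and learning rate, is unchanged.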