betterdataai committed
Commit 1810257 · verified · 1 Parent(s): bb3fad1

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
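
The added rule routes `tokenizer.json` through Git LFS, next to the existing archive and TensorBoard patterns. As a rough sketch of which file names the four patterns in this hunk cover, the snippet below approximates gitattributes matching with Python's `fnmatch` applied to the base name. Real gitattributes matching has further rules (directory patterns, `**`) that this does not model, and the `lfs_tracked` helper and the sample file names are hypothetical.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = ["*.zip", "*.zst", "*tfevents*", "tokenizer.json"]


def lfs_tracked(path: str) -> bool:
    """Approximate check: does the file's base name match any LFS pattern?"""
    name = PurePosixPath(path).name
    return any(fnmatch(name, pattern) for pattern in LFS_PATTERNS)


# Hypothetical file names, used only to illustrate the matching.
for candidate in ["tokenizer.json", "tokenizer_config.json", "logs/events.out.tfevents.1"]:
    print(candidate, lfs_tracked(candidate))
```

Under this approximation only the exact name `tokenizer.json` is picked up by the new rule; `tokenizer_config.json` stays a regular Git file.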
README.md ADDED
@@ -0,0 +1,109 @@
+ ---
+ title: "gliner-pii-silver-v1"
+ library_name: gliner
+ pipeline_tag: token-classification
+ license: apache-2.0
+ base_model: urchade/gliner_multi-v2.1
+ language:
+ - de
+ - en
+ - es
+ - fr
+ - id
+ - it
+ - ja
+ - ko
+ - nl
+ - sv
+ - vi
+ - zh
+ tags:
+ - gliner
+ - ner
+ - pii
+ - phi
+ - pci
+ ---
+
+ # gliner-pii-silver-v1
+
+ ## Model Summary
+ This GLiNER model is trained for multilingual NER/PII detection using LLM-annotated data.
+
+ ## Usage (Python)
+ ```python
+ import json
+ from pathlib import Path
+
+ import torch
+ from gliner import GLiNER
+ from huggingface_hub import hf_hub_download
+
+ model_path = "betterdataai/gliner-pii-silver-v1"
+ device = "cuda" if torch.cuda.is_available() else "cpu"
+
+ model = GLiNER.from_pretrained(model_path, map_location=device)
+ if device == "cuda":
+     model = model.to(device)
+
+ # Load the canonical labels used for evaluation/training.
+ # model_path is a Hub repo ID, not a local directory, so fetch the file first.
+ schema_file = hf_hub_download(repo_id=model_path, filename="label_schema.json")
+ labels = json.loads(Path(schema_file).read_text(encoding="utf-8"))["labels"]
+
+ texts = [
+     "Contact John Doe at john.doe@example.com or +1 (415) 555-2671.",
+     "Ship to: 1600 Amphitheatre Parkway, Mountain View, CA 94043.",
+ ]
+
+ # Run batched inference with the same threshold used for evaluation.
+ preds = model.inference(
+     texts,
+     labels,
+     batch_size=8,
+     threshold=0.6,
+     flat_ner=True,
+     multi_label=False,
+ )
+
+ # Print each input text followed by its predicted entities.
+ for text, pred in zip(texts, preds):
+     print(text)
+     for ent in pred:
+         print(ent)
+ ```
+
+ ## Training Data
+ - Total records: 387736
+ - Train/Validation/Test: 348958 / 19382 / 19396
+ - Label coverage: 84 / 88
+
+ ## Training Setup
+ - Base model: urchade/gliner_multi-v2.1
+ - Max length: 384
+ - Max width: 12
+ - Train batch size: 16
+ - Eval batch size: 8
+ - Gradient accumulation: 4
+ - Learning rate: 1e-05
+ - Epochs: 2
+ - Eval threshold: 0.6
+
+ ## Evaluation
+ - Evaluated on: data/splits/test.jsonl
+ - Precision: 0.2837
+ - Recall: 0.3189
+ - F1: 0.3003
+
+ ## Performance Comparison
+ | Model | Precision | Recall | F1 |
+ | --- | --- | --- | --- |
+ | gliner-pii-silver-v1 | 0.2837 | 0.3189 | 0.3003 |
+ | urchade/gliner_multi-v2.1 | 0.1707 | 0.0006 | 0.0011 |
+ | nvidia/gliner-PII | 0.0851 | 0.2973 | 0.1323 |
+ | gretelai/gretel-gliner-bi-base-v1.0 | 0.1410 | 0.1296 | 0.1351 |
+ | knowledgator/gliner-pii-base-v1.0 | 0.1520 | 0.1284 | 0.1392 |
+
+ ## Files
+ - Model artifacts: `gliner_config.json`, `pytorch_model.bin`, `tokenizer.json`, `tokenizer_config.json`.
+ - Evaluation artifacts: `metrics.json`, `benchmark_metrics.json`.
+ - Metadata: `label_schema.json`, `training_config.json`.
+
+ ## Limitations
+ - LLM-annotated data may contain noise.
+ - Some labels remain sparse across languages.
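
The F1 column in the README's comparison table can be sanity-checked against its precision and recall columns. The sketch below assumes F1 is the usual harmonic mean of precision and recall, and uses the four-decimal values from the table as input. Because those inputs are already rounded, a recomputed value can differ from the reported one in the last digit, most visibly for rows with very small recall. The `f1_score` helper and `ROWS` mapping are illustrative names, not part of the repository.

```python
# (precision, recall, reported F1) as rounded in the README comparison table.
ROWS = {
    "gliner-pii-silver-v1": (0.2837, 0.3189, 0.3003),
    "urchade/gliner_multi-v2.1": (0.1707, 0.0006, 0.0011),
    "nvidia/gliner-PII": (0.0851, 0.2973, 0.1323),
    "gretelai/gretel-gliner-bi-base-v1.0": (0.1410, 0.1296, 0.1351),
    "knowledgator/gliner-pii-base-v1.0": (0.1520, 0.1284, 0.1392),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


for model, (precision, recall, reported) in ROWS.items():
    recomputed = f1_score(precision, recall)
    print(f"{model}: recomputed={recomputed:.4f} reported={reported:.4f}")
```

For an exact comparison, the unrounded values under the `overall` key of `benchmark_metrics.json` would be the better input.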
benchmark_metrics.json ADDED
@@ -0,0 +1,2582 @@
+ {
+   "gliner-pii-silver-v1": {
+     "overall": {
+       "precision": 0.2837362121567707,
+       "recall": 0.3189384546389849,
+       "f1": 0.30030925146242404
+     },
+     "macro": {
+       "precision": 0.10309275828076389,
+       "recall": 0.11927487570986003,
+       "f1": 0.08813121523006051
+     },
+     "by_label": {
+       "name": {
+         "precision": 0.07758620689655173,
+         "recall": 0.011873350923482849,
+         "f1": 0.020594965675057208
+       },
+       "first_name": {
+         "precision": 0.25936811168258633,
+         "recall": 0.13799843627834246,
+         "f1": 0.1801479969379944
+       },
+       "middle_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "last_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "full_name": {
+         "precision": 0.6101829753381066,
+         "recall": 0.067387102442453,
+         "f1": 0.12137036157923888
+       },
+       "user_name": {
+         "precision": 0.3359375,
+         "recall": 0.18777292576419213,
+         "f1": 0.24089635854341737
+       },
+       "email": {
+         "precision": 0.4766355140186916,
+         "recall": 0.34459459459459457,
+         "f1": 0.4
+       },
+       "phone_number": {
+         "precision": 0.6865671641791045,
+         "recall": 0.27058823529411763,
+         "f1": 0.38818565400843885
+       },
+       "fax_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "street_address": {
+         "precision": 0.20294117647058824,
+         "recall": 0.18956043956043955,
+         "f1": 0.1960227272727273
+       },
+       "city": {
+         "precision": 0.32969146726167575,
+         "recall": 0.7649053909318101,
+         "f1": 0.46077746115382545
+       },
+       "state": {
+         "precision": 0.33976510067114096,
+         "recall": 0.7330316742081447,
+         "f1": 0.4643164230438521
+       },
+       "county": {
+         "precision": 0.28700564971751413,
+         "recall": 0.335978835978836,
+         "f1": 0.30956733698964045
+       },
+       "postal_code": {
+         "precision": 0.1891891891891892,
+         "recall": 0.09333333333333334,
+         "f1": 0.125
+       },
+       "country": {
+         "precision": 0.28795180722891567,
+         "recall": 0.8250863060989643,
+         "f1": 0.4269127716582316
+       },
+       "local_latlng": {
+         "precision": 0.09944751381215469,
+         "recall": 0.07725321888412018,
+         "f1": 0.08695652173913043
+       },
+       "date": {
+         "precision": 0.11542936932743775,
+         "recall": 0.3599419448476052,
+         "f1": 0.17480176211453743
+       },
+       "time": {
+         "precision": 0.05263157894736842,
+         "recall": 0.14516129032258066,
+         "f1": 0.07725321888412016
+       },
+       "date_time": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "date_of_birth": {
+         "precision": 0.29508196721311475,
+         "recall": 0.027692307692307693,
+         "f1": 0.05063291139240506
+       },
+       "age": {
+         "precision": 0.10869565217391304,
+         "recall": 0.05747126436781609,
+         "f1": 0.07518796992481203
+       },
+       "gender": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "social_security_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "national_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "passport_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "driver_license_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "certificate_license_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "tax_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "bank_routing_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "iban": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "bban": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "swift_bic_code": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "account_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "customer_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "employee_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "student_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "patient_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "unique_identifier": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "api_key": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "access_token": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "password": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "pin": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "ipv4": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "ipv6": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "ip_address": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "mac_address": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "device_id": {
+         "precision": 0.034482758620689655,
+         "recall": 0.09090909090909091,
+         "f1": 0.05
+       },
+       "imei": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "imsi": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "vehicle_vin": {
+         "precision": 0.047619047619047616,
+         "recall": 0.09090909090909091,
+         "f1": 0.0625
+       },
+       "license_plate": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "credit_card_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "credit_card_security_code": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "credit_card_expiration": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "cardholder_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "card_brand": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "company": {
+         "precision": 0.34114888628370454,
+         "recall": 0.5384457236842105,
+         "f1": 0.41767004226138266
+       },
+       "organization": {
+         "precision": 0.17803721269609632,
+         "recall": 0.29701765063907487,
+         "f1": 0.22262773722627738
+       },
+       "job_title": {
+         "precision": 0.1206896551724138,
+         "recall": 0.029288702928870293,
+         "f1": 0.04713804713804714
+       },
+       "occupation": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "education_level": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "employment_status": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "url": {
+         "precision": 0.3972602739726027,
+         "recall": 0.07571801566579635,
+         "f1": 0.12719298245614036
+       },
+       "http_cookie": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "language": {
+         "precision": 0.06274509803921569,
+         "recall": 0.21052631578947367,
+         "f1": 0.09667673716012085
+       },
+       "medical_record_number": {
+         "precision": 0.2857142857142857,
+         "recall": 0.0273972602739726,
+         "f1": 0.05
+       },
+       "health_plan_beneficiary_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "insurance_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "provider_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "hospital_name": {
+         "precision": 0.19672131147540983,
+         "recall": 0.7058823529411765,
+         "f1": 0.30769230769230765
+       },
+       "diagnosis": {
+         "precision": 0.0472972972972973,
+         "recall": 0.3181818181818182,
+         "f1": 0.08235294117647059
+       },
+       "procedure": {
+         "precision": 0.14432989690721648,
+         "recall": 0.3684210526315789,
+         "f1": 0.20740740740740737
+       },
+       "medication": {
+         "precision": 0.006024096385542169,
+         "recall": 0.1,
+         "f1": 0.011363636363636364
+       },
+       "lab_result": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "admission_date": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "discharge_date": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "room_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "blood_type": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "biometric_identifier": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "race_ethnicity": {
+         "precision": 0.04477611940298507,
+         "recall": 0.2,
+         "f1": 0.07317073170731708
+       },
+       "religious_belief": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "political_view": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "sexual_orientation": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "product": {
+         "precision": 0.1928020565552699,
+         "recall": 0.09351620947630923,
+         "f1": 0.12594458438287154
+       },
+       "event": {
+         "precision": 0.1327683615819209,
+         "recall": 0.14826498422712933,
+         "f1": 0.14008941877794334
+       },
+       "facility": {
+         "precision": 0.12393162393162394,
+         "recall": 0.19594594594594594,
+         "f1": 0.1518324607329843
+       },
+       "law": {
+         "precision": 0.2972972972972973,
+         "recall": 0.43137254901960786,
+         "f1": 0.35200000000000004
+       },
+       "work_of_art": {
+         "precision": 0.014925373134328358,
+         "recall": 0.03636363636363636,
+         "f1": 0.021164021164021163
+       }
+     },
+     "by_language": {
+       "Vietnamese": {
+         "precision": 0.32107843137254904,
+         "recall": 0.2525706940874036,
+         "f1": 0.2827338129496403
+       },
+       "Indonesian": {
+         "precision": 0.2577040298002032,
+         "recall": 0.2760246644903881,
+         "f1": 0.26654991243432574
+       },
+       "Korean": {
+         "precision": 0.25502008032128515,
+         "recall": 0.24517374517374518,
+         "f1": 0.25
+       },
+       "Italian": {
+         "precision": 0.2878787878787879,
+         "recall": 0.30297661233167966,
+         "f1": 0.29523480662983426
+       },
+       "English": {
+         "precision": 0.2993006993006993,
+         "recall": 0.3868696479543292,
+         "f1": 0.3374974061008508
+       },
+       "Japanese": {
+         "precision": 0.24390243902439024,
+         "recall": 0.22966507177033493,
+         "f1": 0.23656973878758009
+       },
+       "Chinese": {
+         "precision": 0.18231336110221083,
+         "recall": 0.21577550246492225,
+         "f1": 0.1976380687738798
+       },
+       "German": {
+         "precision": 0.27699917559769166,
+         "recall": 0.32010161956176564,
+         "f1": 0.2969946965232763
+       },
+       "French": {
+         "precision": 0.27207062600321025,
+         "recall": 0.25808907499048345,
+         "f1": 0.2648954873998828
+       },
+       "Dutch": {
+         "precision": 0.2728512960436562,
+         "recall": 0.29806259314456035,
+         "f1": 0.28490028490028485
+       },
+       "Swedish": {
+         "precision": 0.2933467741935484,
+         "recall": 0.3446846313295825,
+         "f1": 0.3169503063308373
+       },
+       "Spanish": {
+         "precision": 0.33671065032987746,
+         "recall": 0.33822485207100594,
+         "f1": 0.33746605266265206
+       }
+     }
+   },
+   "urchade/gliner_multi-v2.1": {
+     "overall": {
+       "precision": 0.17073170731707318,
+       "recall": 0.0005539873901917852,
+       "f1": 0.0011043912700499606
+     },
+     "macro": {
+       "precision": 0.0028704209950792784,
+       "recall": 0.00019488473959686697,
+       "f1": 0.00036498887652947716
+     },
+     "by_label": {
+       "name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "first_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "middle_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "last_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "full_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "user_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "email": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "phone_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "fax_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "street_address": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "city": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "state": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "county": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "postal_code": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "country": {
+         "precision": 0.17796610169491525,
+         "recall": 0.012082853855005753,
+         "f1": 0.022629310344827583
+       },
+       "local_latlng": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "date": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "time": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "date_time": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "date_of_birth": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "age": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "gender": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "social_security_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "national_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "passport_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "driver_license_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "certificate_license_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "tax_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "bank_routing_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "iban": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "bban": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "swift_bic_code": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "account_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "customer_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "employee_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "student_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "patient_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "unique_identifier": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "api_key": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "access_token": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "password": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "pin": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "ipv4": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "ipv6": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "ip_address": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "mac_address": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "device_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "imei": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "imsi": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "vehicle_vin": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "license_plate": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "credit_card_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "credit_card_security_code": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "credit_card_expiration": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "cardholder_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "card_brand": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "company": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "organization": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "job_title": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "occupation": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "education_level": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "employment_status": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "url": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "http_cookie": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "language": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "medical_record_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "health_plan_beneficiary_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "insurance_id": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "provider_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "hospital_name": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "diagnosis": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "procedure": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "medication": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "lab_result": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "admission_date": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "discharge_date": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "room_number": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "blood_type": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "biometric_identifier": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "race_ethnicity": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "religious_belief": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "political_view": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "sexual_orientation": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "product": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "event": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "facility": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "law": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "work_of_art": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       }
+     },
+     "by_language": {
+       "Vietnamese": {
+         "precision": 0.06666666666666667,
+         "recall": 0.0006426735218508997,
+         "f1": 0.001273074474856779
+       },
+       "Indonesian": {
+         "precision": 0.26666666666666666,
+         "recall": 0.0014508523757707653,
+         "f1": 0.002886002886002886
+       },
+       "Korean": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "Italian": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "English": {
+         "precision": 0.14705882352941177,
+         "recall": 0.0004757373929590866,
+         "f1": 0.0009484066767830046
+       },
+       "Japanese": {
+         "precision": 0.5,
+         "recall": 0.0009569377990430622,
+         "f1": 0.0019102196752626554
+       },
+       "Chinese": {
+         "precision": 0.5,
+         "recall": 0.0003792188092529389,
+         "f1": 0.000757862826828344
+       },
+       "German": {
+         "precision": 0.16666666666666666,
+         "recall": 0.00031756113051762465,
+         "f1": 0.0006339144215530904
+       },
+       "French": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "Dutch": {
+         "precision": 0.0,
+         "recall": 0.0,
+         "f1": 0.0
+       },
+       "Swedish": {
+         "precision": 0.16666666666666666,
+         "recall": 0.0005922416345869114,
+         "f1": 0.0011802891708468574
+       },
+       "Spanish": {
+         "precision": 0.3333333333333333,
+         "recall": 0.0014201183431952662,
+         "f1": 0.0028281876031110063
+       }
+     }
+   },
1034
+ "nvidia/gliner-PII": {
1035
+ "overall": {
1036
+ "precision": 0.08506944444444445,
1037
+ "recall": 0.2973065660695914,
1038
+ "f1": 0.13228708762992483
1039
+ },
1040
+ "macro": {
1041
+ "precision": 0.06690600080153934,
1042
+ "recall": 0.20311586277674873,
1043
+ "f1": 0.07380013641963948
1044
+ },
1045
+ "by_label": {
1046
+ "name": {
+ "precision": 0.02295918367346939,
+ "recall": 0.04155672823218998,
+ "f1": 0.029577464788732397
+ },
+ "first_name": {
+ "precision": 0.0343691733996271,
+ "recall": 0.2161845191555903,
+ "f1": 0.05930930930930931
+ },
+ "middle_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "last_name": {
+ "precision": 0.009121590805436468,
+ "recall": 0.6269592476489029,
+ "f1": 0.017981568891885815
+ },
+ "full_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "user_name": {
+ "precision": 0.07780320366132723,
+ "recall": 0.29694323144104806,
+ "f1": 0.12330009066183137
+ },
+ "email": {
+ "precision": 0.5918367346938775,
+ "recall": 0.7837837837837838,
+ "f1": 0.6744186046511628
+ },
+ "phone_number": {
+ "precision": 0.4972972972972973,
+ "recall": 0.5411764705882353,
+ "f1": 0.5183098591549296
+ },
+ "fax_number": {
+ "precision": 0.14285714285714285,
+ "recall": 1.0,
+ "f1": 0.25
+ },
+ "street_address": {
+ "precision": 0.16633663366336635,
+ "recall": 0.23076923076923078,
+ "f1": 0.1933256616800921
+ },
+ "city": {
+ "precision": 0.20156925063316283,
+ "recall": 0.7245626561942163,
+ "f1": 0.3153968685652123
+ },
+ "state": {
+ "precision": 0.13827711179258434,
+ "recall": 0.897737556561086,
+ "f1": 0.2396424688972098
+ },
+ "county": {
+ "precision": 0.24772036474164133,
+ "recall": 0.2156084656084656,
+ "f1": 0.23055162659123052
+ },
+ "postal_code": {
+ "precision": 0.15432098765432098,
+ "recall": 0.3333333333333333,
+ "f1": 0.2109704641350211
+ },
+ "country": {
+ "precision": 0.17802168384419756,
+ "recall": 0.7652474108170311,
+ "f1": 0.2888478662178304
+ },
+ "local_latlng": {
+ "precision": 0.12195121951219512,
+ "recall": 0.02145922746781116,
+ "f1": 0.0364963503649635
+ },
+ "date": {
+ "precision": 0.06562155348983773,
+ "recall": 0.604499274310595,
+ "f1": 0.11839113132461627
+ },
+ "time": {
+ "precision": 0.02748930971288943,
+ "recall": 0.7258064516129032,
+ "f1": 0.052972336668628606
+ },
+ "date_time": {
+ "precision": 0.058823529411764705,
+ "recall": 0.02027027027027027,
+ "f1": 0.030150753768844223
+ },
+ "date_of_birth": {
+ "precision": 0.1283625730994152,
+ "recall": 0.6753846153846154,
+ "f1": 0.21572481572481572
+ },
+ "age": {
+ "precision": 0.014727540500736377,
+ "recall": 0.22988505747126436,
+ "f1": 0.02768166089965398
+ },
+ "gender": {
+ "precision": 0.016494845360824743,
+ "recall": 0.5714285714285714,
+ "f1": 0.03206412825651302
+ },
+ "social_security_number": {
+ "precision": 0.01818181818181818,
+ "recall": 0.5,
+ "f1": 0.03508771929824561
+ },
+ "national_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "passport_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "driver_license_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "certificate_license_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "tax_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "bank_routing_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "iban": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "bban": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "swift_bic_code": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "account_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "customer_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "employee_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "student_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "patient_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "unique_identifier": {
+ "precision": 0.023255813953488372,
+ "recall": 0.0045045045045045045,
+ "f1": 0.007547169811320755
+ },
+ "api_key": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "access_token": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "password": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "pin": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "ipv4": {
+ "precision": 0.3333333333333333,
+ "recall": 0.25,
+ "f1": 0.28571428571428575
+ },
+ "ipv6": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "ip_address": {
+ "precision": 0.1,
+ "recall": 0.5,
+ "f1": 0.16666666666666669
+ },
+ "mac_address": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "device_id": {
+ "precision": 0.3333333333333333,
+ "recall": 0.022727272727272728,
+ "f1": 0.04255319148936171
+ },
+ "imei": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "imsi": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "vehicle_vin": {
+ "precision": 0.06666666666666667,
+ "recall": 0.09090909090909091,
+ "f1": 0.07692307692307691
+ },
+ "license_plate": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_security_code": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_expiration": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "cardholder_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "card_brand": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "company": {
+ "precision": 0.3981790591805766,
+ "recall": 0.26973684210526316,
+ "f1": 0.32160804020100503
+ },
+ "organization": {
+ "precision": 0.05963955418543988,
+ "recall": 0.3061472915398661,
+ "f1": 0.09983129899771757
+ },
+ "job_title": {
+ "precision": 0.02165605095541401,
+ "recall": 0.07112970711297072,
+ "f1": 0.033203125
+ },
+ "occupation": {
+ "precision": 0.0009998889012331964,
+ "recall": 0.36,
+ "f1": 0.0019942388654996678
+ },
+ "education_level": {
+ "precision": 0.0010416666666666667,
+ "recall": 0.25,
+ "f1": 0.002074688796680498
+ },
+ "employment_status": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "url": {
+ "precision": 0.375,
+ "recall": 0.1409921671018277,
+ "f1": 0.2049335863377609
+ },
+ "http_cookie": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "language": {
+ "precision": 0.014800759013282733,
+ "recall": 0.5131578947368421,
+ "f1": 0.02877167097012173
+ },
+ "medical_record_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "health_plan_beneficiary_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "insurance_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "provider_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "hospital_name": {
+ "precision": 0.18181818181818182,
+ "recall": 0.47058823529411764,
+ "f1": 0.26229508196721313
+ },
+ "diagnosis": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "procedure": {
+ "precision": 0.04878048780487805,
+ "recall": 0.21052631578947367,
+ "f1": 0.07920792079207921
+ },
+ "medication": {
+ "precision": 0.006289308176100629,
+ "recall": 0.1,
+ "f1": 0.01183431952662722
+ },
+ "lab_result": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "admission_date": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "discharge_date": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "room_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "blood_type": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "biometric_identifier": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "race_ethnicity": {
+ "precision": 0.003926306251887647,
+ "recall": 0.8666666666666667,
+ "f1": 0.007817197835237523
+ },
+ "religious_belief": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "political_view": {
+ "precision": 0.0018867924528301887,
+ "recall": 0.5,
+ "f1": 0.0037593984962406017
+ },
+ "sexual_orientation": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "product": {
+ "precision": 0.08724202626641651,
+ "recall": 0.11596009975062344,
+ "f1": 0.09957173447537472
+ },
+ "event": {
+ "precision": 0.03706030150753769,
+ "recall": 0.1861198738170347,
+ "f1": 0.061812467260345734
+ },
+ "facility": {
+ "precision": 0.10152284263959391,
+ "recall": 0.06756756756756757,
+ "f1": 0.08113590263691683
+ },
+ "law": {
+ "precision": 0.1036036036036036,
+ "recall": 0.45098039215686275,
+ "f1": 0.16849816849816848
+ },
+ "work_of_art": {
+ "precision": 0.004489337822671156,
+ "recall": 0.07272727272727272,
+ "f1": 0.008456659619450317
+ }
+ },
+ "by_language": {
+ "Vietnamese": {
+ "precision": 0.07543374402816193,
+ "recall": 0.1928020565552699,
+ "f1": 0.10844026748599313
+ },
+ "Indonesian": {
+ "precision": 0.06713030039602047,
+ "recall": 0.25208560029017046,
+ "f1": 0.10602593440122043
+ },
+ "Korean": {
+ "precision": 0.07187112763320942,
+ "recall": 0.22393822393822393,
+ "f1": 0.10881801125703566
+ },
+ "Italian": {
+ "precision": 0.07354337505394908,
+ "recall": 0.30191353649893693,
+ "f1": 0.1182758381342403
+ },
+ "English": {
+ "precision": 0.11350557839107457,
+ "recall": 0.3678401522359657,
+ "f1": 0.1734799192281804
+ },
+ "Japanese": {
+ "precision": 0.08264786525656091,
+ "recall": 0.20191387559808613,
+ "f1": 0.11728738187882157
+ },
+ "Chinese": {
+ "precision": 0.06852392720736913,
+ "recall": 0.11566173682214638,
+ "f1": 0.08606094808126412
+ },
+ "German": {
+ "precision": 0.06860676609676246,
+ "recall": 0.29946014607812005,
+ "f1": 0.1116372676689949
+ },
+ "French": {
+ "precision": 0.06354735454716626,
+ "recall": 0.2569470879330034,
+ "f1": 0.10189448260246055
+ },
+ "Dutch": {
+ "precision": 0.07336956521739131,
+ "recall": 0.30178837555886734,
+ "f1": 0.11804138735062665
+ },
+ "Swedish": {
+ "precision": 0.08323618559104687,
+ "recall": 0.3171453953212911,
+ "f1": 0.13186407288845112
+ },
+ "Spanish": {
+ "precision": 0.09133414462307052,
+ "recall": 0.3375147928994083,
+ "f1": 0.1437644923883456
+ }
+ }
+ },
+ "gretelai/gretel-gliner-bi-base-v1.0": {
+ "overall": {
+ "precision": 0.1410396323951752,
+ "recall": 0.12955390824913604,
+ "f1": 0.1350530064487742
+ },
+ "macro": {
+ "precision": 0.07611543385685733,
+ "recall": 0.08337002666730534,
+ "f1": 0.049423961915537315
+ },
+ "by_label": {
+ "name": {
+ "precision": 0.025866916588566075,
+ "recall": 0.09102902374670185,
+ "f1": 0.040286089621953
+ },
+ "first_name": {
+ "precision": 0.10756578947368421,
+ "recall": 0.12783424550430023,
+ "f1": 0.11682743837084672
+ },
+ "middle_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "last_name": {
+ "precision": 0.0247234873129473,
+ "recall": 0.23824451410658307,
+ "f1": 0.044798113763631006
+ },
+ "full_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "user_name": {
+ "precision": 0.05181347150259067,
+ "recall": 0.2183406113537118,
+ "f1": 0.08375209380234504
+ },
+ "email": {
+ "precision": 0.2637795275590551,
+ "recall": 0.4527027027027027,
+ "f1": 0.33333333333333337
+ },
+ "phone_number": {
+ "precision": 0.510752688172043,
+ "recall": 0.5588235294117647,
+ "f1": 0.5337078651685393
+ },
+ "fax_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "street_address": {
+ "precision": 0.06862745098039216,
+ "recall": 0.019230769230769232,
+ "f1": 0.030042918454935622
+ },
+ "city": {
+ "precision": 0.3257855011409514,
+ "recall": 0.33131024634059264,
+ "f1": 0.3285246481989557
+ },
+ "state": {
+ "precision": 0.24752111641571795,
+ "recall": 0.6099547511312218,
+ "f1": 0.3521421107628004
+ },
+ "county": {
+ "precision": 0.2857142857142857,
+ "recall": 0.018518518518518517,
+ "f1": 0.034782608695652174
+ },
+ "postal_code": {
+ "precision": 0.09644670050761421,
+ "recall": 0.25333333333333335,
+ "f1": 0.13970588235294118
+ },
+ "country": {
+ "precision": 0.257001647446458,
+ "recall": 0.4487917146144994,
+ "f1": 0.3268384663733501
+ },
+ "local_latlng": {
+ "precision": 0.010101010101010102,
+ "recall": 0.004291845493562232,
+ "f1": 0.006024096385542169
+ },
+ "date": {
+ "precision": 0.08765088207985144,
+ "recall": 0.34252539912917274,
+ "f1": 0.139583025284637
+ },
+ "time": {
+ "precision": 0.04011461318051576,
+ "recall": 0.45161290322580644,
+ "f1": 0.0736842105263158
+ },
+ "date_time": {
+ "precision": 0.11711711711711711,
+ "recall": 0.2635135135135135,
+ "f1": 0.16216216216216214
+ },
+ "date_of_birth": {
+ "precision": 0.0075528700906344415,
+ "recall": 0.007692307692307693,
+ "f1": 0.007621951219512196
+ },
+ "age": {
+ "precision": 0.1590909090909091,
+ "recall": 0.08045977011494253,
+ "f1": 0.1068702290076336
+ },
+ "gender": {
+ "precision": 0.02631578947368421,
+ "recall": 0.07142857142857142,
+ "f1": 0.038461538461538464
+ },
+ "social_security_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "national_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "passport_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "driver_license_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "certificate_license_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "tax_id": {
+ "precision": 0.5,
+ "recall": 0.16666666666666666,
+ "f1": 0.25
+ },
+ "bank_routing_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "iban": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "bban": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "swift_bic_code": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "account_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "customer_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "employee_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "student_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "patient_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "unique_identifier": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "api_key": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "access_token": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "password": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "pin": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "ipv4": {
+ "precision": 0.11538461538461539,
+ "recall": 0.75,
+ "f1": 0.19999999999999998
+ },
+ "ipv6": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "ip_address": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "mac_address": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "device_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "imei": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "imsi": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "vehicle_vin": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "license_plate": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_security_code": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_expiration": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "cardholder_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "card_brand": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "company": {
+ "precision": 0.3417085427135678,
+ "recall": 0.04194078947368421,
+ "f1": 0.07471159128364768
+ },
+ "organization": {
+ "precision": 0.2692307692307692,
+ "recall": 0.00426049908703591,
+ "f1": 0.008388256440982624
+ },
+ "job_title": {
+ "precision": 0.16666666666666666,
+ "recall": 0.0041841004184100415,
+ "f1": 0.008163265306122448
+ },
+ "occupation": {
+ "precision": 0.034482758620689655,
+ "recall": 0.04,
+ "f1": 0.03703703703703704
+ },
+ "education_level": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "employment_status": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "url": {
+ "precision": 0.17419354838709677,
+ "recall": 0.07049608355091384,
+ "f1": 0.10037174721189591
+ },
+ "http_cookie": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "language": {
+ "precision": 0.00992063492063492,
+ "recall": 0.06578947368421052,
+ "f1": 0.017241379310344827
+ },
+ "medical_record_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "health_plan_beneficiary_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "insurance_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "provider_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "hospital_name": {
+ "precision": 0.25,
+ "recall": 0.058823529411764705,
+ "f1": 0.09523809523809523
+ },
+ "diagnosis": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "procedure": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "medication": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "lab_result": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "admission_date": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "discharge_date": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "room_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "blood_type": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "biometric_identifier": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "race_ethnicity": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "religious_belief": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "political_view": {
+ "precision": 0.004694835680751174,
+ "recall": 0.5,
+ "f1": 0.00930232558139535
+ },
+ "sexual_orientation": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "product": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "event": {
+ "precision": 0.03225806451612903,
+ "recall": 0.0031545741324921135,
+ "f1": 0.005747126436781609
+ },
+ "facility": {
+ "precision": 1.0,
+ "recall": 0.0033783783783783786,
+ "f1": 0.006734006734006735
+ },
+ "law": {
+ "precision": 0.16666666666666666,
+ "recall": 0.0196078431372549,
+ "f1": 0.03508771929824561
+ },
+ "work_of_art": {
+ "precision": 0.006024096385542169,
+ "recall": 0.01818181818181818,
+ "f1": 0.00904977375565611
+ }
+ },
+ "by_language": {
+ "Vietnamese": {
+ "precision": 0.1424083769633508,
+ "recall": 0.08740359897172237,
+ "f1": 0.10832337714058143
+ },
+ "Indonesian": {
+ "precision": 0.09373571101966163,
+ "recall": 0.07435618425825172,
+ "f1": 0.08292880258899675
+ },
+ "Korean": {
+ "precision": 0.09407665505226481,
+ "recall": 0.052123552123552123,
+ "f1": 0.0670807453416149
+ },
+ "Italian": {
+ "precision": 0.13994910941475827,
+ "recall": 0.1559177888022679,
+ "f1": 0.14750251424740193
+ },
+ "English": {
+ "precision": 0.15890719328762917,
+ "recall": 0.14776403425309229,
+ "f1": 0.1531331657052704
+ },
+ "Japanese": {
+ "precision": 0.11177644710578842,
+ "recall": 0.053588516746411484,
+ "f1": 0.07244501940491592
+ },
+ "Chinese": {
+ "precision": 0.09268292682926829,
+ "recall": 0.02161547212741752,
+ "f1": 0.03505535055350554
+ },
+ "German": {
+ "precision": 0.12097407698350353,
+ "recall": 0.1467132422991426,
+ "f1": 0.13260619977037888
+ },
+ "French": {
+ "precision": 0.11522478277295051,
+ "recall": 0.11610201751046821,
+ "f1": 0.11566173682214637
+ },
+ "Dutch": {
+ "precision": 0.1357304826692582,
+ "recall": 0.1561102831594635,
+ "f1": 0.14520880263385896
+ },
+ "Swedish": {
+ "precision": 0.14919816723940435,
+ "recall": 0.15427894580989043,
+ "f1": 0.1516960256223613
+ },
+ "Spanish": {
+ "precision": 0.1692557384651055,
+ "recall": 0.1727810650887574,
+ "f1": 0.17100023424689623
+ }
+ }
+ },
+ "knowledgator/gliner-pii-base-v1.0": {
+ "overall": {
+ "precision": 0.15197501951600312,
+ "recall": 0.12839317276492468,
+ "f1": 0.13919235829091117
+ },
+ "macro": {
+ "precision": 0.059436781522773686,
+ "recall": 0.07493812625021153,
+ "f1": 0.04736717299282216
+ },
+ "by_label": {
+ "name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "first_name": {
+ "precision": 0.009213051823416507,
+ "recall": 0.009382329945269743,
+ "f1": 0.009296920395119118
+ },
+ "middle_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "last_name": {
+ "precision": 0.006252605252188412,
+ "recall": 0.09404388714733543,
+ "f1": 0.01172562048075044
+ },
+ "full_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "user_name": {
+ "precision": 0.14285714285714285,
+ "recall": 0.004366812227074236,
+ "f1": 0.008474576271186439
+ },
+ "email": {
+ "precision": 0.4528301886792453,
+ "recall": 0.16216216216216217,
+ "f1": 0.23880597014925375
+ },
+ "phone_number": {
+ "precision": 0.5887850467289719,
+ "recall": 0.37058823529411766,
+ "f1": 0.4548736462093863
+ },
+ "fax_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "street_address": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "city": {
+ "precision": 0.2857541899441341,
+ "recall": 0.36522670474830415,
+ "f1": 0.32063939821344617
+ },
+ "state": {
+ "precision": 0.27267605633802816,
+ "recall": 0.43800904977375565,
+ "f1": 0.3361111111111111
+ },
+ "county": {
+ "precision": 0.3493975903614458,
+ "recall": 0.11507936507936507,
+ "f1": 0.17313432835820897
+ },
+ "postal_code": {
+ "precision": 0.3333333333333333,
+ "recall": 0.02666666666666667,
+ "f1": 0.04938271604938272
+ },
+ "country": {
+ "precision": 0.16430138990490126,
+ "recall": 0.64614499424626,
+ "f1": 0.26198530269450604
+ },
+ "local_latlng": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "date": {
+ "precision": 0.14948453608247422,
+ "recall": 0.02104499274310595,
+ "f1": 0.03689567430025445
+ },
+ "time": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "date_time": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "date_of_birth": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "age": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "gender": {
+ "precision": 0.07142857142857142,
+ "recall": 0.14285714285714285,
+ "f1": 0.09523809523809523
+ },
+ "social_security_number": {
+ "precision": 0.2,
+ "recall": 0.5,
+ "f1": 0.28571428571428575
+ },
+ "national_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "passport_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "driver_license_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "certificate_license_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "tax_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "bank_routing_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "iban": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "bban": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "swift_bic_code": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "account_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "customer_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "employee_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "student_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "patient_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "unique_identifier": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "api_key": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "access_token": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "password": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "pin": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "ipv4": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "ipv6": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "ip_address": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "mac_address": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "device_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "imei": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "imsi": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "vehicle_vin": {
+ "precision": 0.020833333333333332,
+ "recall": 0.09090909090909091,
+ "f1": 0.03389830508474576
+ },
+ "license_plate": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_security_code": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "credit_card_expiration": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "cardholder_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "card_brand": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "company": {
+ "precision": 0.2413793103448276,
+ "recall": 0.13527960526315788,
+ "f1": 0.17338603425559945
+ },
+ "organization": {
+ "precision": 0.11622073578595318,
+ "recall": 0.08460133901399879,
+ "f1": 0.0979218034519197
+ },
+ "job_title": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "occupation": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "education_level": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "employment_status": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "url": {
+ "precision": 0.19230769230769232,
+ "recall": 0.013054830287206266,
+ "f1": 0.02444987775061124
+ },
+ "http_cookie": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "language": {
+ "precision": 0.013333333333333334,
+ "recall": 0.039473684210526314,
+ "f1": 0.019933554817275746
+ },
+ "medical_record_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "health_plan_beneficiary_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "insurance_id": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "provider_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "hospital_name": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "diagnosis": {
+ "precision": 0.01764705882352941,
+ "recall": 0.4090909090909091,
+ "f1": 0.03383458646616541
+ },
+ "procedure": {
+ "precision": 0.08670520231213873,
+ "recall": 0.39473684210526316,
+ "f1": 0.14218009478672985
+ },
+ "medication": {
+ "precision": 0.010471204188481676,
+ "recall": 0.4,
+ "f1": 0.020408163265306124
+ },
+ "lab_result": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "admission_date": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "discharge_date": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "room_number": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "blood_type": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "biometric_identifier": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "race_ethnicity": {
+ "precision": 0.04878048780487805,
+ "recall": 0.13333333333333333,
+ "f1": 0.07142857142857142
+ },
+ "religious_belief": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "political_view": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "sexual_orientation": {
+ "precision": 0.0,
+ "recall": 0.0,
+ "f1": 0.0
+ },
+ "product": {
+ "precision": 0.05562273276904474,
+ "recall": 0.057356608478802994,
+ "f1": 0.056476365868631064
+ },
+ "event": {
+ "precision": 0.03689567430025445,
+ "recall": 0.0914826498422713,
+ "f1": 0.052583862194016305
+ },
+ "facility": {
+ "precision": 0.125,
+ "recall": 0.07432432432432433,
+ "f1": 0.09322033898305085
+ },
+ "law": {
+ "precision": 0.10759493670886076,
+ "recall": 0.3333333333333333,
+ "f1": 0.16267942583732056
+ },
+ "work_of_art": {
+ "precision": 0.0020325203252032522,
+ "recall": 0.01818181818181818,
+ "f1": 0.003656307129798903
+ }
+ },
2519
+ "by_language": {
2520
+ "Vietnamese": {
2521
+ "precision": 0.14086687306501547,
2522
+ "recall": 0.05848329048843188,
2523
+ "f1": 0.08265213442325159
2524
+ },
2525
+ "Indonesian": {
2526
+ "precision": 0.14808153477218225,
2527
+ "recall": 0.08959013420384476,
2528
+ "f1": 0.11163841807909605
2529
+ },
2530
+ "Korean": {
2531
+ "precision": 0.15675675675675677,
2532
+ "recall": 0.055984555984555984,
2533
+ "f1": 0.08250355618776671
2534
+ },
2535
+ "Italian": {
2536
+ "precision": 0.13670999551770507,
2537
+ "recall": 0.10807937632884479,
2538
+ "f1": 0.12072036414011479
2539
+ },
2540
+ "English": {
2541
+ "precision": 0.15230647822655857,
2542
+ "recall": 0.20199809705042818,
2543
+ "f1": 0.17366763466808458
2544
+ },
2545
+ "Japanese": {
2546
+ "precision": 0.11004784688995216,
2547
+ "recall": 0.02200956937799043,
2548
+ "f1": 0.036682615629984046
2549
+ },
2550
+ "Chinese": {
2551
+ "precision": 0.10309278350515463,
2552
+ "recall": 0.0037921880925293893,
2553
+ "f1": 0.0073152889539136795
2554
+ },
2555
+ "German": {
2556
+ "precision": 0.15033407572383073,
2557
+ "recall": 0.12861225785963798,
2558
+ "f1": 0.1386274174225569
2559
+ },
2560
+ "French": {
2561
+ "precision": 0.1211977186311787,
2562
+ "recall": 0.0970688998858013,
2563
+ "f1": 0.1077996195307546
2564
+ },
2565
+ "Dutch": {
2566
+ "precision": 0.14109799897383274,
2567
+ "recall": 0.10245901639344263,
2568
+ "f1": 0.11871357651629615
2569
+ },
2570
+ "Swedish": {
2571
+ "precision": 0.15603847524047026,
2572
+ "recall": 0.12970091797453362,
2573
+ "f1": 0.14165588615782668
2574
+ },
2575
+ "Spanish": {
2576
+ "precision": 0.19050343249427917,
2577
+ "recall": 0.15763313609467455,
2578
+ "f1": 0.17251651340499938
2579
+ }
2580
+ }
2581
+ }
2582
+ }
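The `f1` fields in the metrics above are consistent with the usual harmonic mean of `precision` and `recall`. The following is a minimal sketch for spot-checking an entry, not the evaluation script used for this commit; the helper name `f1_score` is hypothetical, and the numbers are copied from the `organization` entry.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    denom = precision + recall
    if denom == 0.0:
        # Matches the all-zero entries in the file (e.g. "job_title").
        return 0.0
    return 2.0 * precision * recall / denom


# Values copied from the "organization" entry above.
organization_f1 = f1_score(0.11622073578595318, 0.08460133901399879)
print(round(organization_f1, 6))
```

The zero guard matters here because many labels report 0.0 for both precision and recall, which would otherwise divide by zero.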
gliner_config.json ADDED
@@ -0,0 +1,85 @@
1
+ {
2
+ "class_token_index": 250103,
3
+ "dropout": 0.4,
4
+ "embed_ent_token": true,
5
+ "encoder_config": {
6
+ "_name_or_path": "microsoft/mdeberta-v3-base",
7
+ "architectures": null,
8
+ "attention_probs_dropout_prob": 0.1,
9
+ "bos_token_id": null,
10
+ "chunk_size_feed_forward": 0,
11
+ "dtype": null,
12
+ "eos_token_id": null,
13
+ "hidden_act": "gelu",
14
+ "hidden_dropout_prob": 0.1,
15
+ "hidden_size": 768,
16
+ "id2label": {
17
+ "0": "LABEL_0",
18
+ "1": "LABEL_1"
19
+ },
20
+ "initializer_range": 0.02,
21
+ "intermediate_size": 3072,
22
+ "is_encoder_decoder": false,
23
+ "label2id": {
24
+ "LABEL_0": 0,
25
+ "LABEL_1": 1
26
+ },
27
+ "layer_norm_eps": 1e-07,
28
+ "legacy": true,
29
+ "max_position_embeddings": 512,
30
+ "max_relative_positions": -1,
31
+ "model_type": "deberta-v2",
32
+ "norm_rel_ebd": "layer_norm",
33
+ "num_attention_heads": 12,
34
+ "num_hidden_layers": 12,
35
+ "output_attentions": false,
36
+ "output_hidden_states": false,
37
+ "pad_token_id": 0,
38
+ "pooler_dropout": 0,
39
+ "pooler_hidden_act": "gelu",
40
+ "pooler_hidden_size": 768,
41
+ "pos_att_type": [
42
+ "p2c",
43
+ "c2p"
44
+ ],
45
+ "position_biased_input": false,
46
+ "position_buckets": 256,
47
+ "problem_type": null,
48
+ "relative_attention": true,
49
+ "return_dict": true,
50
+ "share_att_key": true,
51
+ "tie_word_embeddings": true,
52
+ "type_vocab_size": 0,
53
+ "vocab_size": 250105
54
+ },
55
+ "ent_token": "<<ENT>>",
56
+ "eval_every": 5000,
57
+ "fine_tune": true,
58
+ "fuse_layers": false,
59
+ "hidden_size": 512,
60
+ "lr_encoder": "1e-5",
61
+ "lr_others": "5e-5",
62
+ "max_len": 384,
63
+ "max_neg_type_ratio": 1,
64
+ "max_types": 25,
65
+ "max_width": 12,
66
+ "model_name": "microsoft/mdeberta-v3-base",
67
+ "model_type": null,
68
+ "name": "correct",
69
+ "num_post_fusion_layers": 1,
70
+ "num_rnn_layers": 1,
71
+ "num_steps": 30000,
72
+ "post_fusion_schema": "",
73
+ "random_drop": true,
74
+ "sep_token": "<<SEP>>",
75
+ "shuffle_types": true,
76
+ "size_sup": -1,
77
+ "span_mode": "markerV0",
78
+ "subtoken_pooling": "first",
79
+ "train_batch_size": 8,
80
+ "transformers_version": "5.1.0",
81
+ "use_cache": false,
82
+ "vocab_size": 250105,
83
+ "warmup_ratio": 3000,
84
+ "words_splitter_type": "whitespace"
85
+ }
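To give a feel for what `max_len` and `max_width` imply together, here is a small sketch that counts contiguous word spans. It assumes `max_len` is measured in whitespace-split words and that `max_width` caps the span length in words, which is how span-based GLiNER models are commonly described; this commit does not confirm either detail, and `count_candidate_spans` is a hypothetical helper.

```python
def count_candidate_spans(num_words: int, max_width: int) -> int:
    """Count contiguous word spans of length 1..max_width in a sequence."""
    total = 0
    for width in range(1, min(max_width, num_words) + 1):
        # A span of this width can start at num_words - width + 1 positions.
        total += num_words - width + 1
    return total


MAX_LEN = 384   # "max_len" from the config above
MAX_WIDTH = 12  # "max_width" from the config above

print(count_candidate_spans(MAX_LEN, MAX_WIDTH))
```

Under these assumptions the count grows roughly linearly in the input length for a fixed width cap, which is one reason to keep `max_width` small.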
label_schema.json ADDED
@@ -0,0 +1,442 @@
1
+ [
2
+ {
3
+ "name": "name",
4
+ "category": "PII",
5
+ "description": "Full name (unspecified structure)"
6
+ },
7
+ {
8
+ "name": "first_name",
9
+ "category": "PII",
10
+ "description": "Given name"
11
+ },
12
+ {
13
+ "name": "middle_name",
14
+ "category": "PII",
15
+ "description": "Middle name"
16
+ },
17
+ {
18
+ "name": "last_name",
19
+ "category": "PII",
20
+ "description": "Family name"
21
+ },
22
+ {
23
+ "name": "full_name",
24
+ "category": "PII",
25
+ "description": "Complete person name"
26
+ },
27
+ {
28
+ "name": "user_name",
29
+ "category": "PII",
30
+ "description": "Username or handle"
31
+ },
32
+ {
33
+ "name": "email",
34
+ "category": "PII",
35
+ "description": "Email address"
36
+ },
37
+ {
38
+ "name": "phone_number",
39
+ "category": "PII",
40
+ "description": "Phone number"
41
+ },
42
+ {
43
+ "name": "fax_number",
44
+ "category": "PII",
45
+ "description": "Fax number"
46
+ },
47
+ {
48
+ "name": "street_address",
49
+ "category": "PII",
50
+ "description": "Street address"
51
+ },
52
+ {
53
+ "name": "city",
54
+ "category": "PII",
55
+ "description": "City"
56
+ },
57
+ {
58
+ "name": "state",
59
+ "category": "PII",
60
+ "description": "State or province"
61
+ },
62
+ {
63
+ "name": "county",
64
+ "category": "PII",
65
+ "description": "County or district"
66
+ },
67
+ {
68
+ "name": "postal_code",
69
+ "category": "PII",
70
+ "description": "Postal or ZIP code"
71
+ },
72
+ {
73
+ "name": "country",
74
+ "category": "PII",
75
+ "description": "Country"
76
+ },
77
+ {
78
+ "name": "local_latlng",
79
+ "category": "PII",
80
+ "description": "Latitude/longitude coordinates"
81
+ },
82
+ {
83
+ "name": "date",
84
+ "category": "PII",
85
+ "description": "Date"
86
+ },
87
+ {
88
+ "name": "time",
89
+ "category": "PII",
90
+ "description": "Time"
91
+ },
92
+ {
93
+ "name": "date_time",
94
+ "category": "PII",
95
+ "description": "Date and time"
96
+ },
97
+ {
98
+ "name": "date_of_birth",
99
+ "category": "PII",
100
+ "description": "Date of birth"
101
+ },
102
+ {
103
+ "name": "age",
104
+ "category": "PII",
105
+ "description": "Age"
106
+ },
107
+ {
108
+ "name": "gender",
109
+ "category": "PII",
110
+ "description": "Gender"
111
+ },
112
+ {
113
+ "name": "social_security_number",
114
+ "category": "PII",
115
+ "description": "Social Security Number"
116
+ },
117
+ {
118
+ "name": "national_id",
119
+ "category": "PII",
120
+ "description": "National ID number"
121
+ },
122
+ {
123
+ "name": "passport_number",
124
+ "category": "PII",
125
+ "description": "Passport number"
126
+ },
127
+ {
128
+ "name": "driver_license_number",
129
+ "category": "PII",
130
+ "description": "Driver license number"
131
+ },
132
+ {
133
+ "name": "certificate_license_number",
134
+ "category": "PII",
135
+ "description": "Certificate or license number"
136
+ },
137
+ {
138
+ "name": "tax_id",
139
+ "category": "PII",
140
+ "description": "Tax ID"
141
+ },
142
+ {
143
+ "name": "bank_routing_number",
144
+ "category": "PII",
145
+ "description": "Bank routing number"
146
+ },
147
+ {
148
+ "name": "iban",
149
+ "category": "PII",
150
+ "description": "International Bank Account Number"
151
+ },
152
+ {
153
+ "name": "bban",
154
+ "category": "PII",
155
+ "description": "Basic Bank Account Number"
156
+ },
157
+ {
158
+ "name": "swift_bic_code",
159
+ "category": "PII",
160
+ "description": "SWIFT/BIC code"
161
+ },
162
+ {
163
+ "name": "account_number",
164
+ "category": "PII",
165
+ "description": "Bank account number"
166
+ },
167
+ {
168
+ "name": "customer_id",
169
+ "category": "PII",
170
+ "description": "Customer ID"
171
+ },
172
+ {
173
+ "name": "employee_id",
174
+ "category": "PII",
175
+ "description": "Employee ID"
176
+ },
177
+ {
178
+ "name": "student_id",
179
+ "category": "PII",
180
+ "description": "Student ID"
181
+ },
182
+ {
183
+ "name": "patient_id",
184
+ "category": "PII",
185
+ "description": "Patient ID"
186
+ },
187
+ {
188
+ "name": "unique_identifier",
189
+ "category": "PII",
190
+ "description": "Unique identifier"
191
+ },
192
+ {
193
+ "name": "api_key",
194
+ "category": "PII",
195
+ "description": "API key"
196
+ },
197
+ {
198
+ "name": "access_token",
199
+ "category": "PII",
200
+ "description": "Access token"
201
+ },
202
+ {
203
+ "name": "password",
204
+ "category": "PII",
205
+ "description": "Password"
206
+ },
207
+ {
208
+ "name": "pin",
209
+ "category": "PII",
210
+ "description": "PIN"
211
+ },
212
+ {
213
+ "name": "ipv4",
214
+ "category": "PII",
215
+ "description": "IPv4 address"
216
+ },
217
+ {
218
+ "name": "ipv6",
219
+ "category": "PII",
220
+ "description": "IPv6 address"
221
+ },
222
+ {
223
+ "name": "ip_address",
224
+ "category": "PII",
225
+ "description": "IP address (generic)"
226
+ },
227
+ {
228
+ "name": "mac_address",
229
+ "category": "PII",
230
+ "description": "MAC address"
231
+ },
232
+ {
233
+ "name": "device_id",
234
+ "category": "PII",
235
+ "description": "Device identifier"
236
+ },
237
+ {
238
+ "name": "imei",
239
+ "category": "PII",
240
+ "description": "IMEI"
241
+ },
242
+ {
243
+ "name": "imsi",
244
+ "category": "PII",
245
+ "description": "IMSI"
246
+ },
247
+ {
248
+ "name": "vehicle_vin",
249
+ "category": "PII",
250
+ "description": "Vehicle identification number"
251
+ },
252
+ {
253
+ "name": "license_plate",
254
+ "category": "PII",
255
+ "description": "License plate"
256
+ },
257
+ {
258
+ "name": "credit_card_number",
259
+ "category": "PCI",
260
+ "description": "Credit/debit card number"
261
+ },
262
+ {
263
+ "name": "credit_card_security_code",
264
+ "category": "PCI",
265
+ "description": "Card CVV/CVC"
266
+ },
267
+ {
268
+ "name": "credit_card_expiration",
269
+ "category": "PCI",
270
+ "description": "Card expiration date"
271
+ },
272
+ {
273
+ "name": "cardholder_name",
274
+ "category": "PCI",
275
+ "description": "Cardholder name"
276
+ },
277
+ {
278
+ "name": "card_brand",
279
+ "category": "PCI",
280
+ "description": "Card brand"
281
+ },
282
+ {
283
+ "name": "company",
284
+ "category": "PII",
285
+ "description": "Company name"
286
+ },
287
+ {
288
+ "name": "organization",
289
+ "category": "PII",
290
+ "description": "Organization name"
291
+ },
292
+ {
293
+ "name": "job_title",
294
+ "category": "PII",
295
+ "description": "Job title"
296
+ },
297
+ {
298
+ "name": "occupation",
299
+ "category": "PII",
300
+ "description": "Occupation"
301
+ },
302
+ {
303
+ "name": "education_level",
304
+ "category": "PII",
305
+ "description": "Education level"
306
+ },
307
+ {
308
+ "name": "employment_status",
309
+ "category": "PII",
310
+ "description": "Employment status"
311
+ },
312
+ {
313
+ "name": "url",
314
+ "category": "PII",
315
+ "description": "URL"
316
+ },
317
+ {
318
+ "name": "http_cookie",
319
+ "category": "PII",
320
+ "description": "HTTP cookie"
321
+ },
322
+ {
323
+ "name": "language",
324
+ "category": "PII",
325
+ "description": "Language"
326
+ },
327
+ {
328
+ "name": "medical_record_number",
329
+ "category": "PHI",
330
+ "description": "Medical record number"
331
+ },
332
+ {
333
+ "name": "health_plan_beneficiary_number",
334
+ "category": "PHI",
335
+ "description": "Health plan beneficiary number"
336
+ },
337
+ {
338
+ "name": "insurance_id",
339
+ "category": "PHI",
340
+ "description": "Insurance ID"
341
+ },
342
+ {
343
+ "name": "provider_name",
344
+ "category": "PHI",
345
+ "description": "Provider name"
346
+ },
347
+ {
348
+ "name": "hospital_name",
349
+ "category": "PHI",
350
+ "description": "Hospital/clinic name"
351
+ },
352
+ {
353
+ "name": "diagnosis",
354
+ "category": "PHI",
355
+ "description": "Diagnosis"
356
+ },
357
+ {
358
+ "name": "procedure",
359
+ "category": "PHI",
360
+ "description": "Procedure"
361
+ },
362
+ {
363
+ "name": "medication",
364
+ "category": "PHI",
365
+ "description": "Medication"
366
+ },
367
+ {
368
+ "name": "lab_result",
369
+ "category": "PHI",
370
+ "description": "Lab result"
371
+ },
372
+ {
373
+ "name": "admission_date",
374
+ "category": "PHI",
375
+ "description": "Admission date"
376
+ },
377
+ {
378
+ "name": "discharge_date",
379
+ "category": "PHI",
380
+ "description": "Discharge date"
381
+ },
382
+ {
383
+ "name": "room_number",
384
+ "category": "PHI",
385
+ "description": "Room number"
386
+ },
387
+ {
388
+ "name": "blood_type",
389
+ "category": "PHI",
390
+ "description": "Blood type"
391
+ },
392
+ {
393
+ "name": "biometric_identifier",
394
+ "category": "PHI",
395
+ "description": "Biometric identifier"
396
+ },
397
+ {
398
+ "name": "race_ethnicity",
399
+ "category": "Sensitive",
400
+ "description": "Race or ethnicity"
401
+ },
402
+ {
403
+ "name": "religious_belief",
404
+ "category": "Sensitive",
405
+ "description": "Religious belief"
406
+ },
407
+ {
408
+ "name": "political_view",
409
+ "category": "Sensitive",
410
+ "description": "Political view"
411
+ },
412
+ {
413
+ "name": "sexual_orientation",
414
+ "category": "Sensitive",
415
+ "description": "Sexual orientation"
416
+ },
417
+ {
418
+ "name": "product",
419
+ "category": "Custom",
420
+ "description": "Product or service name"
421
+ },
422
+ {
423
+ "name": "event",
424
+ "category": "Custom",
425
+ "description": "Event name (tournaments, conferences, disasters, etc.)"
426
+ },
427
+ {
428
+ "name": "facility",
429
+ "category": "Custom",
430
+ "description": "Facility or building name (airports, stadiums, stations)"
431
+ },
432
+ {
433
+ "name": "law",
434
+ "category": "Custom",
435
+ "description": "Law, act, or regulation name"
436
+ },
437
+ {
438
+ "name": "work_of_art",
439
+ "category": "Custom",
440
+ "description": "Creative work title (books, films, songs, artworks)"
441
+ }
442
+ ]
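As committed, `label_schema.json` is a top-level JSON array of objects with `name`, `category`, and `description` keys, so code that indexes the parsed file with a `"labels"` key would raise a `TypeError`. The sketch below shows one way to get the label names, and optionally group them by category. The helper names are hypothetical, and the inline excerpt stands in for the real file so the block runs on its own.

```python
import json
from collections import defaultdict

# Three entries copied from label_schema.json above; the real file has many more.
SCHEMA_EXCERPT = """
[
  {"name": "email", "category": "PII", "description": "Email address"},
  {"name": "credit_card_number", "category": "PCI", "description": "Credit/debit card number"},
  {"name": "diagnosis", "category": "PHI", "description": "Diagnosis"}
]
"""


def label_names(schema: list) -> list:
    """Return the label strings in file order."""
    return [entry["name"] for entry in schema]


def labels_by_category(schema: list) -> dict:
    """Group label names under their category (PII, PCI, PHI, ...)."""
    grouped = defaultdict(list)
    for entry in schema:
        grouped[entry["category"]].append(entry["name"])
    return dict(grouped)


schema = json.loads(SCHEMA_EXCERPT)
labels = label_names(schema)
groups = labels_by_category(schema)
print(labels)
```

For the real file, replace the excerpt with the contents of a local copy of `label_schema.json`; how that copy is obtained depends on the setup and is not shown here.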
metrics.json ADDED
@@ -0,0 +1,516 @@
1
+ {
2
+ "overall": {
3
+ "precision": 0.2837362121567707,
4
+ "recall": 0.3189384546389849,
5
+ "f1": 0.30030925146242404
6
+ },
7
+ "macro": {
8
+ "precision": 0.10309275828076389,
9
+ "recall": 0.11927487570986003,
10
+ "f1": 0.08813121523006051
11
+ },
12
+ "by_label": {
13
+ "name": {
14
+ "precision": 0.07758620689655173,
15
+ "recall": 0.011873350923482849,
16
+ "f1": 0.020594965675057208
17
+ },
18
+ "first_name": {
19
+ "precision": 0.25936811168258633,
20
+ "recall": 0.13799843627834246,
21
+ "f1": 0.1801479969379944
22
+ },
23
+ "middle_name": {
24
+ "precision": 0.0,
25
+ "recall": 0.0,
26
+ "f1": 0.0
27
+ },
28
+ "last_name": {
29
+ "precision": 0.0,
30
+ "recall": 0.0,
31
+ "f1": 0.0
32
+ },
33
+ "full_name": {
34
+ "precision": 0.6101829753381066,
35
+ "recall": 0.067387102442453,
36
+ "f1": 0.12137036157923888
37
+ },
38
+ "user_name": {
39
+ "precision": 0.3359375,
40
+ "recall": 0.18777292576419213,
41
+ "f1": 0.24089635854341737
42
+ },
43
+ "email": {
44
+ "precision": 0.4766355140186916,
45
+ "recall": 0.34459459459459457,
46
+ "f1": 0.4
47
+ },
48
+ "phone_number": {
49
+ "precision": 0.6865671641791045,
50
+ "recall": 0.27058823529411763,
51
+ "f1": 0.38818565400843885
52
+ },
53
+ "fax_number": {
54
+ "precision": 0.0,
55
+ "recall": 0.0,
56
+ "f1": 0.0
57
+ },
58
+ "street_address": {
59
+ "precision": 0.20294117647058824,
60
+ "recall": 0.18956043956043955,
61
+ "f1": 0.1960227272727273
62
+ },
63
+ "city": {
64
+ "precision": 0.32969146726167575,
65
+ "recall": 0.7649053909318101,
66
+ "f1": 0.46077746115382545
67
+ },
68
+ "state": {
69
+ "precision": 0.33976510067114096,
70
+ "recall": 0.7330316742081447,
71
+ "f1": 0.4643164230438521
72
+ },
73
+ "county": {
74
+ "precision": 0.28700564971751413,
75
+ "recall": 0.335978835978836,
76
+ "f1": 0.30956733698964045
77
+ },
78
+ "postal_code": {
79
+ "precision": 0.1891891891891892,
80
+ "recall": 0.09333333333333334,
81
+ "f1": 0.125
82
+ },
83
+ "country": {
84
+ "precision": 0.28795180722891567,
85
+ "recall": 0.8250863060989643,
86
+ "f1": 0.4269127716582316
87
+ },
88
+ "local_latlng": {
89
+ "precision": 0.09944751381215469,
90
+ "recall": 0.07725321888412018,
91
+ "f1": 0.08695652173913043
92
+ },
93
+ "date": {
94
+ "precision": 0.11542936932743775,
95
+ "recall": 0.3599419448476052,
96
+ "f1": 0.17480176211453743
97
+ },
98
+ "time": {
99
+ "precision": 0.05263157894736842,
100
+ "recall": 0.14516129032258066,
101
+ "f1": 0.07725321888412016
102
+ },
103
+ "date_time": {
104
+ "precision": 0.0,
105
+ "recall": 0.0,
106
+ "f1": 0.0
107
+ },
108
+ "date_of_birth": {
109
+ "precision": 0.29508196721311475,
110
+ "recall": 0.027692307692307693,
111
+ "f1": 0.05063291139240506
112
+ },
113
+ "age": {
114
+ "precision": 0.10869565217391304,
115
+ "recall": 0.05747126436781609,
116
+ "f1": 0.07518796992481203
117
+ },
118
+ "gender": {
119
+ "precision": 0.0,
120
+ "recall": 0.0,
121
+ "f1": 0.0
122
+ },
123
+ "social_security_number": {
124
+ "precision": 0.0,
125
+ "recall": 0.0,
126
+ "f1": 0.0
127
+ },
128
+ "national_id": {
129
+ "precision": 0.0,
130
+ "recall": 0.0,
131
+ "f1": 0.0
132
+ },
133
+ "passport_number": {
134
+ "precision": 0.0,
135
+ "recall": 0.0,
136
+ "f1": 0.0
137
+ },
138
+ "driver_license_number": {
139
+ "precision": 0.0,
140
+ "recall": 0.0,
141
+ "f1": 0.0
142
+ },
143
+ "certificate_license_number": {
144
+ "precision": 0.0,
145
+ "recall": 0.0,
146
+ "f1": 0.0
147
+ },
148
+ "tax_id": {
149
+ "precision": 0.0,
150
+ "recall": 0.0,
151
+ "f1": 0.0
152
+ },
153
+ "bank_routing_number": {
154
+ "precision": 0.0,
155
+ "recall": 0.0,
156
+ "f1": 0.0
157
+ },
158
+ "iban": {
159
+ "precision": 0.0,
160
+ "recall": 0.0,
161
+ "f1": 0.0
162
+ },
163
+ "bban": {
164
+ "precision": 0.0,
165
+ "recall": 0.0,
166
+ "f1": 0.0
167
+ },
168
+ "swift_bic_code": {
169
+ "precision": 0.0,
170
+ "recall": 0.0,
171
+ "f1": 0.0
172
+ },
173
+ "account_number": {
174
+ "precision": 0.0,
175
+ "recall": 0.0,
176
+ "f1": 0.0
177
+ },
178
+ "customer_id": {
179
+ "precision": 0.0,
180
+ "recall": 0.0,
181
+ "f1": 0.0
182
+ },
183
+ "employee_id": {
184
+ "precision": 0.0,
185
+ "recall": 0.0,
186
+ "f1": 0.0
187
+ },
188
+ "student_id": {
189
+ "precision": 0.0,
190
+ "recall": 0.0,
191
+ "f1": 0.0
192
+ },
193
+ "patient_id": {
194
+ "precision": 0.0,
195
+ "recall": 0.0,
196
+ "f1": 0.0
197
+ },
198
+ "unique_identifier": {
199
+ "precision": 0.0,
200
+ "recall": 0.0,
201
+ "f1": 0.0
202
+ },
203
+ "api_key": {
204
+ "precision": 0.0,
205
+ "recall": 0.0,
206
+ "f1": 0.0
207
+ },
208
+ "access_token": {
209
+ "precision": 0.0,
210
+ "recall": 0.0,
211
+ "f1": 0.0
212
+ },
213
+ "password": {
214
+ "precision": 0.0,
215
+ "recall": 0.0,
216
+ "f1": 0.0
217
+ },
218
+ "pin": {
219
+ "precision": 0.0,
220
+ "recall": 0.0,
221
+ "f1": 0.0
222
+ },
223
+ "ipv4": {
224
+ "precision": 0.0,
225
+ "recall": 0.0,
226
+ "f1": 0.0
227
+ },
228
+ "ipv6": {
229
+ "precision": 0.0,
230
+ "recall": 0.0,
231
+ "f1": 0.0
232
+ },
233
+ "ip_address": {
234
+ "precision": 0.0,
235
+ "recall": 0.0,
236
+ "f1": 0.0
237
+ },
238
+ "mac_address": {
239
+ "precision": 0.0,
240
+ "recall": 0.0,
241
+ "f1": 0.0
242
+ },
243
+ "device_id": {
244
+ "precision": 0.034482758620689655,
245
+ "recall": 0.09090909090909091,
246
+ "f1": 0.05
247
+ },
248
+ "imei": {
249
+ "precision": 0.0,
250
+ "recall": 0.0,
251
+ "f1": 0.0
252
+ },
253
+ "imsi": {
254
+ "precision": 0.0,
255
+ "recall": 0.0,
256
+ "f1": 0.0
257
+ },
258
+ "vehicle_vin": {
259
+ "precision": 0.047619047619047616,
260
+ "recall": 0.09090909090909091,
261
+ "f1": 0.0625
262
+ },
263
+ "license_plate": {
264
+ "precision": 0.0,
265
+ "recall": 0.0,
266
+ "f1": 0.0
267
+ },
268
+ "credit_card_number": {
269
+ "precision": 0.0,
270
+ "recall": 0.0,
271
+ "f1": 0.0
272
+ },
273
+ "credit_card_security_code": {
274
+ "precision": 0.0,
275
+ "recall": 0.0,
276
+ "f1": 0.0
277
+ },
278
+ "credit_card_expiration": {
279
+ "precision": 0.0,
280
+ "recall": 0.0,
281
+ "f1": 0.0
282
+ },
283
+ "cardholder_name": {
284
+ "precision": 0.0,
285
+ "recall": 0.0,
286
+ "f1": 0.0
287
+ },
288
+ "card_brand": {
289
+ "precision": 0.0,
290
+ "recall": 0.0,
291
+ "f1": 0.0
292
+ },
293
+ "company": {
294
+ "precision": 0.34114888628370454,
295
+ "recall": 0.5384457236842105,
296
+ "f1": 0.41767004226138266
297
+ },
298
+ "organization": {
299
+ "precision": 0.17803721269609632,
300
+ "recall": 0.29701765063907487,
301
+ "f1": 0.22262773722627738
302
+ },
303
+ "job_title": {
304
+ "precision": 0.1206896551724138,
305
+ "recall": 0.029288702928870293,
306
+ "f1": 0.04713804713804714
307
+ },
308
+ "occupation": {
309
+ "precision": 0.0,
310
+ "recall": 0.0,
311
+ "f1": 0.0
312
+ },
313
+ "education_level": {
314
+ "precision": 0.0,
315
+ "recall": 0.0,
316
+ "f1": 0.0
317
+ },
318
+ "employment_status": {
319
+ "precision": 0.0,
320
+ "recall": 0.0,
321
+ "f1": 0.0
322
+ },
323
+ "url": {
324
+ "precision": 0.3972602739726027,
325
+ "recall": 0.07571801566579635,
326
+ "f1": 0.12719298245614036
327
+ },
328
+ "http_cookie": {
329
+ "precision": 0.0,
330
+ "recall": 0.0,
331
+ "f1": 0.0
332
+ },
333
+ "language": {
334
+ "precision": 0.06274509803921569,
335
+ "recall": 0.21052631578947367,
336
+ "f1": 0.09667673716012085
337
+ },
338
+ "medical_record_number": {
339
+ "precision": 0.2857142857142857,
340
+ "recall": 0.0273972602739726,
341
+ "f1": 0.05
342
+ },
343
+ "health_plan_beneficiary_number": {
344
+ "precision": 0.0,
345
+ "recall": 0.0,
346
+ "f1": 0.0
347
+ },
348
+ "insurance_id": {
349
+ "precision": 0.0,
350
+ "recall": 0.0,
351
+ "f1": 0.0
352
+ },
353
+ "provider_name": {
354
+ "precision": 0.0,
355
+ "recall": 0.0,
356
+ "f1": 0.0
357
+ },
358
+ "hospital_name": {
359
+ "precision": 0.19672131147540983,
360
+ "recall": 0.7058823529411765,
361
+ "f1": 0.30769230769230765
362
+ },
363
+ "diagnosis": {
364
+ "precision": 0.0472972972972973,
365
+ "recall": 0.3181818181818182,
366
+ "f1": 0.08235294117647059
367
+ },
368
+ "procedure": {
369
+ "precision": 0.14432989690721648,
370
+ "recall": 0.3684210526315789,
371
+ "f1": 0.20740740740740737
372
+ },
373
+ "medication": {
374
+ "precision": 0.006024096385542169,
375
+ "recall": 0.1,
376
+ "f1": 0.011363636363636364
377
+ },
378
+ "lab_result": {
379
+ "precision": 0.0,
380
+ "recall": 0.0,
381
+ "f1": 0.0
382
+ },
383
+ "admission_date": {
384
+ "precision": 0.0,
385
+ "recall": 0.0,
386
+ "f1": 0.0
387
+ },
388
+ "discharge_date": {
389
+ "precision": 0.0,
390
+ "recall": 0.0,
391
+ "f1": 0.0
392
+ },
393
+ "room_number": {
394
+ "precision": 0.0,
395
+ "recall": 0.0,
396
+ "f1": 0.0
397
+ },
398
+ "blood_type": {
399
+ "precision": 0.0,
400
+ "recall": 0.0,
401
+ "f1": 0.0
402
+ },
403
+ "biometric_identifier": {
404
+ "precision": 0.0,
405
+ "recall": 0.0,
406
+ "f1": 0.0
407
+ },
408
+ "race_ethnicity": {
409
+ "precision": 0.04477611940298507,
410
+ "recall": 0.2,
411
+ "f1": 0.07317073170731708
412
+ },
413
+ "religious_belief": {
414
+ "precision": 0.0,
415
+ "recall": 0.0,
416
+ "f1": 0.0
417
+ },
418
+ "political_view": {
419
+ "precision": 0.0,
420
+ "recall": 0.0,
421
+ "f1": 0.0
422
+ },
423
+ "sexual_orientation": {
424
+ "precision": 0.0,
425
+ "recall": 0.0,
426
+ "f1": 0.0
427
+ },
428
+ "product": {
429
+ "precision": 0.1928020565552699,
430
+ "recall": 0.09351620947630923,
431
+ "f1": 0.12594458438287154
432
+ },
433
+ "event": {
434
+ "precision": 0.1327683615819209,
435
+ "recall": 0.14826498422712933,
436
+ "f1": 0.14008941877794334
437
+ },
438
+ "facility": {
439
+ "precision": 0.12393162393162394,
440
+ "recall": 0.19594594594594594,
441
+ "f1": 0.1518324607329843
442
+ },
443
+ "law": {
444
+ "precision": 0.2972972972972973,
445
+ "recall": 0.43137254901960786,
446
+ "f1": 0.35200000000000004
447
+ },
448
+ "work_of_art": {
449
+ "precision": 0.014925373134328358,
450
+ "recall": 0.03636363636363636,
451
+ "f1": 0.021164021164021163
452
+ }
453
+ },
454
+ "by_language": {
455
+ "Vietnamese": {
456
+ "precision": 0.32107843137254904,
457
+ "recall": 0.2525706940874036,
458
+ "f1": 0.2827338129496403
459
+ },
460
+ "Indonesian": {
461
+ "precision": 0.2577040298002032,
462
+ "recall": 0.2760246644903881,
463
+ "f1": 0.26654991243432574
464
+ },
465
+ "Korean": {
466
+ "precision": 0.25502008032128515,
467
+ "recall": 0.24517374517374518,
468
+ "f1": 0.25
469
+ },
470
+ "Italian": {
471
+ "precision": 0.2878787878787879,
472
+ "recall": 0.30297661233167966,
473
+ "f1": 0.29523480662983426
474
+ },
475
+ "English": {
476
+ "precision": 0.2993006993006993,
477
+ "recall": 0.3868696479543292,
478
+ "f1": 0.3374974061008508
479
+ },
480
+ "Japanese": {
481
+ "precision": 0.24390243902439024,
482
+ "recall": 0.22966507177033493,
483
+ "f1": 0.23656973878758009
484
+ },
485
+ "Chinese": {
486
+ "precision": 0.18231336110221083,
487
+ "recall": 0.21577550246492225,
488
+ "f1": 0.1976380687738798
489
+ },
490
+ "German": {
491
+ "precision": 0.27699917559769166,
492
+ "recall": 0.32010161956176564,
493
+ "f1": 0.2969946965232763
494
+ },
495
+ "French": {
496
+ "precision": 0.27207062600321025,
497
+ "recall": 0.25808907499048345,
498
+ "f1": 0.2648954873998828
499
+ },
500
+ "Dutch": {
501
+ "precision": 0.2728512960436562,
502
+ "recall": 0.29806259314456035,
503
+ "f1": 0.28490028490028485
504
+ },
505
+ "Swedish": {
506
+ "precision": 0.2933467741935484,
507
+ "recall": 0.3446846313295825,
508
+ "f1": 0.3169503063308373
509
+ },
510
+ "Spanish": {
511
+ "precision": 0.33671065032987746,
512
+ "recall": 0.33822485207100594,
513
+ "f1": 0.33746605266265206
514
+ }
515
+ }
516
+ }
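With this many labels, it helps to rank the `by_label` entries and list the ones that score zero. The sketch below does that on a three-entry excerpt copied from `metrics.json`; the function names are hypothetical, and nothing here reproduces how the metrics were computed.

```python
import json

# Three entries copied from metrics.json above; the real file covers every label.
METRICS_EXCERPT = """
{
  "by_label": {
    "email": {"precision": 0.4766355140186916, "recall": 0.34459459459459457, "f1": 0.4},
    "city": {"precision": 0.32969146726167575, "recall": 0.7649053909318101, "f1": 0.46077746115382545},
    "iban": {"precision": 0.0, "recall": 0.0, "f1": 0.0}
  }
}
"""


def rank_labels(metrics: dict, key: str = "f1") -> list:
    """Return (label, score) pairs sorted from best to worst on `key`."""
    pairs = [(name, scores[key]) for name, scores in metrics["by_label"].items()]
    return sorted(pairs, key=lambda item: item[1], reverse=True)


def zero_score_labels(metrics: dict, key: str = "f1") -> list:
    """Return labels whose score on `key` is exactly 0.0."""
    return [name for name, scores in metrics["by_label"].items() if scores[key] == 0.0]


metrics = json.loads(METRICS_EXCERPT)
ranking = rank_labels(metrics)
unscored = zero_score_labels(metrics)
print(ranking[0], unscored)
```

A zero score can mean either that the model never predicted the label or that the evaluation set contained no examples of it; the file alone does not distinguish the two.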
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b491f6f597404a78c5ca2e4b25f9ca0eeb1c4bd015c610869c242e27022e6551
+ size 1155879495
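The three lines added for `pytorch_model.bin` are a Git LFS pointer, not the weights themselves: a spec version, a content hash, and the size in bytes. A minimal sketch of reading such a pointer follows; `parse_lfs_pointer` is a hypothetical helper and assumes the simple `key value` line layout shown in the diff.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>"; split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algorithm": algorithm,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# Pointer contents copied from the pytorch_model.bin diff above.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:b491f6f597404a78c5ca2e4b25f9ca0eeb1c4bd015c610869c242e27022e6551
size 1155879495
"""

pointer = parse_lfs_pointer(POINTER)
print(f"{pointer['size_bytes'] / 2**30:.2f} GiB")
```

The digest can be compared against a locally computed SHA-256 of the downloaded file to confirm the download is intact.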
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:878c7e496f4dfdb03194ba7ba49b67b5092da20221c41c8a3887e5bc07b76183
+ size 16033785
tokenizer_config.json ADDED
@@ -0,0 +1,123 @@
1
+ {
2
+ "add_prefix_space": true,
3
+ "backend": "tokenizers",
4
+ "bos_token": "[CLS]",
5
+ "cls_token": "[CLS]",
6
+ "do_lower_case": false,
7
+ "eos_token": "[SEP]",
8
+ "extra_special_tokens": [
+ "[PAD]",
+ "[CLS]",
+ "[SEP]",
+ "▁<extra_id_99>",
+ "▁<extra_id_98>",
+ "▁<extra_id_97>",
+ "▁<extra_id_96>",
+ "▁<extra_id_95>",
+ "▁<extra_id_94>",
+ "▁<extra_id_93>",
+ "▁<extra_id_92>",
+ "▁<extra_id_91>",
+ "▁<extra_id_90>",
+ "▁<extra_id_89>",
+ "▁<extra_id_88>",
+ "▁<extra_id_87>",
+ "▁<extra_id_86>",
+ "▁<extra_id_85>",
+ "▁<extra_id_84>",
+ "▁<extra_id_83>",
+ "▁<extra_id_82>",
+ "▁<extra_id_81>",
+ "▁<extra_id_80>",
+ "▁<extra_id_79>",
+ "▁<extra_id_78>",
+ "▁<extra_id_77>",
+ "▁<extra_id_76>",
+ "▁<extra_id_75>",
+ "▁<extra_id_74>",
+ "▁<extra_id_73>",
+ "▁<extra_id_72>",
+ "▁<extra_id_71>",
+ "▁<extra_id_70>",
+ "▁<extra_id_69>",
+ "▁<extra_id_68>",
+ "▁<extra_id_67>",
+ "▁<extra_id_66>",
+ "▁<extra_id_65>",
+ "▁<extra_id_64>",
+ "▁<extra_id_63>",
+ "▁<extra_id_62>",
+ "▁<extra_id_61>",
+ "▁<extra_id_60>",
+ "▁<extra_id_59>",
+ "▁<extra_id_58>",
+ "▁<extra_id_57>",
+ "▁<extra_id_56>",
+ "▁<extra_id_55>",
+ "▁<extra_id_54>",
+ "▁<extra_id_53>",
+ "▁<extra_id_52>",
+ "▁<extra_id_51>",
+ "▁<extra_id_50>",
+ "▁<extra_id_49>",
+ "▁<extra_id_48>",
+ "▁<extra_id_47>",
+ "▁<extra_id_46>",
+ "▁<extra_id_45>",
+ "▁<extra_id_44>",
+ "▁<extra_id_43>",
+ "▁<extra_id_42>",
+ "▁<extra_id_41>",
+ "▁<extra_id_40>",
+ "▁<extra_id_39>",
+ "▁<extra_id_38>",
+ "▁<extra_id_37>",
+ "▁<extra_id_36>",
+ "▁<extra_id_35>",
+ "▁<extra_id_34>",
+ "▁<extra_id_33>",
+ "▁<extra_id_32>",
+ "▁<extra_id_31>",
+ "▁<extra_id_30>",
+ "▁<extra_id_29>",
+ "▁<extra_id_28>",
+ "▁<extra_id_27>",
+ "▁<extra_id_26>",
+ "▁<extra_id_25>",
+ "▁<extra_id_24>",
+ "▁<extra_id_23>",
+ "▁<extra_id_22>",
+ "▁<extra_id_21>",
+ "▁<extra_id_20>",
+ "▁<extra_id_19>",
+ "▁<extra_id_18>",
+ "▁<extra_id_17>",
+ "▁<extra_id_16>",
+ "▁<extra_id_15>",
+ "▁<extra_id_14>",
+ "▁<extra_id_13>",
+ "▁<extra_id_12>",
+ "▁<extra_id_11>",
+ "▁<extra_id_10>",
+ "▁<extra_id_9>",
+ "▁<extra_id_8>",
+ "▁<extra_id_7>",
+ "▁<extra_id_6>",
+ "▁<extra_id_5>",
+ "▁<extra_id_4>",
+ "▁<extra_id_3>",
+ "▁<extra_id_2>",
+ "▁<extra_id_1>",
+ "▁<extra_id_0>"
+ ],
+ "is_local": false,
+ "mask_token": "[MASK]",
+ "model_max_length": 1000000000000000019884624838656,
+ "pad_token": "[PAD]",
+ "sep_token": "[SEP]",
+ "split_by_punct": false,
+ "tokenizer_class": "DebertaV2Tokenizer",
+ "unk_id": 3,
+ "unk_token": "[UNK]",
+ "vocab_type": "spm"
+ }
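
The `extra_special_tokens` list in this file follows a regular pattern: three control tokens (`[PAD]`, `[CLS]`, `[SEP]`) followed by 100 sentinel tokens counting down from `▁<extra_id_99>` to `▁<extra_id_0>`. A minimal sketch that regenerates the same list, which may help when checking a tokenizer config against this one. The function name `build_extra_special_tokens` is hypothetical and not part of any library.

```python
# Hypothetical helper: rebuilds the token list shown in the diff above.
# It is an illustration only, not how the tokenizer config was produced.
def build_extra_special_tokens(num_sentinels: int = 100) -> list[str]:
    control = ["[PAD]", "[CLS]", "[SEP]"]
    # Sentinels are listed in descending order and carry the
    # SentencePiece word-boundary prefix "▁" (U+2581).
    sentinels = [f"▁<extra_id_{i}>" for i in range(num_sentinels - 1, -1, -1)]
    return control + sentinels


tokens = build_extra_special_tokens()
print(len(tokens), tokens[3], tokens[-1])
```

Comparing this list with the `extra_special_tokens` value loaded from `tokenizer_config.json` is one quick way to confirm that no sentinel is missing or out of order.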
training_config.json ADDED
@@ -0,0 +1,53 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "base_model": "urchade/gliner_multi-v2.1",
3
+ "output_dir": "models/gliner-pii-silver-v1",
4
+ "max_length": 384,
5
+ "max_width": 12,
6
+ "per_device_train_batch_size": 16,
7
+ "per_device_eval_batch_size": 8,
8
+ "gradient_accumulation_steps": 4,
9
+ "learning_rate": 1e-05,
10
+ "others_lr": 5e-05,
11
+ "weight_decay": 0.01,
12
+ "others_weight_decay": 0.01,
13
+ "num_train_epochs": 2,
14
+ "max_steps": -1,
15
+ "warmup_ratio": 0.05,
16
+ "lr_scheduler_type": "linear",
17
+ "focal_loss_alpha": -1.0,
18
+ "focal_loss_gamma": 0.0,
19
+ "focal_loss_prob_margin": 0.0,
20
+ "label_smoothing": 0.05,
21
+ "loss_reduction": "sum",
22
+ "negatives": 1.0,
23
+ "masking": "global",
24
+ "logging_steps": 50,
25
+ "eval_steps": 2000,
26
+ "save_steps": 2000,
27
+ "eval_strategy": "steps",
28
+ "save_strategy": "steps",
29
+ "save_total_limit": 2,
30
+ "fp16": true,
31
+ "bf16": false,
32
+ "use_cpu": false,
33
+ "gradient_checkpointing": false,
34
+ "max_train_samples": null,
35
+ "max_eval_samples": null,
36
+ "resume_from_checkpoint": null,
37
+ "eval_threshold": 0.6,
38
+ "eval_after_train": true,
39
+ "eval_output_path": null,
40
+ "eval_test_path": null,
41
+ "compare_base_model": true,
42
+ "benchmark_models": [
43
+ "nvidia/gliner-PII",
44
+ "gretelai/gretel-gliner-bi-base-v1.0",
45
+ "knowledgator/gliner-pii-base-v1.0"
46
+ ],
47
+ "dataset_stats_path": null,
48
+ "model_card_title": null,
49
+ "model_card_tags": null,
50
+ "license": "apache-2.0",
51
+ "ddp_find_unused_parameters": false,
52
+ "seed": 42
53
+ }
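
A few of these values interact. As a rough sketch, the effective batch size per optimizer step is the per-device batch size times the gradient accumulation steps times the number of devices. The device count is not recorded in this config, so `num_devices` below is an assumption, and `effective_batch_size` is a hypothetical helper rather than part of the training code.

```python
import json

# Subset of training_config.json from the diff above, inlined so the
# example is self-contained.
config = json.loads(
    """
    {
      "per_device_train_batch_size": 16,
      "gradient_accumulation_steps": 4
    }
    """
)


def effective_batch_size(cfg: dict, num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step (num_devices is assumed)."""
    return (
        cfg["per_device_train_batch_size"]
        * cfg["gradient_accumulation_steps"]
        * num_devices
    )


single_gpu = effective_batch_size(config)
print(single_gpu)
```

On a single device this works out to 16 × 4 = 64 samples per optimizer step. The value scales linearly with however many devices were actually used.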