Upload README.md with huggingface_hub
README.md
ADDED
@@ -0,0 +1,77 @@
---
language:
- nb
tags:
- spacy
- token-classification
- ner
library_name: spacy
---

# declassifai_ner_pii

A spaCy named-entity recognition (NER) model for detecting personally identifiable information (PII) in Norwegian Bokmål (`nb`) text.

## Installation

```bash
pip install https://huggingface.co/yasin9999/declassifai_ner_pii/resolve/main/declassifai_ner_pii-1.0.0-py3-none-any.whl
```

## Usage

```python
import spacy

# Load the installed pipeline package by name
nlp = spacy.load("declassifai_ner_pii")
doc = nlp("Your text here")

# Print each detected entity together with its label
for ent in doc.ents:
    print(ent.text, ent.label_)
```
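
A common next step with a PII model is to mask the detected spans. The following is a minimal sketch, not part of the packaged pipeline: the helper names `spans_from_doc` and `redact` and the `[LABEL]` placeholder format are assumptions made for illustration. It relies only on the `start_char`, `end_char`, and `label_` attributes of the entities in `doc.ents`.

```python
from typing import FrozenSet, Iterable, List, Tuple

# (start_char, end_char, label); hypothetical alias used only in this sketch
Span = Tuple[int, int, str]


def spans_from_doc(doc) -> List[Span]:
    """Collect character offsets and labels from a spaCy Doc's entities."""
    return [(ent.start_char, ent.end_char, ent.label_) for ent in doc.ents]


def redact(text: str, spans: Iterable[Span],
           keep: FrozenSet[str] = frozenset()) -> str:
    """Replace each span with a [LABEL] placeholder, skipping labels in `keep`.

    Assumes the spans do not overlap, which holds for `doc.ents`.
    """
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if label in keep:
            continue  # leave this entity's text untouched
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)
```

With the model loaded as shown earlier, `redact(doc.text, spans_from_doc(doc))` would return the input with every detected entity masked. Passing, for example, `keep=frozenset({"ORG"})` leaves organisation names in place.
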

## Model Details

- **Language**: Norwegian Bokmål (`nb`)
- **Pipeline**: tok2vec, ner
- **Version**: 1.0.0

## Label Scheme

### tok2vec

### ner
- `CONTEXT_SENSITIVE`
- `CRIMINAL_RECORD`
- `DATE_TIME`
- `EMAIL_ADDRESS`
- `EMPLOYMENT_INFO`
- `EVENT`
- `FAMILY_RELATION`
- `FINANCIAL_INFO`
- `GOV_ID`
- `HEALTH_INFO`
- `IDENTIFIABLE_IMAGE`
- `LOC`
- `MISC`
- `NO_ADDRESS`
- `NO_PHONE_NUMBER`
- `ORG`
- `PERSON`
- `POLITICAL_CASE`
- `POSTAL_CODE`
- `PRODUCT`
- `SEXUAL_ORIENTATION`

## Evaluation

| Metric | Score |
|--------|-------|
| ents_f | 0.7164 |
| ents_p | 0.7060 |
| ents_r | 0.7271 |
| tok2vec_loss | 15093.0559 |
| ner_loss | 28938.1309 |

Per-label scores (`ents_per_type`), rounded to four decimal places:

| Label | Precision | Recall | F-score |
|-------|-----------|--------|---------|
| PERSON | 0.8745 | 0.7872 | 0.8286 |
| ORG | 0.7494 | 0.8292 | 0.7873 |
| PRODUCT | 0.7333 | 0.3395 | 0.4641 |
| MISC | 0.8136 | 0.6076 | 0.6957 |
| LOC | 0.2946 | 0.6972 | 0.4142 |
| DATE_TIME | 0.0000 | 0.0000 | 0.0000 |
| FAMILY_RELATION | 0.0000 | 0.0000 | 0.0000 |
| POLITICAL_CASE | 0.0000 | 0.0000 | 0.0000 |
| EVENT | 0.5714 | 0.4444 | 0.5000 |
| HEALTH_INFO | 0.0000 | 0.0000 | 0.0000 |
| SEXUAL_ORIENTATION | 0.0000 | 0.0000 | 0.0000 |
| CONTEXT_SENSITIVE | 0.0000 | 0.0000 | 0.0000 |
| EMPLOYMENT_INFO | 0.0000 | 0.0000 | 0.0000 |
| EMAIL_ADDRESS | 0.0000 | 0.0000 | 0.0000 |

Per-label scores are reported for 14 of the 21 `ner` labels. No scores are listed for `CRIMINAL_RECORD`, `FINANCIAL_INFO`, `GOV_ID`, `IDENTIFIABLE_IMAGE`, `NO_ADDRESS`, `NO_PHONE_NUMBER`, or `POSTAL_CODE`.
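
The reported `ents_p`, `ents_r`, and `ents_f` values can be cross-checked against each other. The sketch below assumes `ents_f` is the usual F1 score, that is, the harmonic mean of precision and recall; the model card itself does not state the formula, and the function name `f1` is chosen here for illustration.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Overall scores copied from the evaluation table
ents_p, ents_r = 0.7060, 0.7271
print(round(f1(ents_p, ents_r), 4))  # 0.7164, matching the reported ents_f
```

The zero-division guard matters for this model, because several labels have both precision and recall equal to 0.0.
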