Upload folder using huggingface_hub
README.md
CHANGED
|
@@ -8,6 +8,96 @@ pipeline_tag: token-classification
|
|
| 8 |
datasets:
|
| 9 |
- urchade/synthetic-pii-ner-mistral-v1
|
| 10 |
---
|
| 11 |
|
| 12 |
|
| 13 |
# Model Card for GLiNER PII
|
|
@@ -16,8 +106,6 @@ GLiNER is a Named Entity Recognition (NER) model capable of identifying any enti
|
|
| 16 |
|
| 17 |
This model has been trained by fine-tuning `urchade/gliner_multi-v2.1` on the `urchade/synthetic-pii-ner-mistral-v1` dataset.
|
| 18 |
|
| 19 |
-
This model is capable of recognizing various types of *personally identifiable information* (PII), including but not limited to these entity types: `person`, `organization`, `phone number`, `address`, `passport number`, `email`, `credit card number`, `social security number`, `health insurance id number`, `date of birth`, `mobile phone number`, `bank account number`, `medication`, `cpf`, `driver's license number`, `tax identification number`, `medical condition`, `identity card number`, `national id number`, `ip address`, `email address`, `iban`, `credit card expiration date`, `username`, `health insurance number`, `registration number`, `student id number`, `insurance number`, `flight number`, `landline phone number`, `blood type`, `cvv`, `reservation number`, `digital signature`, `social media handle`, `license plate number`, `cnpj`, `postal code`, `passport_number`, `serial number`, `vehicle registration number`, `credit card brand`, `fax number`, `visa number`, `insurance company`, `identity document number`, `transaction number`, `national health insurance number`, `cvc`, `birth certificate number`, `train ticket number`, `passport expiration date`, and `social_security_number`.
|
| 20 |
-
|
| 21 |
## Links
|
| 22 |
|
| 23 |
* Paper: https://arxiv.org/abs/2311.08526
|
|
|
|
| 8 |
datasets:
|
| 9 |
- urchade/synthetic-pii-ner-mistral-v1
|
| 10 |
---
|
| 11 |
+
# Entity Types Classification
|
| 12 |
+
|
| 13 |
+
## Personal Information
|
| 14 |
+
- Date of birth
|
| 15 |
+
- Age
|
| 16 |
+
- Gender
|
| 17 |
+
- Last name
|
| 18 |
+
- Occupation
|
| 19 |
+
- Education level
|
| 20 |
+
- Phone number
|
| 21 |
+
- Email
|
| 22 |
+
- Street address
|
| 23 |
+
- City
|
| 24 |
+
- Country
|
| 25 |
+
- Postcode
|
| 26 |
+
- User name
|
| 27 |
+
- Password
|
| 28 |
+
- Tax ID
|
| 29 |
+
- License plate
|
| 30 |
+
- CVV
|
| 31 |
+
- Bank routing number
|
| 32 |
+
- Account number
|
| 33 |
+
- SWIFT BIC
|
| 34 |
+
- Biometric identifier
|
| 35 |
+
- Device identifier
|
| 36 |
+
- Location
|
| 37 |
+
|
| 38 |
+
## Financial Information
|
| 39 |
+
- Account number
|
| 40 |
+
- Bank routing number
|
| 41 |
+
- SWIFT BIC
|
| 42 |
+
- CVV
|
| 43 |
+
- Tax ID
|
| 44 |
+
- API key
|
| 45 |
+
|
| 46 |
+
## Health and Medical Information
|
| 47 |
+
- Blood type
|
| 48 |
+
- Biometric identifier
|
| 49 |
+
- Organ
|
| 50 |
+
- Diseases symptom
|
| 51 |
+
- Diagnostics
|
| 52 |
+
- Preventive medicine
|
| 53 |
+
- Treatment
|
| 54 |
+
- Surgery
|
| 55 |
+
- Drug chemical
|
| 56 |
+
- Medical device technique
|
| 57 |
+
- Personal care
|
| 58 |
+
|
| 59 |
+
## Online and Web-related Information
|
| 60 |
+
- URL
|
| 61 |
+
- IP address
|
| 62 |
+
- Email
|
| 63 |
+
- User name
|
| 64 |
+
- API key
|
| 65 |
+
|
| 66 |
+
## Professional Information
|
| 67 |
+
- Occupation
|
| 68 |
+
- Skill
|
| 69 |
+
- Organization
|
| 70 |
+
- Company name
|
| 71 |
+
|
| 72 |
+
## Location Information
|
| 73 |
+
- City
|
| 74 |
+
- Country
|
| 75 |
+
- Postcode
|
| 76 |
+
- Street address
|
| 77 |
+
- Location
|
| 78 |
+
|
| 79 |
+
## Time-Related Information
|
| 80 |
+
- Date
|
| 81 |
+
- Date time
|
| 82 |
+
|
| 83 |
+
## Miscellaneous
|
| 84 |
+
- Event
|
| 85 |
+
- Miscellaneous
|
| 86 |
+
|
| 87 |
+
## Product and Goods Information
|
| 88 |
+
- Product
|
| 89 |
+
- Quantity
|
| 90 |
+
- Food drink
|
| 91 |
+
- Transportation
|
| 92 |
+
|
| 93 |
+
## Identifiers
|
| 94 |
+
- Device identifier
|
| 95 |
+
- Biometric identifier
|
| 96 |
+
- User name
|
| 97 |
+
- Email
|
| 98 |
+
- Phone number
|
| 99 |
+
- URL
|
| 100 |
+
- License plate
|
| 101 |
|
| 102 |
|
| 103 |
# Model Card for GLiNER PII
|
|
|
|
| 106 |
|
| 107 |
This model has been trained by fine-tuning `urchade/gliner_multi-v2.1` on the `urchade/synthetic-pii-ner-mistral-v1` dataset.
|
| 108 |
|
| 109 |
## Links
|
| 110 |
|
| 111 |
* Paper: https://arxiv.org/abs/2311.08526
|
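The entity types added to the README in this change are grouped by category, and several types (for example `Email`, `User name`, and `Biometric identifier`) appear under more than one heading. The following is a minimal sketch, not part of the model card, of how that grouping could be held in a dictionary and flattened into a deduplicated label list. The names `TAXONOMY`, `flat_labels`, and `categories_of` are hypothetical, and lowercasing the labels is an assumption about the preferred label format rather than something the diff states.

```python
# Hypothetical representation of the entity-type taxonomy from the README diff.
# The category and type names are copied from the added lines; the variable and
# function names are illustrative only.
TAXONOMY = {
    "Personal Information": [
        "Date of birth", "Age", "Gender", "Last name", "Occupation",
        "Education level", "Phone number", "Email", "Street address", "City",
        "Country", "Postcode", "User name", "Password", "Tax ID",
        "License plate", "CVV", "Bank routing number", "Account number",
        "SWIFT BIC", "Biometric identifier", "Device identifier", "Location",
    ],
    "Financial Information": [
        "Account number", "Bank routing number", "SWIFT BIC", "CVV",
        "Tax ID", "API key",
    ],
    "Health and Medical Information": [
        "Blood type", "Biometric identifier", "Organ", "Diseases symptom",
        "Diagnostics", "Preventive medicine", "Treatment", "Surgery",
        "Drug chemical", "Medical device technique", "Personal care",
    ],
    "Online and Web-related Information": [
        "URL", "IP address", "Email", "User name", "API key",
    ],
    "Professional Information": [
        "Occupation", "Skill", "Organization", "Company name",
    ],
    "Location Information": [
        "City", "Country", "Postcode", "Street address", "Location",
    ],
    "Time-Related Information": ["Date", "Date time"],
    "Miscellaneous": ["Event", "Miscellaneous"],
    "Product and Goods Information": [
        "Product", "Quantity", "Food drink", "Transportation",
    ],
    "Identifiers": [
        "Device identifier", "Biometric identifier", "User name", "Email",
        "Phone number", "URL", "License plate",
    ],
}


def flat_labels(taxonomy):
    """Return every entity type once, in first-seen order.

    Labels are lowercased here; that is an assumption, not a documented
    requirement of the model.
    """
    seen = {}
    for types in taxonomy.values():
        for name in types:
            seen.setdefault(name.lower(), None)
    return list(seen)


def categories_of(taxonomy, label):
    """List the categories under which a given entity type appears."""
    wanted = label.lower()
    return [
        category
        for category, types in taxonomy.items()
        if wanted in (t.lower() for t in types)
    ]


labels = flat_labels(TAXONOMY)
print(f"{len(labels)} unique labels")
print(categories_of(TAXONOMY, "Email"))
```

A deduplicated list like `labels` is the kind of value one would pass as the label set when querying a GLiNER model, since the model takes the entity types to look for as input at prediction time.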