Dymium PII Named Entity Recognition
Fine-tuned DeBERTa-v3-large for high-accuracy PII detection in enterprise and AI pipeline contexts.
Developed by Dymium, an AI data security platform enabling zero-copy data access with built-in governance and compliance.
Entities
13 PII entity types:
| Entity | Description |
|---|---|
| ADDRESS | Physical addresses |
| AUTH | Credentials, passwords, API keys |
| DATE | Dates of birth and other sensitive dates |
| EMAIL | Email addresses |
| ID_DOC | Passport, driver's license numbers |
| ID_FIN | Financial identifiers (account, card numbers) |
| ID_GOV | Government identifiers (SSN, tax IDs) |
| ID_REF | Internal reference identifiers |
| ORG | Organization names |
| PERSON | Personal names |
| PHONE | Phone numbers |
| TIME | Timestamps |
| URL | URLs |
Intended Use
- PII detection and redaction in AI pipelines
- Data governance and compliance enforcement (GDPR, HIPAA, FedRAMP)
- Sensitive data discovery before feeding to LLMs
- Real-time data access control
Usage
from transformers import pipeline

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans
ner = pipeline(
    "token-classification",
    model="dymium/dymium-pii-ner",
    aggregation_strategy="simple",
)
result = ner("Contact John Smith at john@example.com or call 555-123-4567")
print(result)
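The pipeline output can drive the redaction use case listed under Intended Use. The following is a minimal sketch, assuming each result is a dict with `entity_group`, `start`, and `end` keys (the shape `transformers` returns when an aggregation strategy is set). The `redact` helper, the bracketed placeholder format, and the hand-written sample spans (including the `EMAIL` label) are illustrative assumptions, not part of the model's API or real model output.

```python
def redact(text, entities):
    """Replace each detected span with a [LABEL] placeholder.

    `entities` is assumed to be a list of dicts with `entity_group`,
    `start`, and `end` keys, as produced by the aggregated pipeline.
    """
    redacted = text
    # Work from the end of the string so earlier offsets stay valid.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        redacted = redacted[:ent["start"]] + placeholder + redacted[ent["end"]:]
    return redacted


# Hand-written spans standing in for real model output (illustrative only).
sample_text = "Contact John Smith at john@example.com"
sample_entities = [
    {"entity_group": "PERSON", "start": 8, "end": 18},
    {"entity_group": "EMAIL", "start": 22, "end": 38},
]
print(redact(sample_text, sample_entities))  # Contact [PERSON] at [EMAIL]
```

In practice, `redact(text, ner(text))` would apply the same replacement to live predictions. This sketch does not handle overlapping spans.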
Performance
| Metric | Score |
|---|---|
| F1 | [add] |
| Precision | [add] |
| Recall | [add] |
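This card does not state how the scores are computed. One common convention for NER is entity-level exact matching, where a prediction counts only if its label and both offsets agree with a gold span. The sketch below assumes that convention. The function name and the `(label, start, end)` tuple layout are hypothetical, and the spans are toy data, not evaluation results for this model.

```python
def span_prf(gold, predicted):
    """Entity-level precision, recall, and F1 with exact span matching.

    Each span is a (label, start, end) tuple.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_pos = len(gold_set & pred_set)
    precision = true_pos / len(pred_set) if pred_set else 0.0
    recall = true_pos / len(gold_set) if gold_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Toy spans, not real evaluation data.
gold = [("PERSON", 8, 18), ("EMAIL", 22, 38), ("PHONE", 47, 59)]
pred = [("PERSON", 8, 18), ("EMAIL", 22, 38), ("ORG", 0, 7)]
print(span_prf(gold, pred))
```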
Limitations
- English language only
- Performance may vary on highly domain-specific text
- AUTH entity detection depends on context availability
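One possible mitigation for these limitations is to drop low-confidence predictions, with a stricter cutoff for labels that are more context-dependent. The sketch below assumes each pipeline result carries a `score` key alongside `entity_group`. The helper name, the threshold values, and the sample scores are illustrative assumptions, not recommendations from the model authors.

```python
def filter_by_score(entities, default_threshold=0.5, per_label=None):
    """Keep entities whose score meets the threshold for their label."""
    per_label = per_label or {}
    return [
        ent for ent in entities
        if ent["score"] >= per_label.get(ent["entity_group"], default_threshold)
    ]


# Hand-written predictions standing in for real model output.
sample = [
    {"entity_group": "PERSON", "score": 0.98},
    {"entity_group": "AUTH", "score": 0.62},
    {"entity_group": "ORG", "score": 0.41},
]
kept = filter_by_score(sample, per_label={"AUTH": 0.8})
print([ent["entity_group"] for ent in kept])  # ['PERSON']
```

Raising a threshold trades recall for precision, so in a redaction setting a lower cutoff may be the safer default.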
About Dymium
Dymium eliminates data movement risk by enabling AI agents and analytics tools to query sensitive data in place: no copying, full governance, FedRAMP-ready.
Model tree for dymium/Dymium-NER-v1
Base model
microsoft/deberta-v3-large