Dymium PII Named Entity Recognition

Fine-tuned DeBERTa-v3-large for high-accuracy PII detection in enterprise and AI pipeline contexts.

Developed by Dymium, an AI data security platform enabling zero-copy data access with built-in governance and compliance.

Entities

13 PII entity types:

Entity    Description
ADDRESS   Physical addresses
AUTH      Credentials, passwords, API keys
DATE      Dates of birth and other sensitive dates
EMAIL     Email addresses
ID_DOC    Passport, driver's license numbers
ID_FIN    Financial identifiers (account, card numbers)
ID_GOV    Government identifiers (SSN, tax IDs)
ID_REF    Internal reference identifiers
ORG       Organization names
PERSON    Personal names
PHONE     Phone numbers
TIME      Timestamps
URL       URLs

Intended Use

  • PII detection and redaction in AI pipelines
  • Data governance and compliance enforcement (GDPR, HIPAA, FedRAMP)
  • Sensitive data discovery before text is passed to LLMs
  • Real-time data access control
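As one illustration of discovery before an LLM call, the sketch below checks pipeline-style output for selected entity types above a confidence threshold. The helper name `find_blocked`, the blocked set, and the 0.5 threshold are assumptions chosen for this example, as is the premise that the model's `entity_group` values match the entity names listed above. The sample entities are hand-written, not actual model output.

```python
def find_blocked(entities, blocked=frozenset({"AUTH", "ID_GOV", "ID_FIN"}), min_score=0.5):
    """Return entities whose type is in `blocked` and whose score meets `min_score`.

    Hypothetical helper for illustration; not part of any Dymium API.
    """
    return [
        ent for ent in entities
        if ent["entity_group"] in blocked and ent["score"] >= min_score
    ]


# Hand-written entities in the shape the token-classification pipeline returns.
detected = [
    {"entity_group": "PERSON", "score": 0.99, "start": 0, "end": 10},
    {"entity_group": "ID_GOV", "score": 0.97, "start": 20, "end": 31},
    {"entity_group": "AUTH", "score": 0.30, "start": 40, "end": 52},
]

hits = find_blocked(detected)
if hits:
    # Only ID_GOV passes here: AUTH is below the threshold, PERSON is not blocked.
    print("Blocked before LLM call:", sorted({h["entity_group"] for h in hits}))
```

Which types to block and what threshold to use depend on the deployment; the values here are placeholders.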

Usage

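from transformers import pipeline

# Load the model as a token-classification pipeline.
# aggregation_strategy="simple" merges sub-word tokens into whole entity spans.
ner = pipeline(
    "token-classification",
    model="dymium/dymium-pii-ner",
    aggregation_strategy="simple",
)

# Each result is a dict with entity_group, score, word, start, and end.
result = ner("Contact John Smith at john@example.com or call 555-123-4567")
print(result)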
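
For redaction, the character offsets returned by the pipeline can be used to mask each detected span. The sketch below is one possible approach and is not part of the Dymium release: `redact` is a hypothetical helper, and it assumes each entity is a dict with `start`, `end`, and `entity_group` keys, which is the shape the token-classification pipeline returns when an aggregation strategy is set. The sample entities are hand-written for illustration and are not actual model output.

```python
def redact(text, entities):
    """Replace each detected span with a [LABEL] placeholder.

    Hypothetical helper; `entities` is assumed to be a list of dicts
    with `start`, `end`, and `entity_group` keys.
    """
    # Work right to left so earlier offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        label = f"[{ent['entity_group']}]"
        text = text[:ent["start"]] + label + text[ent["end"]:]
    return text


# Hand-written entities for illustration; real ones would come from ner(sample).
sample = "Contact John Smith at john@example.com"
entities = [
    {"entity_group": "PERSON", "start": 8, "end": 18},
    {"entity_group": "EMAIL", "start": 22, "end": 38},
]

redacted = redact(sample, entities)
print(redacted)  # Contact [PERSON] at [EMAIL]
```

Replacing from the end of the string backwards avoids recomputing offsets after each substitution.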

Performance

Metric      Score
F1          [add]
Precision   [add]
Recall      [add]
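
The card does not state how these metrics are computed. A common convention for NER is entity-level scoring by exact match on (start, end, label) spans, sketched below. Whether this matches the evaluation protocol used for this model is an assumption; `prf1` and the sample spans are hypothetical and carry no information about the model's actual scores.

```python
def prf1(gold, predicted):
    """Entity-level precision, recall, and F1 by exact span-and-label match.

    Hypothetical helper; each item is a (start, end, label) tuple.
    """
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Illustrative spans only: the third prediction has the wrong end offset,
# so it counts as both a false positive and a false negative.
gold_spans = [(8, 18, "PERSON"), (22, 38, "EMAIL"), (47, 59, "PHONE")]
pred_spans = [(8, 18, "PERSON"), (22, 38, "EMAIL"), (47, 55, "PHONE")]

p, r, f = prf1(gold_spans, pred_spans)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```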

Limitations

  • English language only
  • Performance may vary on highly domain-specific text
  • AUTH entity detection depends on context availability

About Dymium

Dymium eliminates data movement risk by enabling AI agents and analytics tools to query sensitive data in place: no copying, full governance, FedRAMP-ready.

dymium.io | Blog | Resources

