Fine-tuned distilbert-base-cased for PII (Personally Identifiable Information) detection
on the English subset of ai4privacy/pii-masking-300k.
Entity labels: PERSON, EMAIL, PHONE, USERNAME, ID_NUMBER, ADDRESS, IP_ADDRESS,
URL, DATE_TIME, PASSWORD, DEMOGRAPHIC, OTHER
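from transformers import pipeline

# Load the fine-tuned model; "simple" aggregation merges sub-word tokens into whole entity spans
ner = pipeline("ner", model="munibz/pii-distilbert-en", aggregation_strategy="simple")
text = "Hi, I'm John Smith. Email me at john.smith@gmail.com."
# Each result is a dict with entity_group, score, word, start, and end
print(ner(text))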
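A common next step is to redact the detected spans. The following is a minimal sketch, not part of the model itself: `mask_entities` and its `min_score` threshold are hypothetical helpers, and the sample entities are hand-written in the shape the pipeline returns with `aggregation_strategy="simple"` (they are not actual model output).

```python
def mask_entities(text, entities, min_score=0.5):
    """Replace each detected span with a [LABEL] placeholder."""
    # Drop low-confidence predictions (the threshold is an assumption; tune as needed).
    kept = [e for e in entities if e.get("score", 1.0) >= min_score]
    # Replace right-to-left so earlier character offsets stay valid.
    for ent in sorted(kept, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group']}]"
        text = text[:ent["start"]] + placeholder + text[ent["end"]:]
    return text


# Illustrative, hand-written entities (NOT real model output).
sample_text = "Hi, I'm John Smith. Email me at john.smith@gmail.com."
sample_entities = [
    {"entity_group": "PERSON", "score": 0.99, "word": "John Smith", "start": 8, "end": 18},
    {"entity_group": "EMAIL", "score": 0.98, "word": "john.smith@gmail.com", "start": 32, "end": 52},
]

masked = mask_entities(sample_text, sample_entities)
print(masked)
```

With the real pipeline, the same helper would be called as `mask_entities(text, ner(text))`. Overlapping spans are not handled in this sketch.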
Base model
distilbert/distilbert-base-cased