from transformers import pipeline

MODEL_ID = "bharathjanumpally/phi-span-detector-deberta-v3"


def redact_text(text: str) -> tuple[list[dict], str]:
    """Detect PHI spans in `text` and return (spans, redacted_text)."""
    # Note: the pipeline is rebuilt (and the model reloaded) on every call.
    ner = pipeline(
        "token-classification",
        model=MODEL_ID,
        # "simple" merges adjacent tokens of the same entity into one span.
        aggregation_strategy="simple",
    )
    spans = ner(text)

    redacted = text
    # Replace from the last span to the first so that the character offsets
    # of the spans still to be processed remain valid.
    for item in sorted(spans, key=lambda x: x["start"], reverse=True):
        label = item["entity_group"]
        redacted = redacted[: item["start"]] + f"[{label}]" + redacted[item["end"] :]
    return spans, redacted


if __name__ == "__main__":
    sample = (
        "Patient John Smith (MRN: 001-23-4567) visited "
        "Boston Medical Center on 12/19/2025."
    )
    spans, redacted = redact_text(sample)

    print("Spans:")
    for span in spans:
        print(span)
    print()
    print("Redacted:")
    print(redacted)
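
The replacement loop in `redact_text` can be exercised without downloading the model. The sketch below pulls that step into a standalone helper and feeds it hand-written spans. The helper name `apply_redactions`, the shortened sample sentence, and the labels `NAME` and `DATE` are assumptions made for illustration; the labels the model actually emits may differ.

```python
def apply_redactions(text: str, spans: list[dict]) -> str:
    """Replace each span's characters with a [LABEL] placeholder."""
    redacted = text
    # Work right to left so the offsets of earlier spans stay valid
    # after each replacement changes the string length.
    for item in sorted(spans, key=lambda s: s["start"], reverse=True):
        label = item["entity_group"]
        redacted = redacted[: item["start"]] + f"[{label}]" + redacted[item["end"] :]
    return redacted


# Hypothetical spans written by hand; this is not real model output.
sample = "Patient John Smith visited on 12/19/2025."
fake_spans = [
    {"entity_group": "NAME", "start": 8, "end": 18},
    {"entity_group": "DATE", "start": 30, "end": 40},
]

print(apply_redactions(sample, fake_spans))
# Patient [NAME] visited on [DATE].
```

Processing the spans left to right instead would shift every later offset by the difference in length between the original text and its placeholder, which is why the loop sorts by `start` in descending order.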