Token Classification
Transformers
Safetensors
Italian
modernbert
named-entity-recognition
pii
de-identification
privacy
medical
clinical
names
Swisscoding Name Filter
Swisscoding Name Filter is a family of ModernBERT-base token classifiers for detecting personal names in English, German, French, and Italian.
Benchmarks
Scores are percentages.
| Model | Precision | Recall | F1 |
|---|---|---|---|
| MultiGraSCCo - German | | | |
| pii-DE-name-filter-149M | 97.20 | 99.10 | 98.14 |
| pii-name-filter-149M | 96.92 | 99.40 | 98.14 |
| OpenMed/OpenMed-PII-German-SuperClinical-Large-434M-v1 | 98.07* | 56.36* | 71.58* |
| OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1 | 93.13* | 75.35* | 83.31* |
| openai/privacy-filter | 68.35* | 66.03* | 67.17* |
| MultiGraSCCo - French | | | |
| pii-FR-name-filter-149M | 99.35 | 97.68 | 98.51 |
| pii-name-filter-149M | 97.43 | 97.04 | 97.24 |
| OpenMed/OpenMed-PII-French-SuperClinical-Large-434M-v1 | 97.21* | 44.59* | 61.13* |
| OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1 | 93.30* | 85.91* | 89.45* |
| openai/privacy-filter | 78.87* | 67.37* | 72.67* |
| MultiGraSCCo - Italian | | | |
| pii-IT-name-filter-149M | 96.90 | 99.59 | 98.23 |
| pii-name-filter-149M | 98.31 | 99.59 | 98.95 |
| OpenMed/OpenMed-PII-Italian-SuperClinical-Large-434M-v1 | 95.22* | 35.14* | 51.34* |
| OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1 | 91.65* | 86.43* | 88.97* |
| openai/privacy-filter | 81.16* | 68.55* | 74.32* |
| MultiGraSCCo - English | | | |
| pii-EN-name-filter-149M | 100.00 | 99.72 | 99.86 |
| pii-name-filter-149M | 100.00 | 99.16 | 99.58 |
| OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1 | 98.39* | 94.82* | 96.57* |
| openai/privacy-filter | 97.73* | 90.15* | 93.79* |
| MultiGraSCCo - Multilingual (German, French, Italian, English) | | | |
| pii-name-filter-149M | 97.92 | 98.76 | 98.34 |
| OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1 | 93.85* | 84.36* | 88.85* |
| openai/privacy-filter | 79.47* | 71.19* | 75.10* |
| Nemotron PII | | | |
| pii-EN-name-filter-149M | 94.56 | 99.70 | 97.06 |
| pii-name-filter-149M | 93.94 | 99.58 | 96.68 |
| OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1 (first_name) | 99.48 | 99.51 | 99.50 |
| OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1 (last_name) | 99.42 | 99.29 | 99.35 |
| nvidia/gliner-PII | - | - | 87.00 |

*The MultiGraSCCo scores for the OpenMed models and for Privacy Filter were evaluated by us and are not official results released by their organizations.
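To make the relationship between the three score columns concrete, here is a minimal sketch that recomputes F1 from precision and recall for the two Italian rows. It assumes the F1 column is the usual harmonic mean of precision and recall; the card does not state whether scoring is done per token or per entity, and the helper name `f1_score` is ours.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same unit, here percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall copied from the MultiGraSCCo - Italian rows of the table.
italian_rows = {
    "pii-IT-name-filter-149M": (96.90, 99.59),
    "pii-name-filter-149M": (98.31, 99.59),
}

for model_name, (precision, recall) in italian_rows.items():
    print(f"{model_name}: F1 = {f1_score(precision, recall):.2f}")
```

Rounded to two decimals, both values agree with the F1 column of the table.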
| Model | A100 throughput |
|---|---|
| Speed | |
| pii-EN-name-filter-149M | 39.00 examples/sec |
| OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1 | 22.13 examples/sec |
| openai/privacy-filter | 0.29 examples/sec (3.42 sec/example) |
| nvidia/gliner-PII | not measured |

A100 benchmark over 1,000 Nemotron examples.
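As a rough illustration of what these throughput figures mean in practice, the sketch below converts examples/sec into wall-clock time for the 1,000-example benchmark and into a relative speed ratio. The helper names are ours, and the numbers are simply those reported in the table; actual timings will depend on hardware, batch size, and input length.

```python
# Throughput in examples/sec, copied from the A100 table above.
throughput = {
    "pii-EN-name-filter-149M": 39.00,
    "OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1": 22.13,
    "openai/privacy-filter": 0.29,
}


def seconds_for(n_examples: int, examples_per_sec: float) -> float:
    """Estimated wall-clock seconds to process n_examples at a fixed throughput."""
    return n_examples / examples_per_sec


def speed_ratio(fast: float, slow: float) -> float:
    """How many times faster the first throughput is than the second."""
    return fast / slow


for model_name, rate in throughput.items():
    print(f"{model_name}: {seconds_for(1000, rate):.1f} s per 1,000 examples")

# Inverting the rounded 0.29 examples/sec gives about 3.45 sec/example, slightly
# above the 3.42 reported in the table; the gap is consistent with the
# throughput having been rounded to two decimals.
ratio = speed_ratio(
    throughput["pii-EN-name-filter-149M"],
    throughput["OpenMed/OpenMed-PII-SuperClinical-Large-434M-v1"],
)
print(f"pii-EN vs OpenMed: {ratio:.2f}x")
```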
Quick Use
```python
from transformers import pipeline

model_id = "Swisscoding-Technologies/pii-IT-name-filter-149M"

# aggregation_strategy="simple" merges sub-word tokens into whole entity spans.
name_detector = pipeline(
    "token-classification",
    model=model_id,
    aggregation_strategy="simple",
)

text = "La paziente Alice Rossi è stata inviata dal Dr Marco Weber per un controllo."
print(name_detector(text))
```
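For de-identification, the detected spans usually need to be replaced in the original text. The following is a minimal sketch of that step, assuming the pipeline output follows the standard Transformers format for `aggregation_strategy="simple"` (a list of dicts with `entity_group`, `score`, `word`, `start`, and `end`). The helper `redact_names`, the `[NAME]` placeholder, the `NAME` label, and the sample entities are all hypothetical; the sample is hand-written to stand in for real model output so that the snippet runs without downloading the model.

```python
def redact_names(text, entities, placeholder="[NAME]", min_score=0.5):
    """Replace each detected span with a placeholder.

    Assumes spans do not overlap. Replacement runs right to left so that
    the character offsets of earlier spans stay valid.
    """
    spans = sorted(
        (int(entity["start"]), int(entity["end"]))
        for entity in entities
        if float(entity["score"]) >= min_score
    )
    for start, end in reversed(spans):
        text = text[:start] + placeholder + text[end:]
    return text


def make_entity(source, name, score):
    """Build a hand-written entity dict shaped like pipeline output (illustrative only)."""
    start = source.index(name)
    return {
        "entity_group": "NAME",  # hypothetical label; check the model's id2label
        "score": score,
        "word": name,
        "start": start,
        "end": start + len(name),
    }


text = "La paziente Alice Rossi è stata inviata dal Dr Marco Weber per un controllo."
sample_entities = [
    make_entity(text, "Alice Rossi", 0.99),
    make_entity(text, "Marco Weber", 0.98),
]

redacted = redact_names(text, sample_entities)
print(redacted)
```

With a loaded pipeline, the same helper could be called as `redact_names(text, name_detector(text))`. The `min_score` threshold is a design choice: for privacy use cases a lower threshold favors recall at the cost of occasionally redacting non-names.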
Model tree for Swisscoding-Technologies/pii-IT-name-filter-149M
Base model
answerdotai/ModernBERT-base