---
license: apache-2.0
language: en
tags:
  - ner
  - pii
  - privacy
  - token-classification
  - deberta
  - onnx
library_name: onnxruntime
pipeline_tag: token-classification
---

# Shade V5 — On-Device PII Detection

Shade V5 is a fast on-device model that detects PII (personally identifiable information) for privacy-preserving AI pipelines. It recognizes 12 entity types with an in-distribution F1 score of 97.6%.

## Quick Start

```bash
pip install veil-phantom
```

```python
from veil_phantom import VeilClient

veil = VeilClient()  # auto-downloads this model
result = veil.redact("John Smith sent $5M to john@acme.com")
result.sanitized  # "[PERSON_1] sent [AMOUNT_1] to [EMAIL_1]"
```
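To make the placeholder format in the output above concrete, here is a minimal sketch of how detected character spans could be turned into numbered tokens such as `[PERSON_1]`. It is not VeilPhantom's implementation: `redact_spans`, the hand-written spans, and the choice to reuse one placeholder for a repeated value are all assumptions made for illustration. The `AMOUNT` label is copied from the Quick Start output.

```python
from collections import defaultdict


def redact_spans(text, spans):
    """Replace (start, end, label) character spans with numbered placeholders.

    Hypothetical helper: assumes the spans do not overlap.
    """
    counters = defaultdict(int)  # running count per label
    seen = {}                    # (label, surface text) -> placeholder
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        key = (label, text[start:end])
        if key not in seen:
            counters[label] += 1
            seen[key] = f"[{label}_{counters[label]}]"
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(seen[key])
        cursor = end
    pieces.append(text[cursor:])           # untouched tail
    return "".join(pieces)


text = "John Smith sent $5M to john@acme.com"
# Spans written by hand for this example; in practice they come from the model.
spans = [(0, 10, "PERSON"), (16, 19, "AMOUNT"), (23, 36, "EMAIL")]
print(redact_spans(text, spans))
```

Keying placeholders on the surface text means a name that appears twice maps to the same token, which keeps the sanitized text coherent for a downstream model. Whether the SDK behaves this way is not stated here.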

## Model Details

| Property | Value |
|----------|-------|
| Architecture | DeBERTa-v3-xsmall |
| Parameters | 22M (backbone, excluding embeddings) |
| Format | ONNX |
| Size | 270 MB |
| Inference latency | <50 ms on CPU |
| F1 (in-distribution) | 97.6% |
| F1 (out-of-distribution) | 97.3% |
| Task | BIO Token Classification |
| Labels | 25 (12 entity types × B/I + O) |
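The label count follows from the BIO scheme: each of the 12 entity types gets a `B-` (begin) and an `I-` (inside) tag, plus a single `O` tag for everything else. The sketch below rebuilds that label set and groups a tag sequence into entity spans. The tag spelling (`B-PERSON`) and the label order are assumptions; the actual mapping is defined in `shade_label_map.json`, and `bio_to_spans` is a hypothetical helper.

```python
ENTITY_TYPES = [
    "PERSON", "ORG", "EMAIL", "PHONE", "MONEY", "DATE",
    "ADDRESS", "GOVID", "BANKACCT", "CARD", "IPADDR", "CASE",
]

# Illustrative ordering only; the real ids are defined by shade_label_map.json.
LABELS = ["O"] + [f"{prefix}-{etype}" for etype in ENTITY_TYPES for prefix in ("B", "I")]


def bio_to_spans(tags):
    """Group BIO tags into (start, end, type) token spans, end exclusive."""
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        prefix, _, etype = tag.partition("-")
        continues = prefix == "I" and etype == current
        if continues:
            continue
        if current is not None:
            spans.append((start, i, current))  # close the open span
        if prefix in ("B", "I"):
            start, current = i, etype          # a stray I- tag opens a new span
        else:
            start, current = None, None
    if current is not None:
        spans.append((start, len(tags), current))
    return spans


print(len(LABELS))
print(bio_to_spans(["B-PERSON", "I-PERSON", "O", "B-MONEY", "O", "B-EMAIL"]))
```

Treating a stray `I-` tag as the start of a new span is a lenient choice; a stricter decoder could drop such tags instead.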

## Entity Types

| Type | F1 | Examples |
|------|-----|----------|
| PERSON | 96.3% | Names (Western, African, Asian, South African) |
| ORG | 97.6% | Companies, institutions |
| EMAIL | 100% | Email addresses |
| PHONE | 98.4% | Phone numbers (international formats) |
| MONEY | 99.6% | Monetary amounts |
| DATE | 97.8% | Dates, times, schedules |
| ADDRESS | 99.4% | Street addresses |
| GOVID | 97.7% | SSNs, South African ID numbers, passport numbers |
| BANKACCT | 92.9% | Bank account numbers, IBAN |
| CARD | 100% | Credit/debit card numbers |
| IPADDR | 100% | IP addresses |
| CASE | 97.8% | Legal case numbers |

## Training

- **Base model**: microsoft/deberta-v3-xsmall
- **Training data**: 116K examples drawn from business meetings, legal proceedings, and financial transactions
- **Tokenizer**: Unigram (128K vocab)
- **OOD gap**: 0.3 percentage points of F1 (97.6% in-distribution → 97.3% out-of-distribution)

## Files

- `ShadeV5.onnx` — ONNX model (270 MB)
- `tokenizer.json` — HuggingFace fast tokenizer
- `tokenizer_config.json` — Tokenizer configuration
- `shade_label_map.json` — BIO label → entity type mapping
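
For readers who want to load these files directly rather than through the SDK, the following is a rough sketch that checks a local directory for the four files and then opens them with `onnxruntime` and `tokenizers`. The helper names `missing_files` and `load_shade` are hypothetical, and nothing here is taken from VeilPhantom's own loader. The model's input and output tensor names are not documented in this card, so the sketch stops at creating the session.

```python
from pathlib import Path

EXPECTED_FILES = (
    "ShadeV5.onnx",
    "tokenizer.json",
    "tokenizer_config.json",
    "shade_label_map.json",
)


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


def load_shade(model_dir):
    """Open the ONNX session and the tokenizer (hypothetical helper)."""
    missing = missing_files(model_dir)
    if missing:
        raise FileNotFoundError(f"missing model files: {missing}")
    # Third-party imports are deferred so the file check works without them.
    import onnxruntime as ort          # pip install onnxruntime
    from tokenizers import Tokenizer   # pip install tokenizers

    root = Path(model_dir)
    session = ort.InferenceSession(
        str(root / "ShadeV5.onnx"), providers=["CPUExecutionProvider"]
    )
    tokenizer = Tokenizer.from_file(str(root / "tokenizer.json"))
    return session, tokenizer
```

After loading, `session.get_inputs()` and `session.get_outputs()` report the tensor names and shapes the exported graph expects, which is safer than assuming them.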

## License

Apache 2.0

## Part of VeilPhantom

This model powers [VeilPhantom](https://github.com/veil-privacy/veil-phantom), an open-source PII redaction SDK for agentic AI pipelines.