# SecretGuard NER v3: PII Detection Model

A distilled 3-layer DeBERTa-v3-small model for detecting Personally Identifiable Information (PII) in text. Designed for low-latency production inference in pure Rust: no ONNX, no Python, and no C++ dependencies at runtime.
Part of the LLM SecretGuard project.
## Entity Types

| Label | Description | Examples |
|---|---|---|
| `person` | Full names, given names, surnames | "John Smith", "jean-pierre dupont" |
| `address` | Street addresses, cities, postal codes | "42 Baker Street, London" |
| `date of birth` | Birth dates in various formats | "1990-01-15", "15/03/1985" |
| `passport number` | Passport and ID card numbers | "AB1234567" |
## Architecture

DeBERTa-v3-small (3 layers, 768 hidden, disentangled attention)
→ Projection: Linear(768, 512)
→ Conv1D(512, 512, kernel=3, padding=1) + ReLU
→ LabelCrossAttention(512, 4 heads)
→ GlobalPointer: start_proj(512) + end_proj(512)

This is a custom architecture, not GLiNER. The DeBERTa backbone is initialized from urchade/gliner_small-v2.1 (layers {0, 2, 4}), but the detection head (Conv1D + CrossAttention + GlobalPointer) is our own design, trained end-to-end with a supervised NER loss.
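The GlobalPointer head scores every candidate span in one pass by adding per-label start and end logits. Below is a minimal pure-Python sketch of that additive scoring; the head internals beyond what the diagram lists (projections, widths) are assumptions, with `max_width=12` taken from the head config further down.

```python
# Hypothetical sketch of additive GlobalPointer span scoring: a span (i, j)
# for label l scores start_logits[l][i] + end_logits[l][j], so all spans up
# to max_width are scored with no sequential dependency between positions.

def span_scores(start_logits, end_logits, max_width=12):
    """Return {(label, i, j): score} for all spans with i <= j < i + max_width."""
    scores = {}
    for l, (starts_l, ends_l) in enumerate(zip(start_logits, end_logits)):
        for i, s in enumerate(starts_l):
            for j in range(i, min(i + max_width, len(ends_l))):
                scores[(l, i, j)] = s + ends_l[j]
    return scores

# Toy example with one label and four tokens: token 1 looks like a strong
# entity start and token 2 like a strong entity end.
starts = [[-2.0, 3.0, -1.0, -2.0]]
ends = [[-2.0, -1.0, 2.5, -2.0]]
scores = span_scores(starts, ends)
best = max(scores, key=scores.get)  # span (label 0, start 1, end 2)
```

Because each span score is a plain sum, decoding is a single thresholding pass over precomputed logits, which is what makes this head cheaper than marker-based span classifiers.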
## Key Design Choices
- Conv1D replaces BiLSTM: parallelizable, single GEMM instead of sequential ops
- GlobalPointer replaces SpanMarker: additive start/end scoring, simpler and faster
- tanh-GELU activation: matches Rust fused inference engine exactly
- Case augmentation (30%): robust to lowercase names ("yann degat" detected at 99%)
- Punctuation augmentation (20%): robust to trailing dots/commas ("name....." detected)
- Regex word splitting: training tokenization matches Rust inference boundaries
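The "tanh-GELU" bullet refers to the standard tanh approximation of GELU; using the same approximation in training and in the Rust engine avoids numeric drift between the two. A sketch of the formula:

```python
import math

def gelu_tanh(x):
    """tanh approximation of GELU:
    0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x**3)))"""
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

This is the same variant PyTorch exposes as `nn.GELU(approximate='tanh')`, so exported weights see identical activations at inference time.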
## Training Details
- Backbone init: Layers {0, 2, 4} from urchade/gliner_small-v2.1
- Optimizer: AdamW (lr=3e-5, weight_decay=0.01)
- Epochs: 7 (patience=5, target F1=0.82)
- Batch size: 8
- Loss: BCE with pos_weight=20
- Augmentation: 30% case (lower/mixed/upper) + 20% punctuation noise
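Since almost all candidate spans are negatives, `pos_weight=20` in the BCE loss up-weights missed entities relative to false positives. A minimal sketch of that weighting on a single logit (PyTorch's `BCEWithLogitsLoss(pos_weight=...)` applies the same weighting elementwise):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_bce(logit, target, pos_weight=20.0):
    """Binary cross-entropy on one span logit; positives weigh 20x."""
    p = sigmoid(logit)
    return -(pos_weight * target * math.log(p)
             + (1.0 - target) * math.log(1.0 - p))

# A missed positive (logit -2, target 1) costs 20x the symmetric
# false positive (logit +2, target 0).
miss = weighted_bce(-2.0, 1.0)
false_pos = weighted_bce(2.0, 0.0)
```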
## Training Data (CC-BY-4.0 compatible)

| Dataset | Examples | Source |
|---|---|---|
| urchade/synthetic-pii-ner-mistral-v1 | 18,905 | HuggingFace |
| ai4privacy/pii-masking-400k | 40,398 | HuggingFace (CC-BY-4.0) |
| International address augmentation (Faker, 20 locales + French templates) | 15,000 | Generated |
| Total | ~74,300 | |
## Model Format

Weights are stored in float16 safetensors format.

| File | Description |
|---|---|
| `model.safetensors` | Model weights (f16, 248 MB) |
| `tokenizer.json` | DeBERTa-v3-small tokenizer |
| `deberta_config.json` | Backbone config (3 layers, 768 hidden) |
| `model_config.json` | Head config (max_width=12, hidden=512) |
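The safetensors layout makes the model cheap to inspect without loading weights: the file starts with a little-endian u64 header length, followed by a UTF-8 JSON header mapping tensor names to dtype, shape, and data offsets. A sketch (the tensor name in the demo is hypothetical, not taken from this model):

```python
import json
import struct

def read_safetensors_header(path):
    """Read only the JSON header of a .safetensors file: the first 8 bytes
    are a little-endian u64 header length, then that many bytes of JSON."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))

# Build a tiny stand-in file to demonstrate ("proj.weight" is hypothetical).
header = {"proj.weight": {"dtype": "F16", "shape": [2, 2],
                          "data_offsets": [0, 8]}}
blob = json.dumps(header).encode("utf-8")
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 8)

info = read_safetensors_header("demo.safetensors")
```

This header-only read is also why the Rust engine can validate tensor names and shapes against `deberta_config.json` before mapping any weight data.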
## Inference Performance (v0.3.0)

Pure Rust fused inference engine with AVX-512 SIMD.

| Mode | Latency | Throughput |
|---|---|---|
| Single request (CPU, AVX-512) | 47 ms | n/a |
| 8 concurrent requests | 110 ms avg | 72 QPS |
| ONNX Runtime (same model, MKL) | 12 ms | n/a |
Optimizations: rayon parallel GEMM, AVX-512 LayerNorm/softmax, fused QKV projections, target-cpu=native.
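One of the listed optimizations, fused QKV projections, amounts to concatenating the three attention projection matrices column-wise so one wide GEMM replaces three separate ones. A toy pure-Python illustration of why the results are identical (the Rust engine's actual kernels are not shown here):

```python
def matmul(A, B):
    """Naive GEMM: A is (m x k), B is (k x n)."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def hcat(*mats):
    """Concatenate matrices along columns."""
    return [sum((m[i] for m in mats), []) for i in range(len(mats[0]))]

X = [[1.0, 2.0], [3.0, 4.0]]     # token activations (toy 2x2)
Wq = [[1.0, 0.0], [0.0, 1.0]]
Wk = [[2.0, 0.0], [0.0, 2.0]]
Wv = [[0.0, 1.0], [1.0, 0.0]]

# One fused GEMM against [Wq | Wk | Wv] ...
fused = matmul(X, hcat(Wq, Wk, Wv))
# ... then slice the output columns back into Q, K, V.
q = [row[0:2] for row in fused]
k = [row[2:4] for row in fused]
v = [row[4:6] for row in fused]
```

The fused form reads `X` from memory once instead of three times and gives the GEMM kernel a wider, more parallel-friendly output tile.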
## Training & Reproduction

Full training guide: Training Documentation

```shell
git clone https://codeberg.org/roobai/llm-secretguard.git
cd llm-secretguard/training
make all  # setup → data → train → export
```
## Version
- Model version: v3 (SecretGuard NER)
- Project version: v0.3.0
- Framework: PyTorch (training) + Rust (inference)
## Evaluation Results

- Entity-level F1 on ai4privacy/pii-masking-400k (validation): 0.78 (self-reported)