# SecretGuard NER v3: PII Detection Model

A distilled 3-layer DeBERTa-v3-small model for detecting Personally Identifiable Information (PII) in text. Designed for low-latency production inference in pure Rust: no ONNX, no Python, and no C++ dependencies at runtime.
Part of the LLM SecretGuard project.
## Entity Types

| Label | Description | Examples |
|---|---|---|
| `person` | Full names, given names, surnames | "John Smith", "jean-pierre dupont" |
| `address` | Street addresses, cities, postal codes | "42 Baker Street, London" |
| `date of birth` | Birth dates in various formats | "1990-01-15", "15/03/1985" |
| `passport number` | Passport and ID card numbers | "AB1234567" |
## Architecture

DeBERTa-v3-small (3 layers, 768 hidden, disentangled attention)
→ Projection: Linear(768, 512)
→ Conv1D(512, 512, kernel=3, padding=1) + ReLU
→ LabelCrossAttention(512, 4 heads)
→ GlobalPointer: start_proj(512) + end_proj(512)

This is a custom architecture, not GLiNER. The DeBERTa backbone is initialized from urchade/gliner_small-v2.1 (layers {0, 2, 4}), but the detection head (Conv1D + CrossAttention + GlobalPointer) is our own design, trained end-to-end with a supervised NER loss.
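The GlobalPointer head scores every candidate span in one pass by adding per-label start and end logits. Below is a minimal pure-Python sketch of that additive scoring; the head internals beyond what the diagram lists (projections, widths) are assumptions, with `max_width=12` taken from the head config further down.

```python
# Hypothetical sketch of additive GlobalPointer span scoring: a span (i, j)
# for label l scores start_logits[l][i] + end_logits[l][j], so all spans up
# to max_width are scored with no sequential dependency between positions.

def span_scores(start_logits, end_logits, max_width=12):
    """Return {(label, i, j): score} for all spans with i <= j < i + max_width."""
    scores = {}
    for l, (starts_l, ends_l) in enumerate(zip(start_logits, end_logits)):
        for i, s in enumerate(starts_l):
            for j in range(i, min(i + max_width, len(ends_l))):
                scores[(l, i, j)] = s + ends_l[j]
    return scores

# Toy example with one label and four tokens: token 1 looks like a strong
# entity start and token 2 like a strong entity end.
starts = [[-2.0, 3.0, -1.0, -2.0]]
ends = [[-2.0, -1.0, 2.5, -2.0]]
scores = span_scores(starts, ends)
best = max(scores, key=scores.get)  # span (label 0, start 1, end 2)
```

Because each span score is a plain sum, decoding is a single thresholding pass over precomputed logits, which is what makes this head cheaper than marker-based span classifiers.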
## Key Design Choices
- Conv1D replaces BiLSTM: parallelizable, single GEMM instead of sequential ops
- GlobalPointer replaces SpanMarker: additive start/end scoring, simpler and faster
- tanh-GELU activation: matches Rust fused inference engine exactly
- Case augmentation (30%): robust to lowercase names ("yann degat" detected at 99%)
- Punctuation augmentation (20%): robust to trailing dots/commas ("name....." detected)
- Regex word splitting: training tokenization matches Rust inference boundaries
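The "tanh-GELU" bullet refers to the standard tanh approximation of GELU; using the same approximation in training and in the Rust engine avoids numeric drift between the two. A sketch of the formula:

```python
import math

def gelu_tanh(x):
    """tanh approximation of GELU:
    0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x**3)))"""
    return 0.5 * x * (1.0 + math.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))
```

This is the same variant PyTorch exposes as `nn.GELU(approximate='tanh')`, so exported weights see identical activations at inference time.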
## Training Details
- Backbone init: Layers {0, 2, 4} from urchade/gliner_small-v2.1
- Optimizer: AdamW (lr=3e-5, weight_decay=0.01)
- Epochs: 7 (patience=5, target F1=0.82)
- Batch size: 8
- Loss: BCE with pos_weight=20
- Augmentation: 30% case (lower/mixed/upper) + 20% punctuation noise
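Since almost all candidate spans are negatives, `pos_weight=20` in the BCE loss up-weights missed entities relative to false positives. A minimal sketch of that weighting on a single logit (PyTorch's `BCEWithLogitsLoss(pos_weight=...)` applies the same weighting elementwise):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def weighted_bce(logit, target, pos_weight=20.0):
    """Binary cross-entropy on one span logit; positives weigh 20x."""
    p = sigmoid(logit)
    return -(pos_weight * target * math.log(p)
             + (1.0 - target) * math.log(1.0 - p))

# A missed positive (logit -2, target 1) costs 20x the symmetric
# false positive (logit +2, target 0).
miss = weighted_bce(-2.0, 1.0)
false_pos = weighted_bce(2.0, 0.0)
```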
## Training Data (CC-BY-4.0 compatible)

| Dataset | Examples | Source |
|---|---|---|
| urchade/synthetic-pii-ner-mistral-v1 | 18,905 | HuggingFace |
| ai4privacy/pii-masking-400k | 40,398 | HuggingFace (CC-BY-4.0) |
| International address augmentation (Faker, 20 locales + French templates) | 15,000 | Generated |
| Total | ~74,300 | |
## Model Format

Weights are stored in float16 safetensors format.

| File | Description |
|---|---|
| `model.safetensors` | Model weights (f16, 248 MB) |
| `tokenizer.json` | DeBERTa-v3-small tokenizer |
| `deberta_config.json` | Backbone config (3 layers, 768 hidden) |
| `model_config.json` | Head config (max_width=12, hidden=512) |
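The safetensors layout makes the model cheap to inspect without loading weights: the file starts with a little-endian u64 header length, followed by a UTF-8 JSON header mapping tensor names to dtype, shape, and data offsets. A sketch (the tensor name in the demo is hypothetical, not taken from this model):

```python
import json
import struct

def read_safetensors_header(path):
    """Read only the JSON header of a .safetensors file: the first 8 bytes
    are a little-endian u64 header length, then that many bytes of JSON."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))

# Build a tiny stand-in file to demonstrate ("proj.weight" is hypothetical).
header = {"proj.weight": {"dtype": "F16", "shape": [2, 2],
                          "data_offsets": [0, 8]}}
blob = json.dumps(header).encode("utf-8")
with open("demo.safetensors", "wb") as f:
    f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * 8)

info = read_safetensors_header("demo.safetensors")
```

This header-only read is also why the Rust engine can validate tensor names and shapes against `deberta_config.json` before mapping any weight data.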
## Inference Performance (v0.3.0)

Pure Rust fused inference engine with AVX-512 SIMD.

| Mode | Latency | Throughput |
|---|---|---|
| Single request (CPU, AVX-512) | 47 ms | n/a |
| 8 concurrent requests | 110 ms avg | 72 QPS |
| ONNX Runtime (same model, MKL) | 12 ms | n/a |
Optimizations: rayon parallel GEMM, AVX-512 LayerNorm/softmax, fused QKV projections, target-cpu=native.
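One of the listed optimizations, fused QKV projections, amounts to concatenating the three attention projection matrices column-wise so one wide GEMM replaces three separate ones. A toy pure-Python illustration of why the results are identical (the Rust engine's actual kernels are not shown here):

```python
def matmul(A, B):
    """Naive GEMM: A is (m x k), B is (k x n)."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def hcat(*mats):
    """Concatenate matrices along columns."""
    return [sum((m[i] for m in mats), []) for i in range(len(mats[0]))]

X = [[1.0, 2.0], [3.0, 4.0]]     # token activations (toy 2x2)
Wq = [[1.0, 0.0], [0.0, 1.0]]
Wk = [[2.0, 0.0], [0.0, 2.0]]
Wv = [[0.0, 1.0], [1.0, 0.0]]

# One fused GEMM against [Wq | Wk | Wv] ...
fused = matmul(X, hcat(Wq, Wk, Wv))
# ... then slice the output columns back into Q, K, V.
q = [row[0:2] for row in fused]
k = [row[2:4] for row in fused]
v = [row[4:6] for row in fused]
```

The fused form reads `X` from memory once instead of three times and gives the GEMM kernel a wider, more parallel-friendly output tile.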
## Training & Reproduction

Full training guide: Training Documentation

```shell
git clone https://codeberg.org/roobai/llm-secretguard.git
cd llm-secretguard/training
make all  # setup → data → train → export
```
## Version
- Model version: v3 (SecretGuard NER)
- Project version: v0.3.0
- Framework: PyTorch (training) + Rust (inference)
## Evaluation Results

- Entity-level F1 on ai4privacy/pii-masking-400k (validation): 0.78 (self-reported)