# Kiji PII Detection Model (ONNX Quantized)

INT8-quantized ONNX version of the Kiji PII detection model for efficient CPU inference. Detects Personally Identifiable Information (PII) in text with coreference resolution.
## Source Model

This is a quantized version of DataikuNLP/kiji-pii-model, a multi-task DistilBERT model fine-tuned for PII detection with coreference resolution.
## Model Summary

| Property | Value |
|---|---|
| Format | ONNX (INT8 quantized) |
| Architecture | Shared DistilBERT encoder + two classification heads |
| Tasks | PII token classification (53 labels) + coreference detection (7 labels) |
| PII entity types | 26 |
| Max sequence length | 512 tokens |
| Runtime | ONNX Runtime |
## Files

| File | Size |
|---|---|
| `model_quantized.onnx` | 63.3 MB |
| `model.onnx.data` | 248.9 MB |
| `ort_config.json` | 0.7 KB |
| `label_mappings.json` | 2.9 KB |
| `model_manifest.json` | 1.6 KB |
| `tokenizer_config.json` | 1.3 KB |
| `tokenizer.json` | 653.2 KB |
| `vocab.txt` | 208.4 KB |
| `special_tokens_map.json` | 0.7 KB |
## Quantization Details

| Property | Value |
|---|---|
| Method | Dynamic quantization (ONNX Runtime / Optimum) |
| Weights | QInt8 (symmetric, per-channel) |
| Activations | QUInt8 (asymmetric, per-tensor) |
| Mode | IntegerOps |
| Format | QOperator |
| Operators quantized | Conv, MatMul, Attention, LSTM, Gather, Transpose, EmbedLayerNormalization |
## Usage

```python
import numpy as np
from huggingface_hub import hf_hub_download
from onnxruntime import InferenceSession
from transformers import AutoTokenizer

# Load tokenizer and model (hf_hub_download fetches the file from the Hub;
# pass a local file path to InferenceSession instead if you already have one)
tokenizer = AutoTokenizer.from_pretrained("DataikuNLP/kiji-pii-model-onnx")
model_path = hf_hub_download("DataikuNLP/kiji-pii-model-onnx", "model_quantized.onnx")
session = InferenceSession(model_path)

# Tokenize
text = "Contact John Smith at john.smith@example.com or call +1-555-123-4567."
inputs = tokenizer(text, return_tensors="np", truncation=True, max_length=512)

# Run inference
outputs = session.run(None, dict(inputs))
pii_logits, coref_logits = outputs  # (1, seq_len, 53), (1, seq_len, 7)

# Decode PII predictions
pii_predictions = np.argmax(pii_logits, axis=-1)[0]
# See label_mappings.json for the label ID -> label name mapping
```
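The per-token predictions can then be grouped into entity spans by walking the BIO tags. A minimal sketch (the `bio_to_spans` helper and the example label sequence are illustrative, not part of this repo):

```python
def bio_to_spans(labels):
    """Group a per-token BIO label sequence into (entity_type, start, end) spans.

    `end` is exclusive. Assumes labels look like "O", "B-EMAIL", "I-EMAIL".
    """
    spans = []
    current, start = None, None
    for i, label in enumerate(labels):
        if label.startswith("B-"):
            if current is not None:
                spans.append((current, start, i))
            current, start = label[2:], i
        elif label.startswith("I-") and current == label[2:]:
            continue  # entity continues
        else:
            if current is not None:
                spans.append((current, start, i))
            current, start = None, None
    if current is not None:
        spans.append((current, start, len(labels)))
    return spans

# Example: token-level labels for "Contact John Smith at john.smith@example.com"
labels = ["O", "B-FIRSTNAME", "B-SURNAME", "O", "B-EMAIL", "I-EMAIL", "I-EMAIL"]
print(bio_to_spans(labels))  # [('FIRSTNAME', 1, 2), ('SURNAME', 2, 3), ('EMAIL', 4, 7)]
```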
## PII Labels (BIO tagging)

The model uses BIO tagging with 26 entity types:
| Label | Description |
|---|---|
| AGE | Age |
| BUILDINGNUM | Building number |
| CITY | City |
| COMPANYNAME | Company name |
| COUNTRY | Country |
| CREDITCARDNUMBER | Credit card number |
| DATEOFBIRTH | Date of birth |
| DRIVERLICENSENUM | Driver's license number |
| EMAIL | Email address |
| FIRSTNAME | First name |
| IBAN | IBAN |
| IDCARDNUM | ID card number |
| LICENSEPLATENUM | License plate number |
| NATIONALID | National ID |
| PASSPORTID | Passport ID |
| PASSWORD | Password |
| PHONENUMBER | Phone number |
| SECURITYTOKEN | API security token |
| SSN | Social Security Number |
| STATE | State |
| STREET | Street |
| SURNAME | Last name |
| TAXNUM | Tax number |
| URL | URL |
| USERNAME | Username |
| ZIP | Zip code |
Each entity type has B- (beginning) and I- (inside) variants, plus O for non-PII tokens.
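The 53-label figure in the model summary follows directly from this scheme: a B- and an I- variant for each of the 26 entity types, plus the single O tag. A quick sanity check:

```python
entity_types = 26
# Two positional variants (B-, I-) per entity type, plus the "O" (non-PII) tag
num_labels = entity_types * 2 + 1
print(num_labels)  # 53
```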
## Coreference Labels

| Label | Description |
|---|---|
| NO_COREF | Token is not part of a coreference cluster |
| CLUSTER_0 – CLUSTER_3 | Token belongs to coreference cluster 0–3 |
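Decoded coreference labels can be turned into clusters by collecting the token positions that share a cluster label. A minimal sketch, assuming label names as above (check `label_mappings.json` for the exact ID-to-name mapping); the example label sequence is illustrative:

```python
from collections import defaultdict

def group_coref_clusters(labels):
    """Collect token indices per coreference cluster, skipping NO_COREF tokens."""
    clusters = defaultdict(list)
    for i, label in enumerate(labels):
        if label.startswith("CLUSTER_"):
            clusters[label].append(i)
    return dict(clusters)

# Illustrative per-token labels: tokens 0, 3, and 5 refer to the same entity
labels = ["CLUSTER_0", "NO_COREF", "NO_COREF", "CLUSTER_0", "NO_COREF", "CLUSTER_0"]
print(group_coref_clusters(labels))  # {'CLUSTER_0': [0, 3, 5]}
```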
## Training Data

The source model was trained on the DataikuNLP/kiji-pii-training-data dataset, a synthetic multilingual PII dataset with entity annotations and coreference resolution.
## Lineage
| Stage | Repository |
|---|---|
| Dataset | DataikuNLP/kiji-pii-training-data |
| Trained model | DataikuNLP/kiji-pii-model |
| Quantized model | DataikuNLP/kiji-pii-model-onnx (this repo) |
## Limitations

- Trained on synthetically generated data, so it may not generalize to all real-world text
- Coreference head supports up to 4 clusters per sequence
- Optimized for the 6 languages in the training data (English, German, French, Spanish, Dutch, Danish)
- Max sequence length is 512 tokens
- Quantization may slightly reduce accuracy compared to the full-precision model
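For text longer than the 512-token limit, a common workaround (not provided by this repo) is to split the token sequence into overlapping windows, run each window through the model, and reconcile the overlapping predictions. A sketch of the windowing step:

```python
def chunk_with_overlap(token_ids, max_len=512, stride=128):
    """Split a long token-id sequence into overlapping windows.

    Each window holds at most `max_len` tokens; consecutive windows overlap
    by `stride` tokens so an entity near a boundary appears whole in at
    least one window. Predictions in overlaps must be merged downstream.
    """
    if len(token_ids) <= max_len:
        return [token_ids]
    step = max_len - stride
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids) - stride, step)]

windows = chunk_with_overlap(list(range(1000)), max_len=512, stride=128)
print(len(windows), [len(w) for w in windows])  # 3 [512, 512, 232]
```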
## Model Tree for DataikuNLP/kiji-pii-model-onnx

- Base model: distilbert/distilbert-base-cased
- Fine-tuned: DataikuNLP/kiji-pii-model