OpenMed Privacy Filter Multilingual v2 - MLX BF16

A native MLX port of OpenMed/privacy-filter-multilingual-v2 for PII detection and de-identification on Apple Silicon with OpenMed. This is the unquantized BF16 reference artifact. For the 8-bit sibling, see OpenMed/privacy-filter-multilingual-v2-mlx-8bit.

Family at a glance:

PyTorch source: OpenMed/privacy-filter-multilingual-v2

MLX BF16 (this repo): Apple Silicon, unquantized BF16, 2.6 GiB weights

MLX 8-bit: OpenMed/privacy-filter-multilingual-v2-mlx-8bit - Apple Silicon, 1.4 GiB weights
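
The two weight sizes above can be sanity-checked with a little arithmetic. The sketch below assumes that nearly all of the BF16 file is parameter data at 2 bytes per parameter (it ignores safetensors header overhead), so the resulting count is an estimate and not a figure from the source card.

```python
GIB = 1024 ** 3  # bytes per GiB


def approx_param_count(size_gib: float, bytes_per_param: float) -> float:
    """Estimate the parameter count from a weight-file size (assumes no overhead)."""
    return size_gib * GIB / bytes_per_param


# BF16 stores each parameter in 2 bytes.
bf16_params = approx_param_count(2.6, 2.0)

# Effective storage per parameter in the 8-bit sibling, using the same estimate.
q8_bytes_per_param = 1.4 * GIB / bf16_params

print(f"estimated parameters: {bf16_params / 1e9:.2f}B")
print(f"8-bit bytes per parameter: {q8_bytes_per_param:.2f}")
```

Under these assumptions the 8-bit file comes out slightly above 1 byte per parameter, which would be consistent with quantization scales or some unquantized tensors being stored alongside the 8-bit values.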

At a glance

Source checkpoint: OpenMed/privacy-filter-multilingual-v2
OpenMed MLX repo: OpenMed/privacy-filter-multilingual-v2-mlx
Label schema: 54 fine-grained multilingual PII categories
Output space: 217 BIOES classes (O plus B/I/E/S for each category)
Languages: 16, per the source card: ar, bn, de, en, es, fr, hi, it, ja, ko, nl, pt, te, tr, vi, zh
Weight format: safetensors
Quantization: none (BF16 reference)
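
The 217-class output space follows directly from the 54 categories: four positional tags per category plus a single O tag. A minimal sketch of that count is shown below; the placeholder category names and the `B-`/`I-`/`E-`/`S-` string format are assumptions for illustration, and the authoritative labels are the ones shipped in id2label.json.

```python
def build_bioes_labels(categories):
    """Return ["O"] followed by B/I/E/S tags for every category."""
    labels = ["O"]
    for category in categories:
        for prefix in ("B", "I", "E", "S"):
            labels.append(f"{prefix}-{category}")  # tag format is an assumption
    return labels


# Placeholder names standing in for the 54 real categories.
placeholder_categories = [f"CATEGORY_{i}" for i in range(54)]
labels = build_bioes_labels(placeholder_categories)

print(len(labels))  # 54 * 4 + 1
```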

Q8 sibling validation

The 8-bit sibling was compared against this BF16 artifact on 10 golden PII samples. Decoded entity spans matched across all samples. Average Q8/BF16 argmax agreement was 100.00% with average logit MAE 0.1902; average local forward time was 15.1 ms for BF16 vs 8.4 ms for Q8.
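
For readers who want to reproduce this kind of comparison, the sketch below shows one plausible way to compute argmax agreement and logit MAE from two logit arrays. It uses NumPy on synthetic data; the function name `compare_logits` is hypothetical, and the exact procedure behind the reported numbers is not specified in this card.

```python
import numpy as np


def compare_logits(reference, candidate):
    """Return (argmax agreement, mean absolute error) for two logit arrays.

    Both arrays are expected to have shape (tokens, classes).
    """
    reference = np.asarray(reference, dtype=np.float32)
    candidate = np.asarray(candidate, dtype=np.float32)
    agreement = float(np.mean(reference.argmax(axis=-1) == candidate.argmax(axis=-1)))
    mae = float(np.mean(np.abs(reference - candidate)))
    return agreement, mae


# Synthetic stand-ins for BF16 and Q8 logits over 12 tokens and 217 classes.
rng = np.random.default_rng(0)
bf16_logits = rng.normal(size=(12, 217)).astype(np.float32)
q8_logits = bf16_logits + rng.normal(scale=0.01, size=bf16_logits.shape).astype(np.float32)

agreement, mae = compare_logits(bf16_logits, q8_logits)
print(f"argmax agreement: {agreement:.2%}, logit MAE: {mae:.4f}")
```

A nonzero MAE with full argmax agreement, as reported above, means the quantized logits shift slightly without changing which class wins at any token.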

What it does

This model is an MLX packaging of OpenMed/privacy-filter-multilingual-v2, the second-generation multilingual checkpoint for fine-grained PII extraction across 16 languages. It uses OpenAI's Privacy Filter architecture and predicts 217 BIOES classes (O plus B/I/E/S for each category). The OpenMed PrivacyFilterMLXPipeline runs BIOES-aware Viterbi decoding so callers receive grouped spans instead of raw token tags.
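
To illustrate what BIOES-aware Viterbi decoding buys over a per-token argmax, here is a minimal sketch with hard transition constraints, written with NumPy. It is not the PrivacyFilterMLXPipeline implementation: the function names, the `B-`/`I-`/`E-`/`S-` tag format, and the use of 0/-inf transition scores are all assumptions for illustration.

```python
import numpy as np

NEG_INF = -1e9  # score assigned to forbidden transitions


def split_tag(tag):
    """Split "B-EMAIL" into ("B", "EMAIL"); "O" becomes ("O", None)."""
    if tag == "O":
        return "O", None
    prefix, category = tag.split("-", 1)
    return prefix, category


def allowed(prev_tag, next_tag):
    """BIOES rule: an open span (B/I) must continue with I/E of the same category."""
    prev_prefix, prev_cat = split_tag(prev_tag)
    next_prefix, next_cat = split_tag(next_tag)
    if prev_prefix in ("B", "I"):
        return next_prefix in ("I", "E") and next_cat == prev_cat
    return next_prefix in ("O", "B", "S")


def viterbi_bioes(scores, labels):
    """Return the highest-scoring valid tag sequence for (tokens, classes) scores."""
    scores = np.asarray(scores, dtype=np.float64)
    num_tokens, num_labels = scores.shape
    trans = np.array([[0.0 if allowed(a, b) else NEG_INF for b in labels] for a in labels])
    start = np.array([0.0 if split_tag(l)[0] in ("O", "B", "S") else NEG_INF for l in labels])
    end = np.array([0.0 if split_tag(l)[0] in ("O", "E", "S") else NEG_INF for l in labels])

    best = start + scores[0]
    back = np.zeros((num_tokens, num_labels), dtype=int)
    for t in range(1, num_tokens):
        cand = best[:, None] + trans  # cand[i, j]: best path ending in i, then moving to j
        back[t] = cand.argmax(axis=0)
        best = cand.max(axis=0) + scores[t]

    path = [int((best + end).argmax())]
    for t in range(num_tokens - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [labels[i] for i in reversed(path)]


def group_spans(tags):
    """Collapse a valid BIOES sequence into (category, start, end_exclusive) spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        prefix, category = split_tag(tag)
        if prefix == "S":
            spans.append((category, i, i + 1))
        elif prefix == "B":
            start = i
        elif prefix == "E" and start is not None:
            spans.append((category, start, i + 1))
            start = None
    return spans


labels = ["O", "B-EMAIL", "I-EMAIL", "E-EMAIL", "S-EMAIL"]
# Per-token argmax here would give O, B, I, I, which never closes the span.
scores = [
    [2.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0, 1.5],
    [0.0, 0.0, 2.0, 1.0, 0.0],
    [0.0, 0.0, 2.0, 1.5, 0.0],
]
tags = viterbi_bioes(scores, labels)
spans = group_spans(tags)
print(tags, spans)
```

In this toy input the constrained decoder ends the span with an E tag instead of a dangling I, which is what lets the grouping step return one clean span.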

Label coverage highlights:

Identity: FIRSTNAME, MIDDLENAME, LASTNAME, AGE, GENDER, USERNAME, OCCUPATION, ORGANIZATION
Contact and address: EMAIL, PHONE, URL, STREET, BUILDINGNUMBER, CITY, COUNTY, STATE, ZIPCODE
Financial and crypto: BANKACCOUNT, IBAN, BIC, CREDITCARD, CVV, PIN, BITCOINADDRESS, ETHEREUMADDRESS
Vehicle, digital, and auth: VIN, VRM, IPADDRESS, MACADDRESS, IMEI, PASSWORD
Date and amount: DATE, DATEOFBIRTH, TIME, AMOUNT, CURRENCY, CURRENCYCODE

The full label map is included in id2label.json.

Architecture

Field	Value
Source model type	`openai_privacy_filter`
Source architecture	`OpenAIPrivacyFilterForTokenClassification`
Hidden size	640
Transformer layers	8
Attention	Grouped-query attention (14 query heads / 2 KV heads, head_dim=64) with attention sinks
FFN	Sparse Mixture-of-Experts - 128 experts, top-4 routing, SwiGLU
Position encoding	YARN-scaled RoPE (`rope_theta=150000`, factor=32)
Context length	131,072 tokens (initial 4,096)
Tokenizer	`o200k_base` / tiktoken-compatible tokenizer assets, vocab 200,064
Output head	Linear(640 -> 217) with bias
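
Several figures in the table are related by simple arithmetic. The sketch below derives them from the listed values only; the variable names are illustrative and are not the keys used in config.json, and the projection widths assume the usual grouped-query attention layout.

```python
# Values copied from the architecture table.
hidden_size = 640
num_query_heads = 14
num_kv_heads = 2
head_dim = 64
initial_context = 4096
yarn_factor = 32
num_experts = 128
experts_per_token = 4
num_classes = 217

# Grouped-query attention: each KV head is shared by a group of query heads.
queries_per_kv_head = num_query_heads // num_kv_heads
query_width = num_query_heads * head_dim  # wider than hidden_size in this model
kv_width = num_kv_heads * head_dim

# YaRN scaling stretches the initial context window by the stated factor.
extended_context = initial_context * yarn_factor

# Sparse MoE: only a small fraction of experts run for each token.
active_expert_fraction = experts_per_token / num_experts

# Linear(640 -> 217) with bias: weight matrix plus one bias per class.
output_head_params = hidden_size * num_classes + num_classes

print(queries_per_kv_head, query_width, kv_width)
print(extended_context, active_expert_fraction, output_head_params)
```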

File set

File	Size	Purpose
`weights.safetensors`	2.6 GiB	MLX weights
`config.json`	17.6 KiB	Model and OpenMed MLX runtime config
`id2label.json`	4.8 KiB	Numeric ID to BIOES label mapping
`openmed-mlx.json`	0.7 KiB	OpenMed MLX artifact manifest
`tokenizer.json`	27 MiB	Tokenizer asset kept with the artifact
`tokenizer_config.json`	0.2 KiB	Tokenizer metadata

The MLX runtime uses the tiktoken-compatible o200k_base tokenizer path. tokenizer.json and tokenizer_config.json are bundled so consumers can inspect the tokenizer assets and keep the artifact self-contained.
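
Before loading a local copy, it can help to confirm that the snapshot contains every file in the table above. The helper below is a hypothetical convenience using only the standard library, not part of OpenMed; the demo runs against a temporary directory instead of a real download.

```python
import tempfile
from pathlib import Path

# File names taken from the file-set table.
EXPECTED_FILES = (
    "weights.safetensors",
    "config.json",
    "id2label.json",
    "openmed-mlx.json",
    "tokenizer.json",
    "tokenizer_config.json",
)


def missing_files(snapshot_dir):
    """Return the expected file names that are absent from snapshot_dir."""
    root = Path(snapshot_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


# Demo: a temporary directory that only contains two of the six files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("config.json", "id2label.json"):
        (Path(tmp) / name).write_text("{}")
    demo_missing = missing_files(tmp)

print(demo_missing)
```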

Quick start

With OpenMed

pip install -U "openmed[mlx]"

from openmed import extract_pii, deidentify
from openmed.core import OpenMedConfig

model_name = "OpenMed/privacy-filter-multilingual-v2-mlx"
text = (
    "Patient Sarah Johnson (DOB 03/15/1985), MRN 4872910, "
    "phone 415-555-0123, email sarah.johnson@example.com."
)

result = extract_pii(
    text,
    model_name=model_name,
    config=OpenMedConfig(backend="mlx"),
)
for ent in result.entities:
    print(ent.label, ent.text, round(ent.confidence, 4))

masked = deidentify(
    text,
    method="mask",
    model_name=model_name,
    config=OpenMedConfig(backend="mlx"),
)
print(masked.deidentified_text)

For non-MLX hosts, use the source PyTorch checkpoint OpenMed/privacy-filter-multilingual-v2.
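
As a rough mental model of what the "mask" method produces, the sketch below replaces detected spans with bracketed label placeholders. This is an illustration in plain Python, not OpenMed's implementation: the `(label, start, end)` character-offset tuples and the `[LABEL]` placeholder format are assumptions.

```python
def mask_spans(text, spans):
    """Replace each (label, start, end) character span with a [LABEL] placeholder."""
    pieces, cursor = [], 0
    for label, start, end in sorted(spans, key=lambda span: span[1]):
        if start < cursor:
            continue  # skip spans that overlap one already masked
        pieces.append(text[cursor:start])
        pieces.append(f"[{label}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


sample = "Email sarah.johnson@example.com or call 415-555-0123."
email = "sarah.johnson@example.com"
phone = "415-555-0123"
sample_spans = [
    ("PHONE", sample.index(phone), sample.index(phone) + len(phone)),
    ("EMAIL", sample.index(email), sample.index(email) + len(email)),
]
masked_sample = mask_spans(sample, sample_spans)
print(masked_sample)
```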

Direct MLX usage

from huggingface_hub import snapshot_download
from openmed.mlx.inference import PrivacyFilterMLXPipeline

model_path = snapshot_download("OpenMed/privacy-filter-multilingual-v2-mlx")
pipe = PrivacyFilterMLXPipeline(model_path)

print(pipe("Email me at alice.smith@example.com after 5pm."))

Loading from a local snapshot

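import mlx.core as mx
from openmed.mlx.models import load_model

# Load the model directly from a local copy of this repository.
model = load_model("/path/to/privacy-filter-multilingual-v2-mlx")

# Dummy token IDs for a smoke test; real inputs come from the o200k_base tokenizer.
ids = mx.array([[1, 100, 200, 300]], dtype=mx.int32)
mask = mx.ones((1, 4), dtype=mx.bool_)

# One row of class logits per input token.
logits = model(ids, attention_mask=mask)
print(logits.shape)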

Hardware notes

Designed for Apple Silicon with MLX.
CPU inference may work, but GPU-backed MLX on M-series Macs is the intended runtime.
Install the Python package with pip install -U "openmed[mlx]".
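
A script that may run on mixed hardware can check for Apple Silicon before choosing the MLX backend. The helper below is a hypothetical sketch using the standard library; note that a Python interpreter running under Rosetta typically reports x86_64 and would fail this check.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True when the host looks like macOS on an arm64 (M-series) CPU."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Darwin" and machine == "arm64"


if is_apple_silicon():
    print("Apple Silicon detected: the MLX artifact is a good fit.")
else:
    print("Not Apple Silicon: consider the source PyTorch checkpoint.")
```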

Credits

This artifact builds on:

OpenMed/privacy-filter-multilingual-v2 by OpenMed
openai/privacy-filter and OpenAI's opf training/evaluation tooling
The datasets listed in the model-card metadata above
Apple's MLX framework

License

Apache 2.0, matching the source checkpoint metadata.
