OpenMed Privacy Filter Multilingual v2 - MLX BF16

A native MLX port of OpenMed/privacy-filter-multilingual-v2 for PII detection and de-identification on Apple Silicon with OpenMed. This is the unquantized BF16 reference artifact. For the 8-bit sibling, see OpenMed/privacy-filter-multilingual-v2-mlx-8bit.

At a glance

  • Source checkpoint: OpenMed/privacy-filter-multilingual-v2
  • OpenMed MLX repo: OpenMed/privacy-filter-multilingual-v2-mlx
  • Label schema: 54 fine-grained multilingual PII categories
  • Output space: 217 BIOES classes (O plus B/I/E/S for each category)
  • Languages: 16, per the source card (ar, bn, de, en, es, fr, hi, it, ja, ko, nl, pt, te, tr, vi, zh)
  • Weight format: safetensors
  • Quantization: none (BF16 reference)
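
The 217-class figure follows from the BIOES scheme: one O tag plus four positional tags for each of the 54 categories. A minimal sketch of that arithmetic, assuming a `B-EMAIL`-style tag format (the actual strings in id2label.json may differ) and using placeholder category names rather than the real ones:

```python
def build_bioes_labels(categories):
    """Return the BIOES tag list: "O" plus B-/I-/E-/S- tags for each category."""
    labels = ["O"]
    for cat in categories:
        labels.extend(f"{prefix}-{cat}" for prefix in ("B", "I", "E", "S"))
    return labels


# Placeholder names standing in for the 54 real categories (assumption).
categories = [f"CATEGORY_{i}" for i in range(54)]
labels = build_bioes_labels(categories)
print(len(labels))  # 1 + 4 * 54 = 217
```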

Q8 sibling validation

The 8-bit sibling was compared against this BF16 artifact on 10 golden PII samples. Decoded entity spans matched on all 10 samples. Q8/BF16 argmax agreement averaged 100.00%, with an average logit mean absolute error (MAE) of 0.1902. Average local forward time was 15.1 ms for BF16 versus 8.4 ms for Q8.
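
The card does not spell out how the two metrics were computed. One plausible reading, sketched below under that assumption, is that argmax agreement is the fraction of tokens where both models pick the same class, and logit MAE is the mean absolute difference over all logits. The function names and the toy values are illustrative only.

```python
def argmax(row):
    """Index of the largest value in a list of logits."""
    return max(range(len(row)), key=row.__getitem__)


def compare_logits(ref, test):
    """Compare two logit grids shaped [tokens][classes].

    Returns (argmax_agreement, mean_absolute_error).
    """
    if len(ref) != len(test):
        raise ValueError("both models must score the same number of tokens")
    matches, abs_err, count = 0, 0.0, 0
    for r, t in zip(ref, test):
        matches += argmax(r) == argmax(t)
        abs_err += sum(abs(a - b) for a, b in zip(r, t))
        count += len(r)
    return matches / len(ref), abs_err / count


# Toy logits for two tokens and three classes (not real model output).
bf16 = [[2.0, 0.5, -1.0], [0.0, 3.0, 0.25]]
q8 = [[1.75, 0.75, -1.0], [0.0, 2.5, 0.5]]
agreement, mae = compare_logits(bf16, q8)
print(f"agreement={agreement:.2%} mae={mae:.4f}")
```

A nonzero MAE alongside 100% agreement, as reported above, simply means the quantized logits drift slightly without changing which class wins.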

What it does

This model is an MLX packaging of OpenMed/privacy-filter-multilingual-v2, the second-generation multilingual checkpoint for fine-grained PII extraction across 16 languages. It uses OpenAI's Privacy Filter architecture and predicts 217 BIOES classes (O plus B/I/E/S for each category). The OpenMed PrivacyFilterMLXPipeline runs BIOES-aware Viterbi decoding so callers receive grouped spans instead of raw token tags.
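
To illustrate what BIOES-aware Viterbi decoding buys over a per-token argmax, here is a minimal pure-Python sketch. It is not the PrivacyFilterMLXPipeline implementation: the `B-EMAIL`-style tag strings, the hard transition constraints, and the function names are all assumptions made for the example.

```python
NEG_INF = float("-inf")


def allowed(prev, curr):
    """True if tag `curr` may follow tag `prev` under BIOES rules."""
    p_pos, _, p_cat = prev.partition("-")
    c_pos, _, c_cat = curr.partition("-")
    if p_pos in ("B", "I"):  # inside an open span: continue or close it
        return c_pos in ("I", "E") and c_cat == p_cat
    return c_pos in ("O", "B", "S")  # after O/E/S: outside or a new span


def viterbi_bioes(scores, labels):
    """scores[t][k] is the score of labels[k] at token t; returns the best valid path."""
    n = len(labels)
    # A sequence may only start with O, B-* or S-*.
    best = [s if labels[k][0] in "OBS" else NEG_INF for k, s in enumerate(scores[0])]
    back = []
    for row in scores[1:]:
        new_best, pointers = [], []
        for k in range(n):
            cands = [(best[j], j) for j in range(n) if allowed(labels[j], labels[k])]
            prev_score, prev_idx = max(cands)
            new_best.append(prev_score + row[k])
            pointers.append(prev_idx)
        best = new_best
        back.append(pointers)
    # A sequence may only end with O, E-* or S-*.
    last = max((k for k in range(n) if labels[k][0] in "OES"), key=lambda k: best[k])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return [labels[k] for k in reversed(path)]


def group_spans(tags):
    """Collapse a valid BIOES path into (category, start, end_exclusive) token spans."""
    spans, start = [], None
    for i, tag in enumerate(tags):
        pos, _, cat = tag.partition("-")
        if pos == "S":
            spans.append((cat, i, i + 1))
        elif pos == "B":
            start = i
        elif pos == "E":
            spans.append((cat, start, i + 1))
    return spans


labels = ["O", "B-EMAIL", "I-EMAIL", "E-EMAIL", "S-EMAIL"]
# Toy scores: at token 2 the raw argmax is "O", which would break the open span.
scores = [
    [2.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, 0.0, 2.0, 0.0],
]
tags = viterbi_bioes(scores, labels)
print(tags)               # ['O', 'B-EMAIL', 'I-EMAIL', 'E-EMAIL']
print(group_spans(tags))  # [('EMAIL', 1, 4)]
```

In the toy input a greedy argmax would emit O, B-EMAIL, O, E-EMAIL, which is not a valid BIOES sequence. The constrained search instead picks I-EMAIL at token 2, so the span closes cleanly and can be grouped.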

Label coverage highlights:

  • Identity: FIRSTNAME, MIDDLENAME, LASTNAME, AGE, GENDER, USERNAME, OCCUPATION, ORGANIZATION
  • Contact and address: EMAIL, PHONE, URL, STREET, BUILDINGNUMBER, CITY, COUNTY, STATE, ZIPCODE
  • Financial and crypto: BANKACCOUNT, IBAN, BIC, CREDITCARD, CVV, PIN, BITCOINADDRESS, ETHEREUMADDRESS
  • Vehicle, digital, and auth: VIN, VRM, IPADDRESS, MACADDRESS, IMEI, PASSWORD
  • Date and amount: DATE, DATEOFBIRTH, TIME, AMOUNT, CURRENCY, CURRENCYCODE

The full label map is included in id2label.json.
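
A short sketch of reading the label map and recovering the category names from it. The file layout is an assumption: a JSON object keyed by stringified class IDs with `B-EMAIL`-style tag values. Adjust the parsing if the shipped id2label.json differs. Both helper names are hypothetical.

```python
import json
from pathlib import Path


def load_id2label(path):
    """Read id2label.json and return {int_id: tag}; JSON object keys are strings."""
    raw = json.loads(Path(path).read_text(encoding="utf-8"))
    return {int(k): v for k, v in raw.items()}


def categories_from_id2label(id2label):
    """Collect the distinct category names, ignoring the O tag."""
    cats = {tag.split("-", 1)[1] for tag in id2label.values() if tag != "O"}
    return sorted(cats)


# Toy mapping in the assumed format; the real file has 217 entries.
sample = {0: "O", 1: "B-EMAIL", 2: "I-EMAIL", 3: "E-EMAIL", 4: "S-EMAIL"}
print(categories_from_id2label(sample))  # ['EMAIL']
```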

Architecture

| Field | Value |
|---|---|
| Source model type | openai_privacy_filter |
| Source architecture | OpenAIPrivacyFilterForTokenClassification |
| Hidden size | 640 |
| Transformer layers | 8 |
| Attention | Grouped-query attention (14 query heads / 2 KV heads, head_dim=64) with attention sinks |
| FFN | Sparse Mixture-of-Experts: 128 experts, top-4 routing, SwiGLU |
| Position encoding | YaRN-scaled RoPE (rope_theta=150000, factor=32) |
| Context length | 131,072 tokens (initial 4,096) |
| Tokenizer | o200k_base / tiktoken-compatible tokenizer assets, vocab 200,064 |
| Output head | Linear(640 -> 217) with bias |
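
The numbers in the table can be cross-checked with a little arithmetic. The sketch below assumes the usual grouped-query layout (projection width = heads × head_dim) and that the YaRN factor multiplies the initial context window. Neither is stated explicitly on this card, so treat the derived values as a sanity check rather than a spec.

```python
hidden_size = 640
num_q_heads, num_kv_heads, head_dim = 14, 2, 64
num_classes = 217
initial_context, yarn_factor = 4096, 32

q_proj_width = num_q_heads * head_dim           # 896, wider than the hidden size
kv_proj_width = num_kv_heads * head_dim         # 128
queries_per_kv = num_q_heads // num_kv_heads    # 7 query heads share each KV head
context_length = initial_context * yarn_factor  # 131072
head_params = hidden_size * num_classes + num_classes  # weights plus bias

print(q_proj_width, kv_proj_width, queries_per_kv, context_length, head_params)
```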

File set

| File | Size | Purpose |
|---|---|---|
| weights.safetensors | 2.6 GiB | MLX weights |
| config.json | 17.6 KiB | Model and OpenMed MLX runtime config |
| id2label.json | 4.8 KiB | Numeric ID to BIOES label mapping |
| openmed-mlx.json | 0.7 KiB | OpenMed MLX artifact manifest |
| tokenizer.json | 27 MiB | Tokenizer asset kept with the artifact |
| tokenizer_config.json | 0.2 KiB | Tokenizer metadata |

The MLX runtime uses the tiktoken-compatible o200k_base tokenizer path. tokenizer.json and tokenizer_config.json are bundled so consumers can inspect the tokenizer assets and keep the artifact self-contained.
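
Before loading a local snapshot, it can help to confirm that the file set listed above is complete. A minimal sketch using only the standard library. The helper name `missing_files` is hypothetical and not part of OpenMed.

```python
from pathlib import Path

# File names taken from the file set table on this card.
EXPECTED_FILES = [
    "weights.safetensors",
    "config.json",
    "id2label.json",
    "openmed-mlx.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def missing_files(snapshot_dir):
    """Return the expected artifact files that are absent from snapshot_dir."""
    root = Path(snapshot_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_files("/path/to/privacy-filter-multilingual-v2-mlx")` returns an empty list when the snapshot is complete.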

Quick start

With OpenMed

pip install -U "openmed[mlx]"

from openmed import extract_pii, deidentify
from openmed.core import OpenMedConfig

model_name = "OpenMed/privacy-filter-multilingual-v2-mlx"
config = OpenMedConfig(backend="mlx")  # select the MLX backend once and reuse it
text = (
    "Patient Sarah Johnson (DOB 03/15/1985), MRN 4872910, "
    "phone 415-555-0123, email sarah.johnson@example.com."
)

# Extract grouped PII entities with their labels and confidence scores.
result = extract_pii(text, model_name=model_name, config=config)
for ent in result.entities:
    print(ent.label, ent.text, round(ent.confidence, 4))

# De-identify the same text with the "mask" method.
masked = deidentify(text, method="mask", model_name=model_name, config=config)
print(masked.deidentified_text)

For non-MLX hosts, use the source PyTorch checkpoint OpenMed/privacy-filter-multilingual-v2.
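
Conceptually, masking replaces each detected character span with a placeholder. The sketch below shows that idea in isolation. It is not how `deidentify` is implemented: the `[LABEL]` placeholder format and the `(start, end, label)` span tuples are assumptions chosen for the example.

```python
def mask_spans(text, spans):
    """Replace each (start, end, label) character span with "[LABEL]"."""
    out, cursor = [], 0
    for start, end, label in sorted(spans):
        if start < cursor:
            raise ValueError("overlapping spans are not supported in this sketch")
        out.append(text[cursor:start])  # keep the text between spans
        out.append(f"[{label}]")        # substitute the placeholder
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


text = "Email me at alice.smith@example.com after 5pm."
email = "alice.smith@example.com"
start = text.index(email)
masked_text = mask_spans(text, [(start, start + len(email), "EMAIL")])
print(masked_text)  # Email me at [EMAIL] after 5pm.
```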

Direct MLX usage

from huggingface_hub import snapshot_download
from openmed.mlx.inference import PrivacyFilterMLXPipeline

model_path = snapshot_download("OpenMed/privacy-filter-multilingual-v2-mlx")
pipe = PrivacyFilterMLXPipeline(model_path)

print(pipe("Email me at alice.smith@example.com after 5pm."))

Loading from a local snapshot

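from openmed.mlx.models import load_model
import mlx.core as mx

# Load the raw token-classification model from a local snapshot directory.
model = load_model("/path/to/privacy-filter-multilingual-v2-mlx")

# Arbitrary token IDs for a smoke test; real inputs come from the tokenizer.
ids = mx.array([[1, 100, 200, 300]], dtype=mx.int32)
mask = mx.ones((1, 4), dtype=mx.bool_)  # attend to all four positions

# Forward pass returns per-token class logits (no Viterbi decoding here).
logits = model(ids, attention_mask=mask)
print(logits.shape)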

Hardware notes

  • Designed for Apple Silicon with MLX.
  • CPU inference may work, but GPU-backed MLX on M-series Macs is the intended runtime.
  • Install the Python package with pip install -U "openmed[mlx]".
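
A small sketch of choosing between this MLX artifact and the source PyTorch checkpoint based on the host, using only the standard library. The function name `pick_checkpoint` is hypothetical. Note that a Python interpreter running under Rosetta reports `x86_64`, so this check would treat it as a non-MLX host.

```python
import platform

MLX_REPO = "OpenMed/privacy-filter-multilingual-v2-mlx"
SOURCE_REPO = "OpenMed/privacy-filter-multilingual-v2"


def pick_checkpoint(system=None, machine=None):
    """Return the MLX repo on Apple Silicon macOS, else the source checkpoint."""
    system = system or platform.system()
    machine = machine or platform.machine()
    if system == "Darwin" and machine == "arm64":
        return MLX_REPO
    return SOURCE_REPO


print(pick_checkpoint())
```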

Credits

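This artifact builds on the OpenMed/privacy-filter-multilingual-v2 source checkpoint, OpenAI's Privacy Filter architecture, and Apple's MLX framework.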

License

Apache 2.0, matching the source checkpoint metadata.
