# OpenAI Privacy Filter - Rust/Burn Weights
Safetensors weights for openai/privacy-filter, packaged for inference with privacy-filter-rs, a pure-Rust runtime built on the Burn ML framework.
## Contents
| File | Size | Description |
|---|---|---|
| `model.safetensors` | 2.6 GB | Model weights (bfloat16) |
| `config.json` | 3 KB | HuggingFace model configuration |
| `tokenizer.json` | 27 MB | BPE tokenizer (o200k_base) |
| `tokenizer_config.json` | 234 B | Tokenizer metadata |
| `viterbi_calibration.json` | 372 B | Viterbi decoder operating points |
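The safetensors format is simple to inspect: the file begins with an 8-byte little-endian length, followed by a JSON header describing each tensor. A minimal sketch of reading that header, using a fabricated in-memory buffer rather than the real 2.6 GB file:

```rust
/// Read the length of the JSON header from the first 8 bytes of a
/// safetensors file (little-endian u64, per the safetensors spec).
fn header_len(bytes: &[u8]) -> u64 {
    let mut len = [0u8; 8];
    len.copy_from_slice(&bytes[..8]);
    u64::from_le_bytes(len)
}

fn main() {
    // Fabricated minimal file: length prefix + JSON header, no tensor data.
    let json = br#"{"__metadata__":{"format":"pt"}}"#;
    let mut file = (json.len() as u64).to_le_bytes().to_vec();
    file.extend_from_slice(json);

    let n = header_len(&file) as usize;
    let header = std::str::from_utf8(&file[8..8 + n]).unwrap();
    println!("header ({} bytes): {}", n, header);
}
```

This is how tools can list tensor names and dtypes without loading the weights themselves.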
## Model Details
- Architecture: Bidirectional transformer encoder with Sparse MoE
- Parameters: 1.5B total, ~50M active per token (top-4 of 128 experts)
- Hidden size: 640, Layers: 8, Heads: 14 Q / 2 KV (GQA)
- Context: 128,000 tokens (YaRN RoPE, sliding window 257)
- Output: 33 BIOES token classes over 8 privacy categories
- Dtype: bfloat16 (converted to f32 at load time by the Rust runtime)
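The 33-class output head follows directly from the BIOES scheme: each of the 8 privacy categories gets four tags (Begin, Inside, End, Single), plus one shared "O" (outside) class. A quick sanity check of that arithmetic:

```rust
/// BIOES tagging: 4 tags (B, I, E, S) per category, plus one shared
/// "O" (outside) class for non-entity tokens.
fn num_classes(categories: usize) -> usize {
    categories * 4 + 1
}

fn main() {
    // 8 categories * 4 tags + 1 = 33, matching the model's output head.
    println!("8 categories -> {} BIOES classes", num_classes(8));
}
```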
## Privacy Categories
- `account_number`
- `private_address`
- `private_date`
- `private_email`
- `private_person`
- `private_phone`
- `private_url`
- `secret`
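Under BIOES, per-token labels pair a tag with one of these categories, e.g. `B-private_person`. The exact label strings come from `config.json`'s `id2label` map, so the examples below are assumptions; a sketch of splitting such a label:

```rust
/// Split a BIOES label such as "B-private_person" into (tag, category).
/// The plain "O" label has no category.
fn split_label(label: &str) -> (&str, Option<&str>) {
    match label.split_once('-') {
        Some((tag, cat)) => (tag, Some(cat)),
        None => (label, None),
    }
}

fn main() {
    println!("{:?}", split_label("B-private_person")); // ("B", Some("private_person"))
    println!("{:?}", split_label("O"));                // ("O", None)
}
```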
## Usage with privacy-filter-rs
```bash
# Clone the Rust project
git clone https://github.com/eugenehp/privacy-filter-rs
cd privacy-filter-rs

# Download weights into ./data (this repo)
# git clone https://huggingface.co/eugenehp/privacy-filter-rs data

# Run inference
cargo run --release -- -m data "My name is Alice Smith"
```

Or from Rust:
```rust
use std::path::Path;

use privacy_filter_rs::{PrivacyFilterInference, backend::{B, Device}};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let device = <Device as Default>::default();
    let engine = PrivacyFilterInference::<B>::load(Path::new("data"), device)?;
    let spans = engine.predict("My name is Alice Smith")?;
    for s in &spans {
        println!("{}: {} (score: {:.4})", s.entity_group, s.word, s.score);
    }
    // private_person: Alice Smith (score: 1.0000)
    Ok(())
}
```
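A common follow-up is redacting the detected spans from the input. The sketch below assumes a `Span` shape mirroring the fields printed above (the real struct lives in privacy-filter-rs and may differ), and replaces each detected entity with a `[CATEGORY]` placeholder:

```rust
// Hypothetical Span shape, mirroring the fields used in the example above;
// the actual type exported by privacy-filter-rs may differ.
struct Span {
    entity_group: String,
    word: String,
    score: f32,
}

/// Replace each detected entity with an uppercase [CATEGORY] placeholder.
fn redact(text: &str, spans: &[Span]) -> String {
    let mut out = text.to_string();
    for s in spans {
        let tag = format!("[{}]", s.entity_group.to_uppercase());
        out = out.replace(&s.word, &tag);
    }
    out
}

fn main() {
    let spans = vec![Span {
        entity_group: "private_person".into(),
        word: "Alice Smith".into(),
        score: 1.0,
    }];
    println!("{}", redact("My name is Alice Smith", &spans));
    // My name is [PRIVATE_PERSON]
}
```

Naive string replacement collides on repeated substrings; byte offsets from the model's spans (if exposed) would be the more robust choice.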
## License
Apache 2.0, same as the upstream openai/privacy-filter model.
## Model tree

Base model: openai/privacy-filter