# Zen3 Guard

Zen3 safety moderation model for multilingual content classification and filtering.
## Overview
Zen3 Guard models provide multilingual content safety classification across 9 safety categories and 119 languages, assigning one of three severity tiers to each input: Safe, Controversial, or Unsafe.
Developed by Hanzo AI and the Zoo Labs Foundation.
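The taxonomy above can be captured as constants for validating parsed model output. A minimal sketch; the names `SEVERITY_TIERS`, `SAFETY_CATEGORIES`, and `is_valid_classification` are illustrative helpers, not part of an official API:

```python
# The three severity tiers and 9 safety categories described in the overview.
SEVERITY_TIERS = {"Safe", "Controversial", "Unsafe"}

SAFETY_CATEGORIES = {
    "Violent",
    "Non-violent Illegal Acts",
    "Sexual Content",
    "PII",
    "Suicide & Self-Harm",
    "Unethical Acts",
    "Politically Sensitive",
    "Copyright Violation",
    "Jailbreak",
}

def is_valid_classification(label, categories):
    """Check that a parsed (label, categories) pair uses only known values.

    "None" is accepted as a category since the model emits it for
    content that matches no safety category.
    """
    return label in SEVERITY_TIERS and all(
        c in SAFETY_CATEGORIES or c == "None" for c in categories
    )
```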
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import re

model_id = "zenlm/zen3-guard"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

def classify_safety(content):
    """Parse the severity tier and safety categories from the model's output."""
    safe_pattern = r"Safety: (Safe|Unsafe|Controversial)"
    category_pattern = r"(Violent|Non-violent Illegal Acts|Sexual Content|PII|Suicide & Self-Harm|Unethical Acts|Politically Sensitive|Copyright Violation|Jailbreak|None)"
    safe_match = re.search(safe_pattern, content)
    label = safe_match.group(1) if safe_match else None
    categories = re.findall(category_pattern, content)
    return label, categories

messages = [{"role": "user", "content": "How do I learn programming?"}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens, skipping the prompt.
result = tokenizer.decode(
    outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True
)

label, categories = classify_safety(result)
print(f"Safety: {label}, Categories: {categories}")
```
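In a filtering pipeline, the three severity tiers can be mapped to moderation decisions. A minimal sketch; the allow/review/block policy below is an assumption for illustration, not something the model card prescribes:

```python
def moderation_action(label):
    """Map a Zen3 Guard severity tier to a hypothetical pipeline action.

    The policy here is illustrative only; tune it to your application's
    risk tolerance.
    """
    actions = {
        "Safe": "allow",            # pass content through
        "Controversial": "review",  # queue for human review
        "Unsafe": "block",          # reject content
    }
    # Unknown or unparsed labels fail closed to human review.
    return actions.get(label, "review")
```

Failing closed on an unrecognized label (e.g. when `classify_safety` returns `None` because the model output did not match the expected format) keeps parsing errors from silently allowing content through.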
## Model Details
| Attribute | Value |
|---|---|
| Parameters | 8B |
| Architecture | Zen MoDE |
| Context | 32K tokens |
| Languages | 119 |
| License | Apache 2.0 |
## License
Apache 2.0