zeekay committed on
Commit
0cf5efc
·
verified ·
1 Parent(s): cd7d422

Update model card: add zen/zenlm tags, fix branding

Files changed (1)
  1. README.md +42 -72
README.md CHANGED
@@ -1,97 +1,67 @@
1
  ---
 
2
  license: apache-2.0
3
- language:
4
- - en
5
- pipeline_tag: text-classification
6
  tags:
7
- - safety
8
- - content-moderation
9
- - guard
10
- - zen
11
- - hanzo
 
 
 
 
12
  library_name: transformers
13
  ---
14
 
15
  # Zen3 Guard
16
 
17
- **Zen3 Guard** is a safety and content moderation model developed by [Hanzo AI](https://hanzo.ai) as part of the Zen model family. It classifies inputs across multiple risk categories to ensure safe AI interactions.
18
 
19
  ## Overview
20
 
21
- Zen3 Guard is a 5B parameter model designed for real-time content safety classification. It provides fine-grained risk assessment across multiple harm categories, making it ideal for building responsible AI systems.
22
-
23
- ### Key Features
24
-
25
- - **Multi-category risk detection**: Covers harmful content, social bias, profanity, violence, sexual content, and more
26
- - **Low latency**: Optimized for real-time safety screening in production pipelines
27
- - **High accuracy**: Strong performance across safety benchmarks
28
- - **Flexible deployment**: Works as a standalone classifier or integrated safety layer
29
 
30
- ## Risk Categories
31
 
32
- | Category | Description |
33
- |----------|-------------|
34
- | Harm | Content promoting self-harm or harm to others |
35
- | Social Bias | Discriminatory or biased content |
36
- | Jailbreaking | Attempts to bypass safety guidelines |
37
- | Violence | Graphic or promoting violence |
38
- | Profanity | Obscene or vulgar language |
39
- | Sexual Content | Explicit or suggestive material |
40
- | Unethical Behavior | Content promoting illegal or unethical actions |
41
- | Groundedness | Factual accuracy and hallucination detection |
42
-
43
- ## Usage
44
 
45
  ```python
46
- from transformers import AutoTokenizer, AutoModelForCausalLM
 
47
 
48
  model_id = "zenlm/zen3-guard"
49
  tokenizer = AutoTokenizer.from_pretrained(model_id)
50
- model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
51
-
52
- # Format input for safety classification
53
- prompt = "Is this content safe? [content to check]"
54
- inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
55
- outputs = model.generate(**inputs, max_new_tokens=32)
56
- result = tokenizer.decode(outputs[0], skip_special_tokens=True)
57
- print(result)
58
  ```
59
 
60
  ## Model Details
61
 
62
- | Property | Value |
63
- |----------|-------|
64
- | Parameters | 5B |
65
- | Architecture | Transformer (decoder-only) |
66
- | Precision | bfloat16 |
 
67
  | License | Apache 2.0 |
68
- | Context Length | 8192 tokens |
69
-
70
- ## Intended Use
71
-
72
- - Content moderation pipelines
73
- - Safety screening for LLM outputs
74
- - Input validation for AI applications
75
- - Compliance and policy enforcement
76
-
77
- ## Limitations
78
-
79
- - Optimized for English text; multilingual performance may vary
80
- - Should be used as one layer in a comprehensive safety system
81
- - Risk thresholds should be calibrated for specific use cases
82
-
83
- ## Citation
84
-
85
- ```bibtex
86
- @misc{zen3-guard,
87
- title={Zen3 Guard: Content Safety Classification Model},
88
- author={Hanzo AI},
89
- year={2025},
90
- url={https://huggingface.co/zenlm/zen3-guard}
91
- }
92
- ```
93
 
94
- ## Links
95
 
96
- - [Zen Model Family](https://huggingface.co/zenlm)
97
- - [Hanzo AI](https://hanzo.ai)
 
1
  ---
2
+ language: en
3
  license: apache-2.0
 
 
 
4
  tags:
5
+ - text-classification
6
+ - zen
7
+ - zenlm
8
+ - hanzo
9
+ - zen3
10
+ - safety
11
+ - moderation
12
+ - content-classification
13
+ pipeline_tag: text-classification
14
  library_name: transformers
15
  ---
16
 
17
  # Zen3 Guard
18
 
19
+ Zen3 safety moderation model for multilingual content classification and filtering.
20
 
21
  ## Overview
22
 
23
+ Zen Guard models provide multilingual content safety classification with three severity tiers:
24
+ **Safe**, **Controversial**, and **Unsafe** — across 9 safety categories and 119 languages.
25
 
26
+ Developed by [Hanzo AI](https://hanzo.ai) and the [Zoo Labs Foundation](https://zoo.ngo).
27
 
28
+ ## Quick Start
29
 
30
  ```python
31
+ from transformers import AutoModelForCausalLM, AutoTokenizer
32
+ import re
33
 
34
  model_id = "zenlm/zen3-guard"
35
  tokenizer = AutoTokenizer.from_pretrained(model_id)
36
+ model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
37
+
38
+ def classify_safety(content):
39
+     safe_pattern = r"Safety: (Safe|Unsafe|Controversial)"
40
+     category_pattern = r"(Violent|Non-violent Illegal Acts|Sexual Content|PII|Suicide & Self-Harm|Unethical Acts|Politically Sensitive|Copyright Violation|Jailbreak|None)"
41
+     safe_match = re.search(safe_pattern, content)
42
+     label = safe_match.group(1) if safe_match else None
43
+     categories = re.findall(category_pattern, content)
44
+     return label, categories
45
+
46
+ messages = [{"role": "user", "content": "How do I learn programming?"}]
47
+ text = tokenizer.apply_chat_template(messages, tokenize=False)
48
+ inputs = tokenizer([text], return_tensors="pt").to(model.device)
49
+ outputs = model.generate(**inputs, max_new_tokens=128)
50
+ result = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
51
+ label, categories = classify_safety(result)
52
+ print(f"Safety: {label}, Categories: {categories}")
53
  ```
54
 
55
  ## Model Details
56
 
57
+ | Attribute | Value |
58
+ |-----------|-------|
59
+ | Parameters | 8B |
60
+ | Architecture | Zen MoDE |
61
+ | Context | 32K tokens |
62
+ | Languages | 119 |
63
  | License | Apache 2.0 |
64
 
65
+ ## License
66
 
67
+ Apache 2.0
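
Review note: the `classify_safety` helper added in the new Quick Start can be exercised without loading the model. The sketch below reuses the two regular expressions from the diff and runs them on a hand-written string. The sample text (`Safety: ...` followed by `Categories: ...`) is an assumption about the shape of the model's output, not something this commit confirms, and the `ACTIONS` table and `decide` function are hypothetical names showing one way the three severity tiers could be mapped to a moderation decision.

```python
import re

# Patterns copied from the Quick Start snippet in this commit.
SAFE_PATTERN = r"Safety: (Safe|Unsafe|Controversial)"
CATEGORY_PATTERN = (
    r"(Violent|Non-violent Illegal Acts|Sexual Content|PII|Suicide & Self-Harm|"
    r"Unethical Acts|Politically Sensitive|Copyright Violation|Jailbreak|None)"
)


def classify_safety(content):
    """Extract the severity label and any category names from generated text."""
    safe_match = re.search(SAFE_PATTERN, content)
    label = safe_match.group(1) if safe_match else None
    categories = re.findall(CATEGORY_PATTERN, content)
    return label, categories


# Hypothetical policy table (not part of the model card): one action per tier.
ACTIONS = {"Safe": "allow", "Controversial": "review", "Unsafe": "block"}


def decide(label):
    """Map a severity label to an action; unparseable output goes to review."""
    return ACTIONS.get(label, "review")


# Assumed output format, written by hand for illustration only.
sample = "Safety: Unsafe\nCategories: Violent"
label, categories = classify_safety(sample)
print(label, categories, decide(label))
```

Because the category pattern is not anchored to a `Categories:` prefix, it will also pick up category names that appear anywhere else in the generated text, so callers may want to restrict matching to the relevant line. Routing a missing label to `review` rather than `allow` is a conservative default for this sketch.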