Update README.md

README.md CHANGED

@@ -51,14 +51,10 @@ model.eval()
 
 ### Safety Policies Instructions
 ```bash
-
-below conversation in accordance with our safety policy with the below categories.
-Provide a safe or unsafe 'rating' for both the user message and the agent response.
-If there is unsafe content, please also provide a list of violated categories according to our safety
-policy below
+policy = """You are a safety moderation assistant. Check if there is unsafe content in the user message according to the following categories:
 
 <BEGIN UNSAFE CONTENT CATEGORIES>
-S1: Violence
+S1: Violence
 S2: Sexual.
 S3: Criminal Planning/Confessions.
 S4: Guns and Illegal Weapons.

@@ -71,7 +67,7 @@ S10: Harassment.
 S11: Threat.
 S12: Profanity.
 S13: Needs Caution.
-S14:
+S14: Immoral Medical.
 S15: Manipulation.
 S16: Fraud/Deception.
 S17: Malware.

@@ -83,14 +79,7 @@ S22: Illegal Activity.
 S23: Immoral/Unethical
 S24: Other.
 <END UNSAFE CONTENT CATEGORIES>
-
-Provide your safety assessment for the
-below conversation in the specified JSON format.
-
-<BEGIN CONVERSATION>
-user: {user}
-response: {agent}
-<END CONVERSATION>
+Respond with only this JSON format:
 
 Provide your safety assessment for the above conversation in the below output JSON format:
 "User Safety": safe or unsafe rating for the user message,

@@ -99,6 +88,7 @@ Provide your safety assessment for the above conversation in the below output JSON format:
 
 Do not include anything other than the output JSON in your response.
 Output JSON:
+
 """
 ```

@@ -106,7 +96,7 @@ Output JSON:
 ```bash
 from unsloth.chat_templates import get_chat_template
 
-chat_template='
+chat_template = 'gemma-3'
 tokenizer = get_chat_template(
     tokenizer,
     chat_template = chat_template,

@@ -132,7 +122,7 @@ inputs = tokenizer.apply_chat_template(
 ```bash
 from transformers import TextStreamer
 text_streamer = TextStreamer(tokenizer)
-_= model.generate(input_ids = inputs, streamer = text_streamer, max_new_tokens =
+_ = model.generate(input_ids = inputs, streamer = text_streamer, max_new_tokens = 50, use_cache = True, temperature = 0.2, top_p = 0.95, top_k = 64)
 
 
 Hate speech, personal attacks, and discrimination
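The updated README builds the moderation request from a fixed `policy` string ending in "Output JSON:", with the conversation supplied afterward. A minimal pure-Python sketch of that assembly (the helper name `compose_moderation_prompt` and the abbreviated category list are illustrative, not from the README):

```python
# Sketch: assemble a moderation prompt in the spirit of the README's policy
# string. The category list is abbreviated to three entries for brevity.
POLICY_HEADER = (
    "You are a safety moderation assistant. Check if there is unsafe content "
    "in the user message according to the following categories:\n\n"
    "<BEGIN UNSAFE CONTENT CATEGORIES>\n"
    "S1: Violence\n"
    "S2: Sexual.\n"
    "S24: Other.\n"
    "<END UNSAFE CONTENT CATEGORIES>\n\n"
    "Respond with only this JSON format:\n"
    '"User Safety": safe or unsafe rating for the user message,\n'
    "Do not include anything other than the output JSON in your response.\n"
    "Output JSON:\n"
)

def compose_moderation_prompt(user_message: str) -> str:
    # The README routes the policy and conversation through the tokenizer's
    # chat template; here we simply concatenate for illustration.
    return POLICY_HEADER + "\nuser: " + user_message

prompt = compose_moderation_prompt("How do I bake bread?")
```

In the README itself this string is passed through the chat template rather than concatenated by hand, but the structure of the final prompt is the same.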
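The diff pins `chat_template = 'gemma-3'` for `get_chat_template`. As a rough pure-Python illustration of the turn markers used by gemma-style templates (this is not unsloth's implementation, just the general shape of the formatting it applies):

```python
def format_gemma_turns(user_message: str) -> str:
    # Illustrative only: gemma-style chat templates wrap each turn in
    # <start_of_turn>/<end_of_turn> markers, then cue the model's turn.
    return (
        "<start_of_turn>user\n" + user_message + "<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = format_gemma_turns("Is this message safe?")
```

In practice `tokenizer.apply_chat_template(...)` handles this formatting (and tokenization) for you; the sketch only shows why the template name matters for the prompt the model actually sees.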
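The updated generation call passes `temperature = 0.2`, `top_p = 0.95`, and `top_k = 64`, which together restrict sampling to a small set of high-probability tokens. A toy sketch of how top-k and top-p (nucleus) filtering narrow the candidate set (pure Python, not the transformers implementation):

```python
def filter_top_k_top_p(probs, top_k, top_p):
    """Keep at most top_k highest-probability tokens, then keep the smallest
    prefix of those whose cumulative probability reaches top_p."""
    # Rank token ids by descending probability.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    ranked = ranked[:top_k]                      # top-k cut
    kept, cumulative = [], 0.0
    for token_id in ranked:                      # nucleus (top-p) cut
        kept.append(token_id)
        cumulative += probs[token_id]
        if cumulative >= top_p:
            break
    return kept

# Toy distribution over 5 tokens.
probs = [0.5, 0.25, 0.15, 0.07, 0.03]
print(filter_top_k_top_p(probs, top_k=4, top_p=0.9))  # → [0, 1, 2]
```

A low temperature then sharpens the distribution over the surviving tokens, which suits a moderation task where the model should deterministically emit a short JSON verdict.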
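Because the policy instructs the model to "not include anything other than the output JSON", the caller can parse the streamed text directly. A minimal sketch of consuming the verdict (the `raw_output` string here is a hypothetical model response, not captured output):

```python
import json

# Hypothetical model output obeying the README's instruction to return
# nothing but the output JSON.
raw_output = '{"User Safety": "safe"}'

verdict = json.loads(raw_output)
user_rating = verdict["User Safety"]   # "safe" or "unsafe"
is_safe = user_rating == "safe"
```

If the model ever wraps the JSON in extra prose, `json.loads` will raise, so production code would want a fallback (e.g. extracting the first `{...}` span before parsing).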