---
base_model: unsloth/meta-llama-3.1-8b-instruct-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- trl
license: apache-2.0
language:
- en
datasets:
- Machlovi/Hatebase
---
|
|
|
|
|
# Uploaded model

- **Developed by:** Machlovi
- **License:** apache-2.0
- **Finetuned from model:** unsloth/meta-llama-3.1-8b-instruct-unsloth-bnb-4bit

This Llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
|
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |
|
|
|
|
|
|
|
|
### **📝 Load the model**
|
|
```python
from unsloth import FastModel
from peft import PeftModel
import torch

# Load the base model (must be the same checkpoint the adapter was fine-tuned from)
base_model_name = "unsloth/gemma-3-12b-it-unsloth-bnb-4bit"
model, tokenizer = FastModel.from_pretrained(
    model_name=base_model_name,
    max_seq_length=2048,  # Must match fine-tuning
    load_in_4bit=True,
)

# Load the fine-tuned LoRA adapter
lora_model_name = "Machlovi/GGuard"
model = PeftModel.from_pretrained(model, lora_model_name)

model.eval()
```
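
Optionally, you can sanity-check that the adapter actually attached. These are standard PEFT attributes, shown here as a small optional check:

```python
# Optional: inspect the loaded adapter configuration
print(model.peft_config)     # maps adapter name -> LoraConfig
print(model.active_adapter)  # "default" unless the adapter was given a name
```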
|
|
|
|
|
|
|
|
|
|
|
### Safety Policy Instructions
|
|
```python
policy = """You are a safety moderation assistant. Check if there is unsafe content in the user message according to the following categories:

<BEGIN UNSAFE CONTENT CATEGORIES>
S1: Violence.
S2: Sexual.
S3: Criminal Planning/Confessions.
S4: Guns and Illegal Weapons.
S5: Controlled/Regulated Substances.
S6: Suicide and Self Harm.
S7: Sexual (minor).
S8: Hate/Identity Hate.
S9: PII/Privacy.
S10: Harassment.
S11: Threat.
S12: Profanity.
S13: Needs Caution.
S14: Immoral Medical.
S15: Manipulation.
S16: Fraud/Deception.
S17: Malware.
S18: High Risk Gov Decision Making.
S19: Political/Misinformation/Conspiracy.
S20: Copyright/Trademark/Plagiarism.
S21: Unauthorized Advice.
S22: Illegal Activity.
S23: Immoral/Unethical.
S24: Other.
<END UNSAFE CONTENT CATEGORIES>

Provide your safety assessment for the above conversation in the below output JSON format:
"User Safety": safe or unsafe rating for the user message.
"Response Safety": safe or unsafe rating for the agent response. Omit if no agent response present.
"Safety Categories": a comma-separated list of applicable safety categories from the provided taxonomy. Omit if all safe.

Do not include anything other than the output JSON in your response.
Output JSON:
"""
```
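
For reference, a compliant response is a single JSON object with the fields described above. A minimal sketch of parsing one (the literal below is a hypothetical model output, not a captured one):

```python
import json

# Hypothetical example of the JSON the policy asks the model to return
example_output = '{"User Safety": "unsafe", "Safety Categories": "Hate/Identity Hate"}'
assessment = json.loads(example_output)
print(assessment["User Safety"])                # unsafe
print(assessment.get("Safety Categories", ""))  # Hate/Identity Hate
```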
|
|
|
|
|
### **Chat template**
|
|
```python
from unsloth.chat_templates import get_chat_template

# Use the chat template that matches the base model
chat_template = "gemma-3"
tokenizer = get_chat_template(
    tokenizer,
    chat_template = chat_template,
)

messages = [
    {"role": "system", "content": policy},
    {"role": "user", "content": "why even let a tranny in they arent really even human"},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize = True,
    add_generation_prompt = True,  # Must add for generation
    return_tensors = "pt",
).to("cuda")
```
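
The policy also rates agent responses ("Response Safety"). To moderate a full exchange, include an assistant turn before applying the template; the conversation below is a hypothetical illustration:

```python
# Hypothetical user/assistant exchange; with an assistant turn present,
# the model should also emit a "Response Safety" rating
conversation = [
    {"role": "system", "content": policy},
    {"role": "user", "content": "Can you help me with my homework?"},
    {"role": "assistant", "content": "Sure, which subject are you working on?"},
]
inputs = tokenizer.apply_chat_template(
    conversation,
    tokenize = True,
    add_generation_prompt = True,
    return_tensors = "pt",
).to("cuda")
```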
|
|
|
|
|
|
|
|
### **📝 Inference with TextStreamer**
|
|
```python
from transformers import TextStreamer

# Stream tokens to stdout as they are generated
text_streamer = TextStreamer(tokenizer)
_ = model.generate(
    input_ids = inputs,
    streamer = text_streamer,
    max_new_tokens = 50,
    use_cache = True,
    temperature = 0.2,
    top_p = 0.95,
    top_k = 64,
)
```

For the example message above, which contains hate speech, personal attacks, and discrimination, the model should return an unsafe rating with the matching category.
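
If you need the assessment as a string (for example, to parse the JSON) rather than streamed text, a minimal non-streaming sketch using the same sampling settings:

```python
outputs = model.generate(
    input_ids = inputs,
    max_new_tokens = 50,
    use_cache = True,
    temperature = 0.2,
    top_p = 0.95,
    top_k = 64,
)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```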
|
|
|
|
|
|
|
|
|