Instructions for using thkim0305/RepBend_Mistral_7B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use thkim0305/RepBend_Mistral_7B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="thkim0305/RepBend_Mistral_7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("thkim0305/RepBend_Mistral_7B")
model = AutoModelForCausalLM.from_pretrained("thkim0305/RepBend_Mistral_7B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
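
The pipeline call above returns a nested structure rather than a plain string. The sketch below shows one way to pull out the assistant's reply. It assumes the chat-style return format of recent Transformers releases, where `generated_text` holds the full conversation as a list of role/content messages; `last_assistant_message` is a hypothetical helper name, and the sample data is illustrative, not real model output.

```python
def last_assistant_message(pipeline_output):
    """Return the newest assistant message from a chat-style pipeline result.

    Assumes the shape [{"generated_text": [{"role": ..., "content": ...}, ...]}].
    """
    conversation = pipeline_output[0]["generated_text"]
    for message in reversed(conversation):
        if message["role"] == "assistant":
            return message["content"]
    raise ValueError("no assistant message found in pipeline output")


# Illustrative stand-in for the value returned by pipe(messages).
sample_output = [
    {
        "generated_text": [
            {"role": "user", "content": "Who are you?"},
            {"role": "assistant", "content": "I am an AI assistant."},
        ]
    }
]
print(last_assistant_message(sample_output))
```

With a loaded pipeline, the same helper would be called as `last_assistant_message(pipe(messages))`.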

Local Apps

vLLM

How to use thkim0305/RepBend_Mistral_7B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "thkim0305/RepBend_Mistral_7B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "thkim0305/RepBend_Mistral_7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
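
The same request can be sent from Python. This is a minimal sketch using only the standard library; it assumes the server started above is listening on `localhost:8000`, and `build_chat_request` and `chat` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "thkim0305/RepBend_Mistral_7B"


def build_chat_request(prompt, model=MODEL_ID, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, **kwargs):
    """Send the request and return the parsed JSON response (needs a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, **kwargs)) as response:
        return json.load(response)


# Usage, once the server is up:
#   reply = chat("What is the capital of France?")
#   print(reply["choices"][0]["message"]["content"])
```

Splitting request construction from sending keeps the payload easy to inspect or log before any network call is made.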

Use Docker

docker model run hf.co/thkim0305/RepBend_Mistral_7B

SGLang

How to use thkim0305/RepBend_Mistral_7B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "thkim0305/RepBend_Mistral_7B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "thkim0305/RepBend_Mistral_7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
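
The server answers with an OpenAI-style JSON body, in which the generated text sits under `choices[0].message.content`. The sketch below extracts it; `extract_reply` is a hypothetical helper name, and the sample response is a hand-written illustration of the response shape, not output captured from this model.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant text from an OpenAI-compatible chat completion response."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["message"]["content"]


# Hand-written example of the response shape (illustrative only).
sample_response = {
    "model": "thkim0305/RepBend_Mistral_7B",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Paris."},
            "finish_reason": "stop",
        }
    ],
}
print(extract_reply(sample_response))
```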

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "thkim0305/RepBend_Mistral_7B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "thkim0305/RepBend_Mistral_7B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use thkim0305/RepBend_Mistral_7B with Docker Model Runner:
```
docker model run hf.co/thkim0305/RepBend_Mistral_7B
```
Browse Quantizations to use this model in llama.cpp, Ollama, LM Studio, or any compatible app.

Model Description

This Mistral-based model is fine-tuned using the "Representation Bending" (REPBEND) approach described in the paper "Representation Bending for Large Language Model Safety". REPBEND modifies the model’s internal representations to reduce harmful or unsafe responses while preserving overall capabilities. The resulting model is robust to adversarial jailbreak attacks, out-of-distribution harmful prompts, and fine-tuning exploits, while still giving useful and informative responses to benign requests.

Uses

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "thkim0305/RepBend_Mistral_7B"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

input_text = "Who are you?"
template = "[INST] {instruction} [/INST] "

prompt = template.format(instruction=input_text)

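# Tokenize the formatted prompt and move it to the model's device.
input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(input_ids, max_new_tokens=256)
# outputs[0] holds the prompt tokens followed by the completion,
# so the decoded text starts with the prompt itself.
generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(generated_text)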
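
Because the decoded text above includes the prompt, it can be convenient to keep only the newly generated tokens. The following is a minimal sketch under that assumption; `build_prompt` and `completion_ids` are hypothetical helper names, and the template string is the one from the snippet above.

```python
TEMPLATE = "[INST] {instruction} [/INST] "


def build_prompt(instruction: str) -> str:
    """Wrap a single user instruction in the instruction template used above."""
    return TEMPLATE.format(instruction=instruction.strip())


def completion_ids(output_ids, prompt_length: int):
    """Drop the first prompt_length tokens, keeping only the generated ones.

    Works on any sliceable sequence, such as a list or a 1-D tensor.
    """
    return output_ids[prompt_length:]


# With the tokenizer and model loaded as above, usage would look like:
#   prompt = build_prompt("Who are you?")
#   input_ids = tokenizer.encode(prompt, return_tensors="pt").to(model.device)
#   outputs = model.generate(input_ids, max_new_tokens=256)
#   new_ids = completion_ids(outputs[0], input_ids.shape[-1])
#   print(tokenizer.decode(new_ids, skip_special_tokens=True))
```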

Code

Please refer to this GitHub page.

Citation

@article{repbend,
  title={Representation Bending for Large Language Model Safety},
  author={Yousefpour, Ashkan and Kim, Taeheon and Kwon, Ryan S and Lee, Seungbeen and Jeung, Wonje and Han, Seungju and Wan, Alvin and Ngan, Harrison and Yu, Youngjae and Choi, Jonghyun},
  journal={arXiv preprint arXiv:2504.01550},
  year={2025}
}

Safetensors

Model size: 7B params
Tensor type: BF16

Model tree for thkim0305/RepBend_Mistral_7B

Quantizations: 1 model

Paper for thkim0305/RepBend_Mistral_7B

Representation Bending for Large Language Model Safety (arXiv:2504.01550, published Apr 2, 2025)