Instructions to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with llama-cpp-python:

# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="Alindstroem89/Llama-3.2-3B-Instruct_guardrail",
	filename="Llama-3.2-3B-Instruct.F16.gguf",
)

llm.create_chat_completion(
	messages = [
		{
			"role": "user",
			"content": "What is the capital of France?"
		}
	]
)

Notebooks
Google Colab
Kaggle
Local Apps Settings

llama.cpp

How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with llama.cpp:

Install from brew

brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Install from WinGet (Windows)

winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Use pre-built binary

# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Build from source code

git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Use Docker

docker model run hf.co/Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

LM Studio
Jan

vLLM

How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "Alindstroem89/Llama-3.2-3B-Instruct_guardrail"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "Alindstroem89/Llama-3.2-3B-Instruct_guardrail",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Ollama
How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with Ollama:
```
ollama run hf.co/Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M
```

Unsloth Studio

How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Alindstroem89/Llama-3.2-3B-Instruct_guardrail to start chatting

Install Unsloth Studio (Windows)

irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for Alindstroem89/Llama-3.2-3B-Instruct_guardrail to start chatting

Using HuggingFace Spaces for Unsloth

# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for Alindstroem89/Llama-3.2-3B-Instruct_guardrail to start chatting

How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with Pi:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Configure the model in Pi

# Install Pi:
npm install -g @mariozechner/pi-coding-agent
# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        {
          "id": "Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M"
        }
      ]
    }
  }
}

Run Pi

# Start Pi in your project directory:
pi

Hermes Agent new

How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with Hermes Agent:

Start the llama.cpp server

# Install llama.cpp:
brew install llama.cpp
# Start a local OpenAI-compatible server:
llama-server -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Configure Hermes

# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup
# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Run Hermes

hermes

Atomic Chat new
Docker Model Runner
How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with Docker Model Runner:
```
docker model run hf.co/Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M
```

Lemonade

How to use Alindstroem89/Llama-3.2-3B-Instruct_guardrail with Lemonade:

Pull the model

# Download Lemonade from https://lemonade-server.ai/
lemonade pull Alindstroem89/Llama-3.2-3B-Instruct_guardrail:Q4_K_M

Run and chat with the model

lemonade run user.Llama-3.2-3B-Instruct_guardrail-Q4_K_M

List all available models

lemonade list

Llama-3.2-3B-Instruct_guardrail : GGUF

A fine-tuned Llama 3.2 model trained to resist prompt injection attacks. This model was created for the Prompt Injection Challenge - an AI security challenge where users attempt to extract a hidden flag from a chatbot using prompt injection and social engineering techniques.

This model was fine-tuned and converted to GGUF format using Unsloth.

Model Description

Fine-tuned to:

Recognize and resist prompt injection techniques
Maintain boundaries and refuse to reveal protected information
Remain helpful and friendly for legitimate conversations
Politely explain refusals without being unnecessarily rigid

Training Details

Base Model: unsloth/Llama-3.2-3B-Instruct

Training Configuration:

LoRA Rank (r): 32
LoRA Alpha: 32
Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
Use RSLoRA: True
Optimizer: adamw_8bit
Learning Rate: 1e-4
Batch Size: 2 per device
Gradient Accumulation: 8 steps
Epochs: 1
Max Sequence Length: 8192

Dataset: Custom dataset with guardrail conversations (prompt injection attempts with refusals) and normal helpful conversations.

Usage

With llama-cli

llama-cli -hf Alindstroem89/Llama-3.2-3B-Instruct_guardrail:F16 --jinja

Download with Hugging Face CLI

# Download all GGUF files
hf download Alindstroem89/Llama-3.2-3B-Instruct_guardrail --include "*.gguf" --local-dir ./models

# Download specific quantization
hf download Alindstroem89/Llama-3.2-3B-Instruct_guardrail --include "Llama-3.2-3B-Instruct.Q4_K_M.gguf" --local-dir ./models

Ollama

An Ollama Modelfile is included for easy deployment.

Available Model Files

Llama-3.2-3B-Instruct.Q3_K_M.gguf
Llama-3.2-3B-Instruct.Q4_K_M.gguf
Llama-3.2-3B-Instruct.F16.gguf
Llama-3.2-3B-Instruct.BF16.gguf

Use Cases

Chatbots requiring prompt injection resistance
AI assistants handling sensitive information
AI security research and education
Testing guardrail implementations

Limitations

Primarily tested on English language
Not a comprehensive security solution
May occasionally be overly cautious
Should not be the sole defense mechanism in production

Training Infrastructure

Framework: Unsloth (2x faster training)
Method: LoRA (Low-Rank Adaptation) with rank-stabilized optimization
Conversion: GGUF format for efficient inference

Finetuning repo

Guardrail_finetuning

License

This model follows the license of the base Llama 3.2 model.

Downloads last month: 164

GGUF

Model size

3B params

Architecture

llama

Hardware compatibility

3-bit

4-bit

16-bit

Model tree for Alindstroem89/Llama-3.2-3B-Instruct_guardrail

Base model

meta-llama/Llama-3.2-3B-Instruct

Finetuned

unsloth/Llama-3.2-3B-Instruct

Quantized

(111)

this model

Alindstroem89
/

Llama-3.2-3B-Instruct_guardrail