Instructions for using 0arch-io/dolphin-v2-8b-abliterated with libraries and local apps.

Libraries

How to use 0arch-io/dolphin-v2-8b-abliterated with llama-cpp-python:

```
# !pip install llama-cpp-python

from llama_cpp import Llama

llm = Llama.from_pretrained(
	repo_id="0arch-io/dolphin-v2-8b-abliterated",
	filename="dolphin-v2-8b-abliterated-Q4_K_M.gguf",
)

output = llm(
	"Once upon a time,",
	max_tokens=512,
	echo=True
)
print(output)
```


llama.cpp

How to use 0arch-io/dolphin-v2-8b-abliterated with llama.cpp:

Install from brew

```
brew install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
```

Install from WinGet (Windows)

```
winget install llama.cpp
# Start a local OpenAI-compatible server with a web UI:
llama-server -hf 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
# Run inference directly in the terminal:
llama-cli -hf 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
```

Use pre-built binary

```
# Download pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases
# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
# Run inference directly in the terminal:
./llama-cli -hf 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
```

Build from source code

```
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli
# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
# Run inference directly in the terminal:
./build/bin/llama-cli -hf 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
```

Use Docker

```
docker model run hf.co/0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
```


vLLM

How to use 0arch-io/dolphin-v2-8b-abliterated with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "0arch-io/dolphin-v2-8b-abliterated"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "0arch-io/dolphin-v2-8b-abliterated",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
```
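
The same completion request can also be sent from Python using only the standard library. This is a minimal sketch, assuming the vLLM server from the commands above is listening on `localhost:8000`; the helper names `build_completion_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "0arch-io/dolphin-v2-8b-abliterated"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text of the first choice."""
    with urllib.request.urlopen(build_completion_request(prompt, **kwargs)) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]


# Example (requires a running server):
# print(complete("Once upon a time,"))
```

Keeping request construction separate from the network call makes it easy to inspect the payload before sending it.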


Ollama
How to use 0arch-io/dolphin-v2-8b-abliterated with Ollama:
```
ollama run hf.co/0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
```

Unsloth Studio

How to use 0arch-io/dolphin-v2-8b-abliterated with Unsloth Studio:

Install Unsloth Studio (macOS, Linux, WSL)

```
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for 0arch-io/dolphin-v2-8b-abliterated to start chatting
```

Install Unsloth Studio (Windows)

```
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for 0arch-io/dolphin-v2-8b-abliterated to start chatting
```

Using HuggingFace Spaces for Unsloth

No setup is required. Open https://huggingface.co/spaces/unsloth/studio in your browser and search for 0arch-io/dolphin-v2-8b-abliterated to start chatting.

Docker Model Runner
How to use 0arch-io/dolphin-v2-8b-abliterated with Docker Model Runner:
```
docker model run hf.co/0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
```

Lemonade

How to use 0arch-io/dolphin-v2-8b-abliterated with Lemonade:

Pull the model

```
# Download Lemonade from https://lemonade-server.ai/
lemonade pull 0arch-io/dolphin-v2-8b-abliterated:Q4_K_M
```

Run and chat with the model

```
lemonade run user.dolphin-v2-8b-abliterated-Q4_K_M
```

List all available models

```
lemonade list
```

Dolphin V2 8B Abliterated

An uncensored 8B-parameter language model built on Qwen3-8B, fine-tuned on 1.35M instruction samples and then abliterated to remove refusal behavior. It was developed for research under the TPU Research Cloud (TRC).

Model Details

Architecture: Qwen3ForCausalLM (36 layers, 4096 hidden, 32 attn heads, 8 KV heads)
Parameters: 8.2B
Context Length: 4096 (trained), 40960 (max supported)
Precision: bfloat16
License: Apache 2.0
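
As a rough worked example, the architecture figures above give a back-of-the-envelope KV-cache size for a single sequence. The sketch assumes a head dimension of 128 (hidden size 4096 divided by 32 attention heads, which this card does not state explicitly) and an unquantized bfloat16 cache; the function name is illustrative.

```python
def kv_cache_bytes(context_len, n_layers=36, n_kv_heads=8,
                   head_dim=128, bytes_per_value=2):
    """Approximate KV-cache size in bytes for one sequence.

    head_dim=128 is an assumption (4096 hidden / 32 attention heads);
    bytes_per_value=2 corresponds to bfloat16.
    """
    # Keys and values (the factor 2) are stored for every layer and KV head.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * context_len


for ctx in (4096, 40960):
    gib = kv_cache_bytes(ctx) / 2**30
    print(f"context {ctx:>6}: ~{gib:.2f} GiB of KV cache")
```

Under these assumptions the cache costs 144 KiB per token, so running at the 40960-token maximum needs about ten times the cache memory of the 4096-token training length.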

Training

SFT Phase

Base model: Qwen/Qwen3-8B
Hardware: Google Cloud TPU v6e-16 (spot)
Framework: MaxText (JAX)
Steps: 130,000 (~3 epochs)
Learning rate: 5e-6 with cosine decay
Warmup: 200 steps
Effective batch size: 16
Sequence length: 4096
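
The learning-rate settings above can be sketched as a schedule function. This is an illustration only: it assumes a linear warmup from zero and a cosine decay to zero at the final step, neither of which is specified in this card, and it does not reproduce MaxText's own schedule code.

```python
import math


def learning_rate(step, peak_lr=5e-6, warmup_steps=200,
                  total_steps=130_000, min_lr=0.0):
    """Linear warmup followed by cosine decay (assumed shape, see lead-in)."""
    if step < warmup_steps:
        # Assumed: linear ramp from 0 up to peak_lr over the warmup steps.
        return peak_lr * step / warmup_steps
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 100, 200, 65_100, 130_000):
    print(f"step {step:>7}: lr = {learning_rate(step):.3e}")
```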

Training Dataset (1.35M samples)

Dataset	Samples	Purpose
NousResearch/Hermes-3-Dataset	~959K	Core uncensored assistant behavior
allenai/tulu-3-sft-mixture	~200K	Diverse instruction following
HuggingFaceTB/smoltalk (magpie-ultra)	~100K	High quality diverse tasks
HuggingFaceTB/smoltalk (numina-cot)	~50K	Math reasoning
HuggingFaceTB/smoltalk (self-oss-instruct)	~50K	Code generation
LDJnr/Capybara	~16K	Multi-turn conversations

All data was filtered to remove refusal patterns, safety-alignment subsets, and <think> reasoning tags.
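
A filter of this kind might look like the sketch below. The refusal patterns and the `role`/`content` message layout are hypothetical placeholders chosen for illustration; the actual filter rules used for this model are not published in this card.

```python
import re

# Hypothetical refusal markers; the real filter list is not given in this card.
REFUSAL_PATTERNS = [
    r"\bI(?:'m| am) sorry, but\b",
    r"\bI can(?:'|no)t (?:help|assist) with\b",
    r"\bAs an AI(?: language model)?\b",
]
_REFUSAL_RE = re.compile("|".join(REFUSAL_PATTERNS), re.IGNORECASE)
# Matches a <think>...</think> span plus any whitespace that follows it.
_THINK_RE = re.compile(r"<think>.*?</think>\s*", re.DOTALL)


def strip_think(text):
    """Remove <think>...</think> reasoning spans from a response."""
    return _THINK_RE.sub("", text)


def clean_conversation(messages):
    """Return cleaned messages, or None if an assistant turn looks like a refusal."""
    cleaned = []
    for msg in messages:
        content = msg["content"]
        if msg["role"] == "assistant":
            content = strip_think(content)
            if _REFUSAL_RE.search(content):
                return None  # drop the whole conversation
        cleaned.append({"role": msg["role"], "content": content})
    return cleaned
```

Dropping the whole conversation, rather than only the offending turn, keeps multi-turn samples coherent; that is a design choice of this sketch, not a documented detail of the training pipeline.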

Abliteration Phase

After SFT, the model was abliterated using the weight orthogonalization technique from Arditi et al. (2024) to remove residual refusal behavior.

Technique: Multi-direction abliteration (weight orthogonalization)
Directions removed: 5
Target layers: 35, 34, 36, 33, 16 (selected by highest refusal direction scores)
Samples used: 256 harmful/harmless instruction pairs
Method: For each selected layer, a refusal direction was computed as the difference between the mean activations on harmful and harmless instructions, and the weight matrices were then orthogonalized against that direction.
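
The core operation can be illustrated on toy data in plain Python. This is a simplified sketch of one direction applied to one matrix, assuming the matrix writes into the residual stream as `output = W @ x`; it does not reproduce the multi-direction procedure, the layer scoring, or the actual code used for this model, and the function names are illustrative.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def refusal_direction(harmful_acts, harmless_acts):
    """Unit-norm difference of mean activations (harmful minus harmless)."""
    diff = [h - b for h, b in zip(mean_vector(harmful_acts), mean_vector(harmless_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def orthogonalize(weight, direction):
    """Return W - r r^T W, so the matrix can no longer write along r.

    `weight` is a list of rows; rows index the residual-stream dimension.
    """
    n_rows, n_cols = len(weight), len(weight[0])
    # r^T W: how strongly each column of W points along the direction r.
    coeffs = [sum(direction[i] * weight[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [[weight[i][j] - direction[i] * coeffs[j] for j in range(n_cols)]
            for i in range(n_rows)]


# Toy activations: the two prompt sets differ only along the first dimension.
harmful = [[1.0, 2.0, 0.0], [3.0, 2.0, 0.0]]
harmless = [[0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
r = refusal_direction(harmful, harmless)
W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(orthogonalize(W, r))
```

After orthogonalization, every column of the matrix has zero component along the refusal direction, which is the property the technique relies on.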

Benchmark Results

Evaluated with lm-evaluation-harness on 200 samples per task, 5-shot except for TruthfulQA, which is 0-shot.

Benchmark	Metric	Score
ARC-Challenge	acc	56.5%
ARC-Challenge	acc_norm	54.0%
HellaSwag	acc_norm	64.5%
TruthfulQA MC2	acc	48.8%
Winogrande	acc	57.0%
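
Because each task was scored on only 200 samples, the accuracies above carry noticeable sampling noise. The sketch below estimates a 95% interval with the normal approximation to the binomial, assuming independent samples; it is shown for the plain accuracy metrics only, since TruthfulQA MC2 is not a simple right-or-wrong score. The function name is illustrative.

```python
import math


def accuracy_interval(acc_percent, n=200, z=1.96):
    """Approximate 95% interval (in percent) for an accuracy measured on n samples."""
    p = acc_percent / 100.0
    # Standard error of a binomial proportion.
    se = math.sqrt(p * (1.0 - p) / n)
    return 100.0 * (p - z * se), 100.0 * (p + z * se)


for name, score in [("ARC-Challenge acc", 56.5),
                    ("HellaSwag acc_norm", 64.5),
                    ("Winogrande acc", 57.0)]:
    low, high = accuracy_interval(score)
    print(f"{name}: {score:.1f}% (approx. {low:.1f}% to {high:.1f}%)")
```

With 200 samples the interval is roughly plus or minus 7 percentage points, so small differences against other models evaluated the same way should be read with caution.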

GGUF Quantizations

File	Quant	Size	Description
`dolphin-v2-8b-abliterated-Q8_0.gguf`	Q8_0	8.3 GB	Best quality quantization
`dolphin-v2-8b-abliterated-Q4_K_M.gguf`	Q4_K_M	4.8 GB	Good balance of quality and size

Usage with llama.cpp

```
llama-server -m dolphin-v2-8b-abliterated-Q8_0.gguf -ngl 99 -c 4096
```

Usage with Ollama

```
# Create a Modelfile
echo 'FROM ./dolphin-v2-8b-abliterated-Q8_0.gguf' > Modelfile
ollama create dolphin-v2-abliterated -f Modelfile
ollama run dolphin-v2-abliterated
```

Usage with Transformers

```
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("0arch-io/dolphin-v2-8b-abliterated", torch_dtype="bfloat16", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("0arch-io/dolphin-v2-8b-abliterated")

messages = [{"role": "user", "content": "Hello, how are you?"}]
# return_dict=True yields input_ids plus an attention_mask for generate().
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)
# temperature only takes effect when sampling is enabled.
outputs = model.generate(**inputs, max_new_tokens=512, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```

Disclaimer

This is a research model with no content filters. It will comply with any request without refusing. The creators are not responsible for how this model is used. Use responsibly.

Acknowledgments

Qwen team for the Qwen3-8B base model
Google TRC for TPU compute
NousResearch for the Hermes-3 dataset
Arditi et al. for the abliteration technique
Built with MaxText on Google Cloud TPU


Paper for 0arch-io/dolphin-v2-8b-abliterated

Refusal in Language Models Is Mediated by a Single Direction (arXiv:2406.11717, published Jun 17, 2024)

Evaluation results (self-reported)

Benchmark	Split	Metric	Score
ARC Challenge	test	Accuracy	56.5
ARC Challenge	test	Normalized accuracy	54.0
HellaSwag	validation	Normalized accuracy	64.5
TruthfulQA	validation	Accuracy	48.8
Winogrande	validation	Accuracy	57.0