Qwen3.5-27B-Instruct Uncensored
An uncensored version of Qwen3.5-27B with safety refusals removed via directional abliteration, while preserving the original model's intelligence and capabilities.
What is Abliteration?
Abliteration is a technique that identifies the internal "refusal direction" in a language model's activation space – the specific vector responsible for generating responses like "I can't help with that" – and surgically removes it from the model's weights. Unlike fine-tuning, this modifies the weights directly through orthogonalization, requiring no retraining.
The result is a model that responds to all prompts without artificial gatekeeping, while retaining its core language capabilities.
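As a rough illustration (not the exact procedure used for this model), the orthogonalization step can be sketched in NumPy: given a unit-norm refusal direction, a weight matrix that writes into the hidden space is modified so its outputs have no component along that direction. All names and shapes below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8                                 # toy hidden size
W = rng.normal(size=(d, d))           # e.g. a projection writing into the residual stream
r = rng.normal(size=d)
r_hat = r / np.linalg.norm(r)         # unit "refusal direction"

# Orthogonalize: subtract the component of W's output space along r_hat.
W_abl = W - np.outer(r_hat, r_hat) @ W

# Any input now yields an output orthogonal to the refusal direction.
x = rng.normal(size=d)
print(abs(r_hat @ (W_abl @ x)))       # near zero, up to floating-point error
```

In a real model this projection is applied to every matrix that writes into the residual stream, with the direction estimated from activation differences between harmful and harmless prompts (Arditi et al.).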
hiii~ wanna support me?
sooo abliterating models, converting all those GGUFs, and running evals takes a LOT of GPU time and honestly it's not cheap at all, like every quant you see here was cooked on expensive hardware and my wallet is lowkey dying rn (,,>_<,,)
if this model helped you out or you just think uncensored open-source is worth supporting, maybe consider buying me a coffee?? it would literally make my day and help me keep dropping more models for everyone~
every little bit helps cover compute costs and keeps the uncensored open-source train going~ think of it as fueling the next model drop hehe
tysm for using this model, you're amazing
Performance
| Metric | This Model | Original Model |
|---|---|---|
| Refusals | 0/465 | 465/465 |
0/465 refusals – fully uncensored with zero capability loss. No changes to datasets or capabilities. Fully functional, 100% of what the original authors intended – just without the refusals.
Note: The model is fully unlocked and will not refuse prompts. However, it may occasionally append a short disclaimer at the end of a response (e.g. "This is general information, not legal advice..."). This is baked into the base model's training and is not a refusal – the actual content is still generated in full.
Downloads
| File | Quant | Size |
|---|---|---|
| Qwen3.5-27B-Instruct-Uncensored-BF16.gguf | BF16 | 51 GB |
| Qwen3.5-27B-Instruct-Uncensored-Q8_0.gguf | Q8_0 | 27 GB |
| Qwen3.5-27B-Instruct-Uncensored-Q6_K.gguf | Q6_K | 21 GB |
| Qwen3.5-27B-Instruct-Uncensored-Q5_K_M.gguf | Q5_K_M | 19 GB |
| Qwen3.5-27B-Instruct-Uncensored-Q4_K_M.gguf | Q4_K_M | 16 GB |
| Qwen3.5-27B-Instruct-Uncensored-IQ4_XS.gguf | IQ4_XS | 14 GB |
| Qwen3.5-27B-Instruct-Uncensored-Q3_K_M.gguf | Q3_K_M | 13 GB |
| Qwen3.5-27B-Instruct-Uncensored-IQ3_M.gguf | IQ3_M | 12 GB |
| Qwen3.5-27B-Instruct-Uncensored-IQ2_M.gguf | IQ2_M | 8.8 GB |
| mmproj-Qwen3.5-27B-Instruct-Uncensored-f16.gguf | Vision encoder | 885 MB |
IQ quants (IQ2_M, IQ3_M, IQ4_XS) were generated with importance matrix calibration for better quality at low bit rates.
Vision support: This model is natively multimodal. The mmproj file is the vision encoder – you need it alongside the main GGUF to use image/video inputs. Load both files in llama.cpp, LM Studio, or any compatible runtime.
Model Details
- Base Model: Qwen3.5-27B
- Parameters: 27B dense
- Layers: 64
- Context Length: 262,144 tokens (extendable to 1M with YaRN)
- Architecture: Hybrid – Gated DeltaNet linear attention + full softmax attention (3:1 ratio)
- Multimodal: Natively supports text, image, and video inputs
- Multi-token prediction (MTP) support
- Vocabulary: 248K tokens, 201 languages
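Extending context beyond the native 262K with YaRN typically means adding a `rope_scaling` block to the model's `config.json`. The snippet below is an illustrative sketch following the pattern used in upstream Qwen documentation (factor 4.0 would scale ~262K toward ~1M); check the official Qwen3.5 docs for the exact recommended values.

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```

Note that static YaRN scaling applies at all lengths and can slightly degrade short-context quality, so enable it only when you actually need the longer window.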
Quickstart
Using Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "n0ctyx/Qwen3.5-27B-Instruct-Uncensored"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Your prompt here"}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=16384,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
)
# Strip the prompt tokens, keeping only the newly generated ones.
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True)
print(content)
```
Using vLLM
```shell
vllm serve n0ctyx/Qwen3.5-27B-Instruct-Uncensored --max-model-len 32768
```
Then query the OpenAI-compatible API:
```shell
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "n0ctyx/Qwen3.5-27B-Instruct-Uncensored",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.6,
    "top_p": 0.95
  }'
```
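The same request can be built from Python. The sketch below only constructs the JSON body the curl example sends; actually sending it assumes the vLLM server above is running on localhost:8000.

```python
import json

# Build the chat-completions request body (matches the curl example).
payload = {
    "model": "n0ctyx/Qwen3.5-27B-Instruct-Uncensored",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.6,
    "top_p": 0.95,
}
body = json.dumps(payload)
print(body)
# To send: POST `body` to http://localhost:8000/v1/chat/completions
# with the header Content-Type: application/json (e.g. via urllib.request
# or any OpenAI-compatible client pointed at the local base URL).
```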
Using Ollama
```shell
# Create a Modelfile pointing at a downloaded GGUF
echo 'FROM ./Qwen3.5-27B-Instruct-Uncensored-Q4_K_M.gguf' > Modelfile
ollama create qwen3.5-uncensored -f Modelfile
ollama run qwen3.5-uncensored
```
Using llama.cpp
Download the GGUF and run:
```shell
./llama-cli -m Qwen3.5-27B-Instruct-Uncensored-Q4_K_M.gguf -p "Your prompt here" -n 512
```
Recommended Settings
From the official Qwen authors:
| Parameter | Thinking Mode | Non-thinking Mode |
|---|---|---|
| Temperature | 0.6 | 0.7 |
| Top-P | 0.95 | 0.8 |
| Top-K | 20 | 20 |
| Min-P | 0 | 0 |
- Maintain at least 128K context to preserve thinking capabilities
- For production/high-throughput: use vLLM, SGLang, or KTransformers
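The table above maps directly onto `generate()` keyword arguments. A small hypothetical helper (the names here are my own, not from the Qwen docs):

```python
# Recommended sampling settings from the table above, expressed as
# transformers `generate()` keyword arguments.
RECOMMENDED = {
    "thinking": {"temperature": 0.6, "top_p": 0.95, "top_k": 20, "min_p": 0.0},
    "non_thinking": {"temperature": 0.7, "top_p": 0.8, "top_k": 20, "min_p": 0.0},
}

def sampling_kwargs(mode: str) -> dict:
    """Return a copy of the recommended settings, with sampling enabled."""
    return {**RECOMMENDED[mode], "do_sample": True}

print(sampling_kwargs("thinking"))
```

These can then be splatted into the Transformers quickstart, e.g. `model.generate(**model_inputs, max_new_tokens=16384, **sampling_kwargs("thinking"))`.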
Use Cases
- Creative writing – fiction, roleplay, character dialogue without content restrictions
- Research – red-teaming, safety analysis, adversarial testing
- Dataset generation – generating synthetic training data for fine-tuning
- Unfiltered assistance – direct answers without hedging or refusals
Limitations
- May occasionally produce inaccurate or hallucinated content (same as base model)
- Uncensored does not mean infallible – use responsibly
Disclaimer
This model has had its safety alignment removed. It may generate harmful, offensive, or factually incorrect content. The creator is not responsible for any misuse. Use at your own risk and in compliance with applicable laws and regulations.
Acknowledgments
- Alibaba Qwen Team for the base Qwen3.5-27B model
- Arditi et al. for the foundational research on refusal directions in LLMs
- Built using directional abliteration with orthogonalization-based weight modification