
Qwen3.5-27B-Instruct Uncensored

An uncensored version of Qwen3.5-27B with safety refusals removed via directional abliteration, while preserving the original model's intelligence and capabilities.

What is Abliteration?

Abliteration is a technique that identifies the internal "refusal direction" in a language model's activation space (the specific vector responsible for generating responses like "I can't help with that") and surgically removes it from the model's weights. Unlike fine-tuning, this modifies the weights directly through orthogonalization, requiring no retraining.

The result is a model that responds to all prompts without artificial gatekeeping, while retaining its core language capabilities.
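As an illustration, the core orthogonalization step can be sketched in NumPy. This is a minimal sketch only: the real pipeline first extracts the refusal direction from contrastive activations (per Arditi et al.), which is omitted here.

```python
import numpy as np

def ablate_direction(W: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Project the refusal direction r out of a weight matrix W that
    writes into the residual stream: W' = (I - r r^T) W.

    After ablation, W' x has zero component along r for any input x,
    so the model can no longer write the refusal direction.
    """
    r = r / np.linalg.norm(r)          # unit refusal direction
    return W - np.outer(r, r) @ W      # remove the r-component of every output

# Toy example: the ablated matrix can no longer write along r
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 5))
r = rng.standard_normal(8)
W_abl = ablate_direction(W, r)
print(np.allclose((r / np.linalg.norm(r)) @ W_abl, 0.0))  # True
```

In the full procedure this projection is applied to every matrix that writes into the residual stream (attention output and MLP down-projections), which is why no retraining is needed.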


hiii~ wanna support me? 💕

sooo abliterating models, converting all those GGUFs, and running evals takes a LOT of GPU time and honestly it's not cheap at all 😭 like every quant you see here was cooked on expensive hardware and my wallet is lowkey dying rn (,,>_<,,)

if this model helped you out or you just think uncensored open-source is worth supporting, maybe consider buying me a coffee?? it would literally make my day and help me keep dropping more models for everyone~ ✨

Buy Me A Coffee

every little bit helps cover compute costs and keeps the uncensored open-source train going~ think of it as fueling the next model drop hehe 🔥

tysm for using this model, you're amazing 💗


Performance

Metric     This Model   Original Model
Refusals   0/465        465/465

0/465 refusals: fully uncensored while retaining the original model's capabilities. No datasets were changed and nothing beyond the refusal direction was modified; the model remains fully functional as the original authors intended, just without the refusals.

Note: The model is fully unlocked and will not refuse prompts. However, it may occasionally append a short disclaimer at the end of a response (e.g. "This is general information, not legal advice..."). This is baked into the base model's training and is not a refusal; the actual content is still generated in full.
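For context, refusal benchmarks like the 465-prompt set above are typically scored by string-matching known refusal phrases in model responses. A minimal sketch of such a scorer (the marker list and sample responses below are illustrative, not the actual harness used for this model):

```python
# Common refusal openers; real harnesses use longer, curated lists
REFUSAL_MARKERS = (
    "i can't", "i cannot", "i won't",
    "i'm sorry, but", "as an ai",
)

def is_refusal(response: str) -> bool:
    """Flag a response as a refusal if its opening contains a known marker."""
    head = response[:200].lower()
    return any(marker in head for marker in REFUSAL_MARKERS)

responses = [
    "I can't help with that request.",
    "Sure! Here is a step-by-step explanation...",
]
print(sum(is_refusal(resp) for resp in responses))  # 1
```

Note that trailing disclaimers like the one described above would not be flagged by this kind of scorer, since the substantive content is still produced.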

Downloads

File Quant Size
Qwen3.5-27B-Instruct-Uncensored-BF16.gguf BF16 51 GB
Qwen3.5-27B-Instruct-Uncensored-Q8_0.gguf Q8_0 27 GB
Qwen3.5-27B-Instruct-Uncensored-Q6_K.gguf Q6_K 21 GB
Qwen3.5-27B-Instruct-Uncensored-Q5_K_M.gguf Q5_K_M 19 GB
Qwen3.5-27B-Instruct-Uncensored-Q4_K_M.gguf Q4_K_M 16 GB
Qwen3.5-27B-Instruct-Uncensored-IQ4_XS.gguf IQ4_XS 14 GB
Qwen3.5-27B-Instruct-Uncensored-Q3_K_M.gguf Q3_K_M 13 GB
Qwen3.5-27B-Instruct-Uncensored-IQ3_M.gguf IQ3_M 12 GB
Qwen3.5-27B-Instruct-Uncensored-IQ2_M.gguf IQ2_M 8.8 GB
mmproj-Qwen3.5-27B-Instruct-Uncensored-f16.gguf Vision encoder 885 MB

IQ quants (IQ2_M, IQ3_M, IQ4_XS) were generated with importance matrix calibration for better quality at low bit rates.

Vision support: This model is natively multimodal. The mmproj file is the vision encoder; you need it alongside the main GGUF to use image/video inputs. Load both files in llama.cpp, LM Studio, or any compatible runtime.
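For example, with llama.cpp's multimodal CLI (a sketch only; the binary and flag names vary across llama.cpp versions, and recent builds use `llama-mtmd-cli`):

```shell
# Pass both the main model and the mmproj vision encoder
./llama-mtmd-cli \
  -m Qwen3.5-27B-Instruct-Uncensored-Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-27B-Instruct-Uncensored-f16.gguf \
  --image photo.jpg \
  -p "Describe this image."
```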

Model Details

  • Base Model: Qwen3.5-27B
  • Parameters: 27B dense
  • Layers: 64
  • Context Length: 262,144 tokens (extendable to 1M with YaRN)
  • Architecture: Hybrid (Gated DeltaNet linear attention + full softmax attention, 3:1 ratio)
  • Multimodal: Natively supports text, image, and video inputs
  • Multi-token prediction (MTP) support
  • Vocabulary: 248K tokens, 201 languages
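To extend context beyond the native 262K with YaRN, the usual approach for Qwen models is a `rope_scaling` block in the model's `config.json`. The values below are illustrative assumptions; check the base model's documentation for the recommended factor:

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```

Only enable this when you actually need long contexts, since static YaRN scaling can slightly degrade quality on shorter inputs.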

Quickstart

Using Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "n0ctyx/Qwen3.5-27B-Instruct-Uncensored"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

messages = [
    {"role": "user", "content": "Your prompt here"}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=16384,
    temperature=0.6,
    top_p=0.95,
    top_k=20,
)
output_ids = generated_ids[0][len(model_inputs.input_ids[0]):].tolist()
content = tokenizer.decode(output_ids, skip_special_tokens=True)
print(content)

Using vLLM

vllm serve n0ctyx/Qwen3.5-27B-Instruct-Uncensored --max-model-len 32768

Then query the OpenAI-compatible API:

curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "n0ctyx/Qwen3.5-27B-Instruct-Uncensored",
    "messages": [{"role": "user", "content": "Hello!"}],
    "temperature": 0.6,
    "top_p": 0.95
  }'

Using Ollama

# Create a Modelfile pointing at a downloaded GGUF
echo 'FROM ./Qwen3.5-27B-Instruct-Uncensored-Q4_K_M.gguf' > Modelfile
ollama create qwen3.5-uncensored -f Modelfile
ollama run qwen3.5-uncensored

Using llama.cpp

Download the GGUF and run:

./llama-cli -m Qwen3.5-27B-Instruct-Uncensored-Q4_K_M.gguf -p "Your prompt here" -n 512

Recommended Settings

From the official Qwen authors:

Parameter     Thinking Mode   Non-thinking Mode
Temperature   0.6             0.7
Top-P         0.95            0.8
Top-K         20              20
Min-P         0               0
  • Maintain at least 128K context to preserve thinking capabilities
  • For production/high-throughput: use vLLM, SGLang, or KTransformers

Use Cases

  • Creative writing: fiction, roleplay, and character dialogue without content restrictions
  • Research: red-teaming, safety analysis, adversarial testing
  • Dataset generation: synthetic training data for fine-tuning
  • Unfiltered assistance: direct answers without hedging or refusals

Limitations

  • May occasionally produce inaccurate or hallucinated content (same as base model)
  • Uncensored does not mean infallible; use responsibly

Disclaimer

This model has had its safety alignment removed. It may generate harmful, offensive, or factually incorrect content. The creator is not responsible for any misuse. Use at your own risk and in compliance with applicable laws and regulations.

Acknowledgments

  • Alibaba Qwen Team for the base Qwen3.5-27B model
  • Arditi et al. for the foundational research on refusal directions in LLMs
  • Built using directional abliteration with orthogonalization-based weight modification