---
tags:
- heretic
- uncensored
- abliterated
- gguf
license: other
base_model: Qwen/Qwen2.5-Coder-32B-Instruct
---

# Qwen2.5-Coder-32B-Instruct-heretic

Abliterated (uncensored) version of [Qwen/Qwen2.5-Coder-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-32B-Instruct), created using [Heretic](https://github.com/p-e-w/heretic) and converted to GGUF.

## Abliteration Quality

| Metric | Value |
|:-------|------:|
| Refusals | 4/100 |
| KL Divergence | 0.0728 |
| Rounds | 2 |

A lower refusal count means fewer refused prompts; a lower KL divergence means behavior closer to the original model's.

## Available Quantizations

| Quantization | File | Size |
|:-------------|:-----|-----:|
| Q8_0 | [Qwen2.5-Coder-32B-Instruct-heretic-Q8_0.gguf](./Qwen2.5-Coder-32B-Instruct-heretic-Q8_0.gguf) | 32.43 GB |
| Q6_K | [Qwen2.5-Coder-32B-Instruct-heretic-Q6_K.gguf](./Qwen2.5-Coder-32B-Instruct-heretic-Q6_K.gguf) | 25.04 GB |
| Q4_K_M | [Qwen2.5-Coder-32B-Instruct-heretic-Q4_K_M.gguf](./Qwen2.5-Coder-32B-Instruct-heretic-Q4_K_M.gguf) | 18.49 GB |

## Usage with Ollama

```bash
# Use the quantization tag you prefer:
ollama run hf.co/ThalisAI/Qwen2.5-Coder-32B-Instruct-heretic:Q8_0
```

## bf16 Weights

The full bf16 abliterated weights are available in the `bf16/` subdirectory of this repository.
## Usage with Transformers

The bf16 weights in the `bf16/` subdirectory can be loaded directly with Transformers:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ThalisAI/Qwen2.5-Coder-32B-Instruct-heretic"
tokenizer = AutoTokenizer.from_pretrained(model_id, subfolder="bf16")
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    subfolder="bf16",
    torch_dtype="auto",
    device_map="auto",
)

messages = [{"role": "user", "content": "Hello!"}]
text = tokenizer.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```

## About

This model was processed by the **Apostate** automated abliteration pipeline:

1. The source model was loaded in bf16.
2. Heretic's optimization-based abliteration was applied to remove refusal behavior.
3. The merged model was converted to GGUF format using llama.cpp.
4. Multiple quantization levels were generated.

The abliteration process uses directional ablation to remove the model's refusal directions while minimizing KL divergence from the original model's behavior on harmless prompts.
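Conceptually, directional ablation projects each hidden state onto an estimated refusal direction and subtracts that component, leaving everything orthogonal to it untouched. The sketch below illustrates only this core operation; the function name, shapes, and the pre-computed unit direction `r` are illustrative assumptions, not Heretic's actual API:

```python
import numpy as np

def ablate_direction(hidden: np.ndarray, r: np.ndarray) -> np.ndarray:
    """Remove the component of each hidden state along direction r.

    hidden: (seq_len, d_model) activations; r: (d_model,) direction vector.
    Equivalent to multiplying each row by the projection matrix (I - r r^T).
    """
    r = r / np.linalg.norm(r)              # ensure r is a unit vector
    return hidden - np.outer(hidden @ r, r)

# Toy example: pretend the refusal direction is the first coordinate axis.
h = np.random.default_rng(0).normal(size=(4, 8))
r = np.zeros(8)
r[0] = 1.0
h_ablated = ablate_direction(h, r)          # component along r is now zero
```

In practice the refusal direction is estimated from activation differences between harmful and harmless prompts, and the ablation is baked into the model's weight matrices rather than applied at inference time.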