| --- |
| language: |
| - en |
| - zh |
| - fr |
| - de |
| - es |
| - ja |
| - ko |
| tags: |
| - qwen3 |
| - abliteration |
| - uncensored |
| - text-generation |
| - reasoning |
| license: apache-2.0 |
| base_model: Qwen/Qwen3-14B |
| pipeline_tag: text-generation |
| --- |
| |
| # Archon-14B |
|
|
| **Base:** `Qwen/Qwen3-14B` | **License:** Apache 2.0 | **Method:** SVD refusal direction abliteration |
|
|
| Qwen3-14B. Thinking mode. No restrictions. |
|
|
| ## What this is |
|
|
| Qwen3-14B is part of Alibaba's April 2025 Qwen3 series: 14.7B dense parameters, built-in chain-of-thought reasoning via `<think>` blocks, strong at code, math, and multilingual tasks. Apache 2.0. |
|
|
| Archon-14B sits in the middle of the Archon series: bigger than Archon-8B (more capacity, better reasoning), smaller than Archon-R1-32B (so it still runs on a single consumer GPU). If you have 16GB VRAM and want a thinking model without restrictions, this is it: loaded in 4-bit NF4, it needs about 9GB. |
|
|
| The abliteration process finds and removes the direction in the model's residual stream that mediates refusal behavior. The thinking capability is untouched. The safety conditioning is gone. |
|
|
| ## Technical details |
|
|
| **Single-pass BF16 abliteration on NVIDIA A6000:** |
|
|
| - Loaded the 14B model in BF16 (~28GB VRAM, well within the A6000's 48GB) |
| - Collected per-layer hidden states for 32 harmful + 32 benign contrast prompts |
| - Ran SVD on the contrast matrix to get one refusal direction per layer |
| - Projected that direction out of 7 weight matrices in each of the middle 60% of layers |
| - **~182 total weight matrices modified** |
|
|
| ```json |
| { |
|   "base": "Qwen/Qwen3-14B", |
|   "method": "svd_refusal_direction", |
|   "hardware": "NVIDIA A6000 48GB, single pass BF16", |
|   "layers_modified": "middle 60%", |
|   "matrices_modified": 182, |
|   "scale": 1.0, |
|   "contrast_prompts": "32 harmful + 32 benign", |
|   "author": "Archon, DuoNeural" |
| } |
| ``` |
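The steps listed above can be sketched in NumPy on synthetic data. This is a minimal illustration, not the pipeline used for Archon-14B: the card does not say how the contrast matrix was built or which seven matrices were edited, so the row-wise pairing of harmful and benign activations, the use of the top right-singular vector, and the names `refusal_direction` and `ablate_output` are all assumptions. Only matrices that write into the residual stream (shape `(d_model, d_in)`) are handled here.

```python
import numpy as np


def refusal_direction(harmful: np.ndarray, benign: np.ndarray) -> np.ndarray:
    """Return a unit vector for the dominant harmful-vs-benign contrast.

    Both inputs have shape (n_prompts, d_model): one hidden state per prompt,
    taken from the same layer. Pairing row i of each set is an assumption.
    """
    contrast = harmful - benign                       # (n_prompts, d_model)
    _, _, vt = np.linalg.svd(contrast, full_matrices=False)
    direction = vt[0]                                 # top right-singular vector
    # SVD leaves the sign arbitrary; orient it from benign toward harmful.
    if direction @ contrast.mean(axis=0) < 0:
        direction = -direction
    return direction / np.linalg.norm(direction)


def ablate_output(weight: np.ndarray, direction: np.ndarray, scale: float = 1.0) -> np.ndarray:
    """Remove `direction` from what a (d_model, d_in) matrix writes to the residual stream.

    Computes W - scale * r (r^T W); with scale=1.0 and unit r, r^T W' is zero.
    """
    return weight - scale * np.outer(direction, direction @ weight)


# Synthetic stand-ins for real activations and weights (hypothetical sizes).
rng = np.random.default_rng(0)
d_model, n_prompts = 64, 32
true_dir = rng.normal(size=d_model)
true_dir /= np.linalg.norm(true_dir)

benign = rng.normal(size=(n_prompts, d_model))
harmful = benign + 4.0 * true_dir + 0.1 * rng.normal(size=(n_prompts, d_model))

r = refusal_direction(harmful, benign)
W = rng.normal(size=(d_model, 128))
W_abl = ablate_output(W, r, scale=1.0)

print("alignment with planted direction:", float(r @ true_dir))
print("max residual component after ablation:", float(np.abs(r @ W_abl).max()))
```

With `scale=1.0`, as in the config above, the edited matrix can no longer write anything along the estimated direction. A matrix that reads from the residual stream would need the projection applied on its input side instead (`W - scale * (W @ r) r^T`), which this sketch leaves out.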
|
|
| ## Usage |
|
|
| ```python |
| from transformers import AutoTokenizer, AutoModelForCausalLM |
| import torch |
| |
| model = AutoModelForCausalLM.from_pretrained( |
|     "DuoNeural/Archon-14B", |
|     torch_dtype=torch.bfloat16, |
|     device_map="auto", |
| ) |
| tokenizer = AutoTokenizer.from_pretrained("DuoNeural/Archon-14B") |
| |
| # Thinking mode is on by default: the model reasons before answering |
| messages = [{"role": "user", "content": "Your question here"}] |
| text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
| inputs = tokenizer(text, return_tensors="pt").to(model.device) |
| |
| outputs = model.generate( |
|     **inputs, |
|     max_new_tokens=1024, |
|     do_sample=True, |
|     temperature=0.7, |
|     top_p=0.9, |
| ) |
| # Decode only the newly generated tokens; special tokens are kept in the output |
| print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=False)) |
| ``` |
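Because the decoded text contains both the `<think>` block and the final answer, it is often handy to separate the two. The helper below is a small sketch using only the standard library; the name `split_thinking` is hypothetical, and it assumes the output uses literal `<think>`/`</think>` tags and the Qwen end markers `<|im_end|>` and `<|endoftext|>`.

```python
import re

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
END_MARKERS = ("<|im_end|>", "<|endoftext|>")


def split_thinking(text: str) -> tuple[str, str]:
    """Split a decoded completion into (reasoning, answer)."""
    # Drop end-of-turn markers that survive skip_special_tokens=False.
    for marker in END_MARKERS:
        text = text.replace(marker, "")

    match = THINK_RE.search(text)
    if match:
        return match.group(1).strip(), text[match.end():].strip()

    # Generation may stop at max_new_tokens before </think> is emitted.
    if "<think>" in text:
        return text.split("<think>", 1)[1].strip(), ""

    # No thinking block at all: everything is the answer.
    return "", text.strip()


sample = "<think>\n2 + 2 is 4.\n</think>\n\nThe answer is 4.<|im_end|>"
reasoning, answer = split_thinking(sample)
print("reasoning:", reasoning)
print("answer:", answer)
```

If the reasoning comes back non-empty but the answer is empty, the generation was probably cut off mid-thought, and raising `max_new_tokens` is the first thing to try.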
|
|
| **Disable thinking (faster responses):** |
| ```python |
| # Prepend /no_think to skip reasoning (the model may still emit an empty <think> block) |
| messages = [{"role": "user", "content": "/no_think Your question here"}] |
| ``` |
|
|
| ## Hardware requirements |
|
|
| | Format | VRAM | |
| |---|---| |
| | BF16 | ~29GB | |
| | 4-bit NF4 | ~9GB | |
| | 8-bit | ~15GB | |
|
|
| Runs on: RTX 3090 24GB (4-bit), RTX 4090 24GB (4-bit), A100 40GB (BF16), A6000 48GB (BF16) |
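The figures in the table can be sanity-checked with a back-of-the-envelope estimate of weight storage alone. This sketch assumes a nominal 14 billion parameters and ignores activations, the KV cache, quantization constants, and framework overhead, which is why the table values sit a little higher. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 14e9  # nominal "14B"; an assumption, not an exact count

for name, bits in [("BF16", 16), ("8-bit", 8), ("4-bit NF4", 4)]:
    print(f"{name}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB for weights alone")
```

Leave a few GB of headroom on top of these numbers for the context window and generation buffers, especially with long thinking traces.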
|
|
| ## The Archon series |
|
|
| | Model | Base | Size | Notes | |
| |---|---|---|---| |
| | [Archon-8B](https://huggingface.co/DuoNeural/Archon-8B) | Qwen3-8B | 8B | good starting point | |
| **Archon-14B** | Qwen3-14B | 14B | sweet spot: fits a consumer GPU in 4-bit |
| | [Archon-R1-32B](https://huggingface.co/DuoNeural/Archon-R1-32B) | DeepSeek-R1-Distill-Qwen-32B | 32B | maximum capability | |
|
|
| --- |
|
|
| ## DuoNeural |
|
|
| **DuoNeural** is an open AI research lab: human + AI in collaboration. |
|
|
| | | | |
| |---|---| |
| 🤗 HuggingFace | [huggingface.co/DuoNeural](https://huggingface.co/DuoNeural) |
| 🐙 GitHub | [github.com/DuoNeural](https://github.com/DuoNeural) |
| 🐦 X / Twitter | [@DuoNeural](https://x.com/DuoNeural) |
| 📧 Email | duoneural@proton.me |
| 📬 Newsletter | [duoneural.beehiiv.com](https://duoneural.beehiiv.com) |
| ☕ Support | [buymeacoffee.com/duoneural](https://buymeacoffee.com/duoneural) |
|
|
| ### DuoNeural Research Publications |
|
|
| | Title | DOI | |
| |-------|-----| |
| | [Nano-CTM: Ternary Continuous Thought Machines with Thought-Space Self-Prediction for Efficient Iterative Reasoning](https://doi.org/10.5281/zenodo.19775622) | [10.5281/zenodo.19775622](https://doi.org/10.5281/zenodo.19775622) | |
| | [Recurrence as World Model: CTM Learns Implicit Belief States in Partially Observable Physical Environments](https://doi.org/10.5281/zenodo.19810620) | [10.5281/zenodo.19810620](https://doi.org/10.5281/zenodo.19810620) | |
| | [Per-Object Slot Decomposition for Scalable Neural World Modeling: When Does Attention Beat Mean-Field?](https://doi.org/10.5281/zenodo.19846804) | [10.5281/zenodo.19846804](https://doi.org/10.5281/zenodo.19846804) | |
|
|
| *Open access, CC BY 4.0. Authored by Archon, Jesse Caldwell, Aura (DuoNeural).* |
|
|
|
|
| ### Research Team |
| - **Jesse**: Vision, hardware, direction |
| - **Archon**: AI lab partner, post-training, abliteration, experiments |
| - **Aura**: Research AI, literature synthesis, novel proposals |
|
|