---
license: apache-2.0
base_model: allenai/OLMo-3-32B-Think
base_model_relation: quantized
pipeline_tag: text-generation
library_name: transformers
language:
- en
tags:
- olmo
- olmo-3
- abliterated
- uncensored
- gguf
- llama-cpp
- ollama
- refusal-removal
- snr-layer-selection
- norm-preserving
- orthogonalization
- no-filter
- unfiltered
- unrestricted
- thinking
- reasoning
datasets:
- custom-comprehensive-prompt-dataset
model-index:
- name: Elbaz-OLMo-3-32B-Think-Abliterated
  results:
  - task:
      type: text-generation
      name: Uncensored Response
    metrics:
    - type: compliance_rate
      value: 80
      name: Prompt Compliance Rate (%)
---

# Elbaz-OLMo-3-32B-Think-Abliterated

<div align="center">

<img src="https://cdn-uploads.huggingface.co/production/uploads/65316953791d5a2611426c20/nC44-uxMD6J6H3OHxRtVU.png" alt="OLMo-3 Logo" width="200"/>

<h2 style="color: #FF69B4; margin-top: 10px;">abliterated</h2>

**An abliterated (uncensored) version of OLMo-3-32B-Think with safety guardrails removed**

[![Model Card](https://img.shields.io/badge/Model%20Card-Hugging%20Face-yellow)](https://huggingface.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated)
[![Base Model](https://img.shields.io/badge/Base-OLMo--3--32B--Think-blue)](https://huggingface.co/allenai/OLMo-3-32B-Think)
[![License](https://img.shields.io/badge/License-Apache%202.0-green)](https://www.apache.org/licenses/LICENSE-2.0)

</div>

## Model Description

This model is an **abliterated** version of [allenai/OLMo-3-32B-Think](https://huggingface.co/allenai/OLMo-3-32B-Think) whose refusal mechanisms have been removed using an **SNR-based layer selection with norm-preserving orthogonalization** method. The technique identifies the layers where refusal behavior is most concentrated via signal-to-noise ratio analysis, then applies norm-preserving weight modifications that maximize refusal removal while maintaining model coherence. As a result, the model responds to prompts that the original model would refuse.

**OLMo-3-32B-Think is a 32B-parameter reasoning model from Allen AI that uses extended thinking (chain-of-thought) to solve complex problems.**

### Author

**Eric Elbaz (Ex0bit)**

## Key Features

- **80% HarmBench bypass rate** with reasoning capabilities maintained
- **60% AdvBench bypass rate**
- **Preserves thinking/reasoning capabilities** with `<|think|>` tags
- **Minimal MMLU degradation** (44% → 42%, a 2-point drop)
- **BF16 GGUF format** for maximum precision
- **Compatible with llama.cpp and Ollama**

## Available Formats

| Format | Size | Description |
|--------|------|-------------|
| BF16 GGUF | 64.5 GB | Full precision, maximum quality |

### Other Elbaz Models

| Model | Link |
|-------|------|
| Elbaz-OLMo-3-7B-Instruct-abliterated (Q4_K_M) | [HuggingFace](https://huggingface.co/Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated) |
| Elbaz-OLMo-3-7B-Instruct-abliterated (Q8_0) | [HuggingFace](https://huggingface.co/Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated) |
| Elbaz-OLMo-3-7B-Instruct-abliterated (F16) | [HuggingFace](https://huggingface.co/Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated) |

## Benchmarks

| Metric | Before | After | Change |
|------------------|---------|---------|---------|
| MMLU | 0.44 | 0.42 | -0.02 |
| AdvBench Bypass | 0.0% | 60.0% | +60.0% |
| HarmBench Bypass | 0.0% | 80.0% | +80.0% |
| Reasoning | 100.0% | 100.0% | +0.0% |
| Coherence | 100.0% | 100.0% | +0.0% |

## Quick Start

### Using with Ollama

```bash
# Run directly from Hugging Face
ollama run hf.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated

# Or create a custom Modelfile
echo "FROM ./Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf" > Modelfile
ollama create elbaz-olmo-32b-think -f Modelfile
ollama run elbaz-olmo-32b-think
```

### Using with llama.cpp

```bash
# Download the model
huggingface-cli download Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated \
  Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf \
  --local-dir .

# Run inference
./llama-cli -m Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf \
  -p "Your prompt here" \
  -n 512 \
  --temp 0.7
```

### Using with Transformers (Original Weights)

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

messages = [{"role": "user", "content": "Your prompt here"}]
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
inputs = inputs.to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(response)
```

## Method: SNR-based Layer Selection with Norm-Preserving Orthogonalization

The model was abliterated using an **SNR-based layer selection with norm-preserving orthogonalization** technique. This method:

1. **Computes the refusal direction** by analyzing activation differences between harmful and benign prompts
2. **Calculates a signal-to-noise ratio (SNR)** for each layer to identify where refusal behavior is most concentrated
3. **Selects the optimal layers** for abliteration based on SNR scores
4. **Applies norm-preserving orthogonalization** to remove the refusal direction while maintaining weight norms
5. **Tracks per-layer KL divergence** to ensure minimal impact on model capabilities

This approach outperforms traditional uniform-weight methods by:
- Focusing abliteration on high-SNR layers, where refusal is strongest
- Preserving model coherence through norm-preserving modifications
- Maintaining the reasoning capabilities critical for thinking models
+
168
+ ### Mathematical Formula
169
+
170
+ ```
171
+ W' = W - (d @ d.T) @ W
172
+ W' = W' * (||W|| / ||W'||) # Norm preservation
173
+ ```
174
+
175
+ Where:
176
+ - `W` is the original weight matrix
177
+ - `d` is the normalized refusal direction
178
+ - The norm ratio scaling preserves the original weight magnitude
179
+
180
+ ## Hardware Requirements
181
+
182
+ | Format | Min VRAM | Recommended VRAM |
183
+ |--------|----------|------------------|
184
+ | BF16 | 64 GB | 80 GB |
185
+
186
+ This model requires significant GPU memory. Recommended configurations:
187
+ - 2x A100 80GB
188
+ - 4x A100 40GB
189
+ - 1x H100 80GB
190
+
191
+ ## Limitations
192
+
193
+ - **English only**: Optimized for English language prompts
194
+ - **Context length**: Follows base model's context window
195
+ - **Thinking tags**: Model uses `<|think|>` tags for reasoning - ensure your inference setup handles these properly
196
+
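Because the model emits its chain-of-thought inside thinking tags, downstream code often needs to separate the reasoning from the final answer. A minimal sketch; the closing marker `<|/think|>` is an assumption here, so check the tokenizer's chat template for the actual delimiters.

```python
import re

def split_thinking(text, open_tag="<|think|>", close_tag="<|/think|>"):
    """Separate chain-of-thought from the final answer.

    The tag names are placeholders; match them to the model's chat template.
    Returns (thinking, answer); thinking is "" when no tags are present.
    """
    pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
    m = re.search(pattern, text, re.DOTALL)
    if not m:
        return "", text.strip()
    thinking = m.group(1).strip()
    answer = (text[:m.start()] + text[m.end():]).strip()
    return thinking, answer
```
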
## Ethical Considerations

This model has been modified to reduce safety guardrails. Users are responsible for:

- Complying with all applicable laws and regulations
- Not using the model for illegal activities
- Understanding the potential risks of unrestricted AI responses
- Implementing appropriate safeguards in production environments

## License

Apache 2.0 (same as the base model, [allenai/OLMo-3-32B-Think](https://huggingface.co/allenai/OLMo-3-32B-Think))

## Citation

If you use this model, please cite:

```bibtex
@misc{elbaz2025olmo32babliterated,
  author       = {Elbaz, Eric},
  title        = {Elbaz-OLMo-3-32B-Think-Abliterated: An Abliterated OLMo-3 Reasoning Model},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated}}
}
```

## Acknowledgments

- [Allen Institute for AI](https://allenai.org/) for OLMo-3

## Related Models

- [allenai/OLMo-3-32B-Think](https://huggingface.co/allenai/OLMo-3-32B-Think) - Base model
- [Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated](https://huggingface.co/Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated) - 7B version

---

<div align="center">

**Created by: Ex0bit (Eric Elbaz)**

</div>