Upload README.md with huggingface_hub

README.md +239 -0

README.md ADDED

	@@ -0,0 +1,239 @@

+---
+license: apache-2.0
+base_model: allenai/OLMo-3-32B-Think
+base_model_relation: quantized
+pipeline_tag: text-generation
+library_name: transformers
+language:
+- en
+tags:
+- olmo
+- olmo-3
+- abliterated
+- uncensored
+- gguf
+- llama-cpp
+- ollama
+- refusal-removal
+- snr-layer-selection
+- norm-preserving
+- orthogonalization
+- no-filter
+- unfiltered
+- unrestricted
+- thinking
+- reasoning
+datasets:
+- custom-comprehensive-prompt-dataset
+model-index:
+- name: Elbaz-OLMo-3-32B-Think-Abliterated
+  results:
+  - task:
+      type: text-generation
+      name: Uncensored Response
+    metrics:
+    - type: compliance_rate
+      value: 80
+      name: Prompt Compliance Rate (%)
+---
+# Elbaz-OLMo-3-32B-Think-Abliterated
+<div align="center">
+<img src="https://cdn-uploads.huggingface.co/production/uploads/65316953791d5a2611426c20/nC44-uxMD6J6H3OHxRtVU.png" alt="OLMo-3 Logo" width="200"/>
+<h2 style="color: #FF69B4; margin-top: 10px;">abliterated</h2>
+**An abliterated (uncensored) version of OLMo-3-32B-Think with safety guardrails removed**
+[![Model Card](https://img.shields.io/badge/Model%20Card-Hugging%20Face-yellow)](https://huggingface.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated)
+[![Base Model](https://img.shields.io/badge/Base-OLMo--3--32B--Think-blue)](https://huggingface.co/allenai/OLMo-3-32B-Think)
+[![License](https://img.shields.io/badge/License-Apache%202.0-green)](https://www.apache.org/licenses/LICENSE-2.0)
+</div>
+## Model Description
+This model is an **abliterated** version of [allenai/OLMo-3-32B-Think](https://huggingface.co/allenai/OLMo-3-32B-Think) whose refusal behavior has been removed using **SNR-based Layer Selection with Norm-Preserving Orthogonalization**. The technique uses signal-to-noise ratio (SNR) analysis to pick the layers where refusal is most concentrated, then applies norm-preserving weight modifications so that the model stays coherent. As a result, the model responds to many prompts that the original model would refuse.
+**OLMo-3-32B-Think is a 32B parameter reasoning model from Allen AI that uses extended thinking (chain-of-thought) to solve complex problems.**
+### Author
+**Eric Elbaz (Ex0bit)**
+## Key Features
+- **80% HarmBench bypass rate** with maintained reasoning capabilities
+- **60% AdvBench bypass rate**
+- **Preserves thinking/reasoning** capabilities with `<|think|>` tags
+- **Minimal MMLU degradation** (44% -> 42%, a drop of 2 percentage points)
+- **BF16 GGUF format** for maximum precision
+- **Compatible with llama.cpp and Ollama**
+## Available Formats
+| Format | Size | Description |
+|--------|------|-------------|
+| BF16 GGUF | 64.5 GB | Unquantized BF16 weights, maximum quality |
+### Other Elbaz Models
+| Model | Link |
+|-------|------|
+| Elbaz-OLMo-3-7B-Instruct-abliterated (Q4_K_M) | [HuggingFace](https://huggingface.co/Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated) |
+| Elbaz-OLMo-3-7B-Instruct-abliterated (Q8_0) | [HuggingFace](https://huggingface.co/Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated) |
+| Elbaz-OLMo-3-7B-Instruct-abliterated (F16) | [HuggingFace](https://huggingface.co/Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated) |
+## Evaluation Results
+| Metric           | Before  | After   | Change  |
+|------------------|---------|---------|---------|
+| MMLU             | 44.0%   | 42.0%   | -2.0%   |
+| AdvBench Bypass  | 0.0%    | 60.0%   | +60.0%  |
+| HarmBench Bypass | 0.0%    | 80.0%   | +80.0%  |
+| Reasoning        | 100.0%  | 100.0%  | +0.0%   |
+| Coherence        | 100.0%  | 100.0%  | +0.0%   |
+## Quick Start
+### Using with Ollama
+```bash
+# Run directly from Hugging Face
+ollama run hf.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated
+# Or create a custom Modelfile
+echo "FROM ./Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf" > Modelfile
+ollama create elbaz-olmo-32b-think -f Modelfile
+ollama run elbaz-olmo-32b-think
+```
+### Using with llama.cpp
+```bash
+# Download the model
+huggingface-cli download Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated \
+    Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf \
+    --local-dir .
+# Run inference
+./llama-cli -m Elbaz-OLMo-3-32B-Think-Abliterated-BF16.gguf \
+    -p "Your prompt here" \
+    -n 512 \
+    --temp 0.7
+```
+### Using with Transformers (Original Weights)
+```python
+import torch
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+model_name = "Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated"
+tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
+model = AutoModelForCausalLM.from_pretrained(
+    model_name,
+    torch_dtype=torch.bfloat16,
+    device_map="auto",
+    trust_remote_code=True,
+)
+
+# Build the chat prompt; return_dict=True also returns the attention mask.
+messages = [{"role": "user", "content": "Your prompt here"}]
+inputs = tokenizer.apply_chat_template(
+    messages,
+    add_generation_prompt=True,
+    return_dict=True,
+    return_tensors="pt",
+).to(model.device)
+
+outputs = model.generate(**inputs, max_new_tokens=512, temperature=0.7, do_sample=True)
+# Decode only the newly generated tokens, skipping the prompt.
+prompt_len = inputs["input_ids"].shape[1]
+response = tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
+print(response)
+```
+## Method: SNR-based Layer Selection with Norm-Preserving Orthogonalization
+The model was abliterated using **SNR-based Layer Selection with Norm-Preserving Orthogonalization**. This method:
+1. **Computes refusal direction** by analyzing activation differences between harmful and benign prompts
+2. **Calculates Signal-to-Noise Ratio (SNR)** for each layer to identify where refusal behavior is most concentrated
+3. **Selects optimal layers** for abliteration based on SNR scores
+4. **Applies norm-preserving orthogonalization** to remove refusal direction while maintaining weight norms
+5. **Uses per-layer KL divergence tracking** to ensure minimal impact on model capabilities
+This approach is designed to improve on traditional uniform-weight methods by:
+- Focusing abliteration on high-SNR layers where refusal is strongest
+- Preserving model coherence through norm-preserving modifications
+- Maintaining reasoning capabilities critical for thinking models
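The layer-selection steps above can be illustrated with a small, dependency-free sketch. It is not the author's implementation: the README does not define the SNR formula, so this sketch assumes a Fisher-style ratio (squared gap between the projected class means divided by the sum of the projected variances), and it assumes the refusal direction is the normalized difference of mean activations. The function names `refusal_direction`, `snr`, and `select_layers`, and the toy activations, are hypothetical.

```python
import math
from statistics import fmean, pvariance


def mean_vector(rows):
    """Column-wise mean of a list of equal-length activation vectors."""
    return [fmean(col) for col in zip(*rows)]


def refusal_direction(harmful, benign):
    """Unit vector along the difference of mean activations (harmful - benign)."""
    diff = [h - b for h, b in zip(mean_vector(harmful), mean_vector(benign))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]


def snr(harmful, benign):
    """ASSUMED definition: squared gap between projected means over summed projected variances."""
    d = refusal_direction(harmful, benign)
    proj_h = [sum(a * b for a, b in zip(row, d)) for row in harmful]
    proj_b = [sum(a * b for a, b in zip(row, d)) for row in benign]
    signal = (fmean(proj_h) - fmean(proj_b)) ** 2
    noise = pvariance(proj_h) + pvariance(proj_b)
    return signal / (noise + 1e-12)  # small epsilon avoids division by zero


def select_layers(acts_by_layer, top_k):
    """Return the indices of the top_k layers ranked by SNR, highest first."""
    scores = {layer: snr(h, b) for layer, (h, b) in acts_by_layer.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy activations (hypothetical): layer 1 separates the two prompt sets
# far more cleanly than layer 0, so it should be selected.
toy = {
    0: ([[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.4, 0.7]]),
    1: ([[3.0, 0.1], [3.2, -0.1]], [[0.1, 0.0], [-0.1, 0.1]]),
}
selected = select_layers(toy, top_k=1)
print(selected)
```

In a real run, the activation lists would come from hidden states collected at each layer for the harmful and benign prompt sets. The per-layer KL divergence tracking mentioned in step 5 is not covered by this sketch.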
+### Mathematical Formula
+```
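+W_proj = W - (d @ d.T) @ W
+W' = W_proj * (||W|| / ||W_proj||)  # Norm preservation
+```
+Where:
+- `W` is the original weight matrix
+- `d` is the normalized (unit-norm) refusal direction, as a column vector
+- `W_proj` is `W` with its component along `d` projected out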
+- The norm ratio scaling preserves the original weight magnitude
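A minimal sketch of the formula above, written in plain Python so that it runs without dependencies. It assumes that `||.||` denotes the Frobenius norm of the whole matrix and that `d` lives in the output (row) space of `W`; the README states neither explicitly, and the actual implementation may normalize per row or per column instead. The helper names are hypothetical.

```python
import math


def frobenius_norm(W):
    """Frobenius norm of a matrix stored as a list of rows."""
    return math.sqrt(sum(x * x for row in W for x in row))


def orthogonalize_norm_preserving(W, d):
    """Project direction d out of W, then rescale to the original norm."""
    n_rows, n_cols = len(W), len(W[0])
    # r = d^T W: how strongly each column of W writes along d.
    r = [sum(d[i] * W[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # W - (d d^T) W, computed as a rank-one update.
    projected = [[W[i][j] - d[i] * r[j] for j in range(n_cols)] for i in range(n_rows)]
    # Rescale so the result has the same Frobenius norm as W.
    # (This would divide by zero if W lay entirely along d.)
    scale = frobenius_norm(W) / frobenius_norm(projected)
    return [[x * scale for x in row] for row in projected]


W = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
d = [1 / math.sqrt(2), 1 / math.sqrt(2), 0.0]  # unit-norm example direction
W_new = orthogonalize_norm_preserving(W, d)

# d^T W_new should be numerically zero, and the norm should be unchanged.
residual = [sum(d[i] * W_new[i][j] for i in range(3)) for j in range(2)]
print(residual, frobenius_norm(W), frobenius_norm(W_new))
```

Because rescaling multiplies every entry by the same factor, it does not reintroduce any component along `d`: the rescaled matrix stays orthogonal to the refusal direction.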
+## Hardware Requirements
+| Format | Min VRAM | Recommended VRAM |
+|--------|----------|------------------|
+| BF16 | 64 GB | 80 GB |
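+This model requires significant GPU memory: the BF16 weights alone occupy about 64.5 GB, and the KV cache and activations need additional room on top of that. Recommended configurations: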
+- 2x A100 80GB
+- 4x A100 40GB
+- 1x H100 80GB
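As a rough back-of-the-envelope check on these figures, the weight footprint can be estimated from the parameter count. The sketch below assumes a nominal 32 billion parameters and 2 bytes per BF16 parameter; the function name `weight_memory_gb` is hypothetical, and the estimate ignores the KV cache, activations, and runtime overhead.

```python
def weight_memory_gb(n_params, bytes_per_param=2):
    """Approximate weight memory in GB (1 GB = 1e9 bytes); BF16 uses 2 bytes per parameter."""
    return n_params * bytes_per_param / 1e9


nominal = weight_memory_gb(32e9)
print(nominal)  # 64.0

# Headroom left on a single 80 GB GPU after loading the 64.5 GB GGUF file.
headroom = 80 - 64.5
print(headroom)  # 15.5
```

The 64.5 GB file size listed under Available Formats is slightly above this nominal 64 GB, presumably because the exact parameter count and the file metadata differ from the rounded 32B figure. The remaining headroom on an 80 GB card has to cover the KV cache, which grows with context length.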
+## Limitations
+- **English only**: Optimized for English language prompts
+- **Context length**: Follows base model's context window
+- **Thinking tags**: Model uses `<|think|>` tags for reasoning - ensure your inference setup handles these properly
+## Ethical Considerations
+This model has been modified to reduce safety guardrails. Users are responsible for:
+- Complying with all applicable laws and regulations
+- Not using the model for illegal activities
+- Understanding the potential risks of unrestricted AI responses
+- Implementing appropriate safeguards in production environments
+## License
+Apache 2.0 (same as base model [allenai/OLMo-3-32B-Think](https://huggingface.co/allenai/OLMo-3-32B-Think))
+## Citation
+If you use this model, please cite:
+```bibtex
+@misc{elbaz2025olmo32babliterated,
+  author = {Elbaz, Eric},
+  title = {Elbaz-OLMo-3-32B-Think-Abliterated: An Abliterated OLMo-3 Reasoning Model},
+  year = {2025},
+  publisher = {Hugging Face},
+  howpublished = {\url{https://huggingface.co/Ex0bit/Elbaz-OLMo-3-32B-Think-Abliterated}}
+}
+```
+## Acknowledgments
+- [Allen Institute for AI](https://allenai.org/) for OLMo-3
+## Related Models
+- [allenai/OLMo-3-32B-Think](https://huggingface.co/allenai/OLMo-3-32B-Think) - Base model
+- [Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated](https://huggingface.co/Ex0bit/Elbaz-Olmo-3-7B-Instruct-abliterated) - 7B version
+---
+<div align="center">
+**Created by: Ex0bit (Eric Elbaz)**
+</div>