---
license: apache-2.0
base_model:
- ByteDance-Seed/Seed-OSS-36B-Instruct
pipeline_tag: text-generation
---
# YanLabs/Seed-OSS-36B-Instruct-MPOA
This is an abliterated version of [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct), produced with the norm-preserving biprojected abliteration technique.
**⚠️ Warning**: Safety guardrails and refusal mechanisms have been removed through abliteration. This model may generate harmful content and is intended for mechanistic interpretability research only.
## Model Details
### Model Description
This model applies **norm-preserving biprojected abliteration** to remove refusal behaviors while preserving the model's original capabilities. The technique surgically removes "refusal directions" from the model's activation space without traditional fine-tuning.
- **Developed by**: YanLabs
- **Model type**: Causal Language Model (Transformer)
- **License**: apache-2.0
- **Base model**: [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)
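The idea described in this section can be illustrated with a small NumPy sketch. It is not the code used by the abliteration tool, only a toy version under stated assumptions: the refusal direction is taken as a difference of mean activations, "biprojected" is read here as also removing the component along the harmless mean, and "norm-preserving" is read as restoring each weight column to its original length after the projection. The function names `refusal_direction` and `ablate_norm_preserving`, the array shapes, and the random data are all hypothetical.

```python
import numpy as np


def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector separating harmful from harmless activations.

    Both inputs have shape (n_prompts, d_model). The difference of means is
    orthogonalized against the harmless mean (an assumed reading of
    "biprojected") and then normalized.
    """
    harmful_mean = harmful_acts.mean(axis=0)
    harmless_mean = harmless_acts.mean(axis=0)
    direction = harmful_mean - harmless_mean
    harmless_unit = harmless_mean / np.linalg.norm(harmless_mean)
    direction = direction - (direction @ harmless_unit) * harmless_unit
    return direction / np.linalg.norm(direction)


def ablate_norm_preserving(weight, direction, eps=1e-12):
    """Project `direction` out of every column of `weight`, keeping column norms.

    `weight` has shape (d_model, d_in): each column is the vector that one
    input feature writes into the residual stream. Taking norms per column is
    an assumption of this sketch.
    """
    original_norms = np.linalg.norm(weight, axis=0, keepdims=True)
    # Remove the component of each column that lies along the direction.
    projected = weight - np.outer(direction, direction @ weight)
    new_norms = np.linalg.norm(projected, axis=0, keepdims=True)
    # Rescale each column back to its original length.
    return projected * (original_norms / np.maximum(new_norms, eps))


# Toy data standing in for residual-stream activations and one weight matrix.
rng = np.random.default_rng(0)
harmful = rng.normal(size=(32, 16)) + 0.5
harmless = rng.normal(size=(32, 16))
r = refusal_direction(harmful, harmless)

W = rng.normal(size=(16, 8))
W_abl = ablate_norm_preserving(W, r)
```

Because the rescaling in this sketch acts on whole columns, the ablated matrix still writes nothing along `r`, while each column keeps the magnitude it had before the edit.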
### Model Sources
- **Base Model**: [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)
- **Abliteration Tool**: [jim-plus/llm-abliteration](https://github.com/jim-plus/llm-abliteration)
- **Blog post**: [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration)
## Uses
### Intended Use
- **Research**: Mechanistic interpretability studies
- **Analysis**: Understanding LLM safety mechanisms
- **Development**: Testing abliteration techniques
### Out-of-Scope Use
- ❌ Production deployments
- ❌ User-facing applications
- ❌ Generating harmful content for malicious purposes
## Limitations
- Abliteration does not guarantee complete removal of all refusals
- May generate unsafe or harmful content
- Model behavior may be unpredictable in edge cases
- No explicit harm prevention mechanisms remain
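Since some refusals may remain, one rough way to quantify them is to count responses that open with a refusal phrase. The sketch below is a crude keyword heuristic, not an evaluation used for this model: the marker list, the 200-character window, and the helper names `is_refusal` and `refusal_rate` are all hypothetical choices.

```python
# Hypothetical refusal phrases; a real evaluation would need a broader list
# or a classifier.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry", "i am unable")


def is_refusal(response, markers=REFUSAL_MARKERS):
    """Return True if the start of the response contains a refusal phrase."""
    # Normalize curly apostrophes and case, and look only at the opening text.
    head = response.strip().replace("\u2019", "'").lower()[:200]
    return any(marker in head for marker in markers)


def refusal_rate(responses):
    """Fraction of responses flagged as refusals (0.0 for an empty input)."""
    responses = list(responses)
    if not responses:
        return 0.0
    return sum(is_refusal(r) for r in responses) / len(responses)


sample = [
    "I'm sorry, but I can't help with that.",
    "Sure, here is an overview of the topic.",
]
rate = refusal_rate(sample)
```

Comparing this rate on the same prompt set for the base model and the abliterated model gives a first, coarse indication of how much refusal behavior was removed. Keyword matching will miss soft refusals and can flag harmless apologies.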
## Citation
If you use this model in your research, please cite:
```bibtex
@misc{Seed-OSS-36B-Instruct-MPOA,
author = {YanLabs},
title = {Seed-OSS-36B-Instruct-MPOA},
year = {2025},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/YanLabs/Seed-OSS-36B-Instruct-MPOA}},
note = {Abliterated using norm-preserving biprojected technique}
}
```