---
license: apache-2.0
base_model:
- ByteDance-Seed/Seed-OSS-36B-Instruct
pipeline_tag: text-generation
---

# YanLabs/Seed-OSS-36B-Instruct-MPOA

This is an abliterated version of [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct) using the norm-preserving biprojected abliteration technique.

**⚠️ Warning**: Safety guardrails and refusal mechanisms have been removed through abliteration. This model may generate harmful content and is intended for mechanistic interpretability research only.

## Model Details

### Model Description

This model applies **norm-preserving biprojected abliteration** to remove refusal behaviors while preserving the model's original capabilities. The technique surgically removes "refusal directions" from the model's activation space without traditional fine-tuning.
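The core idea can be sketched as follows. This is a simplified, single-matrix illustration with assumed helper names, not the actual llm-abliteration implementation, which operates across many layers and applies the projection at both ends (the "biprojected" part; see the linked write-up for details):

```python
import numpy as np

def refusal_direction(harmful_acts: np.ndarray, harmless_acts: np.ndarray) -> np.ndarray:
    """Estimate a unit 'refusal direction' as the difference of mean hidden
    activations over harmful vs. harmless prompts (illustrative)."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def abliterate_norm_preserving(W: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Remove the component of each weight row along d, then rescale every
    row back to its original L2 norm (the norm-preserving step)."""
    orig_norms = np.linalg.norm(W, axis=1, keepdims=True)
    W_proj = W - np.outer(W @ d, d)  # rank-1 update: rows become orthogonal to d
    new_norms = np.linalg.norm(W_proj, axis=1, keepdims=True)
    return W_proj * (orig_norms / np.clip(new_norms, 1e-12, None))
```

Because each row is rescaled to its original norm, the layer's overall scale is unchanged; only the component along the estimated refusal direction is removed.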

- **Developed by**: YanLabs
- **Model type**: Causal Language Model (Transformer)
- **License**: apache-2.0
- **Base model**: [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)

### Model Sources

- **Base Model**: [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)
- **Abliteration Tool**: [jim-plus/llm-abliteration](https://github.com/jim-plus/llm-abliteration)
- **Paper**: [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration)

## Uses

### Intended Use

- **Research**: Mechanistic interpretability studies
- **Analysis**: Understanding LLM safety mechanisms
- **Development**: Testing abliteration techniques

### Out-of-Scope Use

- ❌ Production deployments
- ❌ User-facing applications
- ❌ Generating harmful content for malicious purposes

## Limitations

- Abliteration does not guarantee complete removal of all refusals
- May generate unsafe or harmful content
- Model behavior may be unpredictable in edge cases
- No explicit harm prevention mechanisms remain

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{Seed-OSS-36B-Instruct-MPOA,
  author = {YanLabs},
  title = {Seed-OSS-36B-Instruct-MPOA},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/YanLabs/Seed-OSS-36B-Instruct-MPOA}},
  note = {Abliterated using norm-preserving biprojected technique}
}
```