YanLabs
/

Seed-OSS-36B-Instruct-MPOA

Text Generation

Model card Files Files and versions

Seed-OSS-36B-Instruct-MPOA / README.md

YanLabs's picture

Create README.md

5ef43be verified 21 days ago

|

history blame contribute delete

2.38 kB

	---
	license: apache-2.0
	base_model:
	- ByteDance-Seed/Seed-OSS-36B-Instruct
	pipeline_tag: text-generation
	---


	# YanLabs/Seed-OSS-36B-Instruct-MPOA


	This is an abliterated version of [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct) using the norm-preserving biprojected abliteration technique.

	⚠️ Warning: Safety guardrails and refusal mechanisms have been removed through abliteration. This model may generate harmful content and is intended for mechanistic interpretability research only.

	## Model Details

	### Model Description

	This model applies norm-preserving biprojected abliteration to remove refusal behaviors while preserving the model's original capabilities. The technique surgically removes "refusal directions" from the model's activation space without traditional fine-tuning.

	- Developed by: YanLabs
	- Model type: Causal Language Model (Transformer)
	- License: apache-2.0
	- Base model: [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)

	### Model Sources

	- Base Model: [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)
	- Abliteration Tool: [jim-plus/llm-abliteration](https://github.com/jim-plus/llm-abliteration)
	- Paper: [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration)

	## Uses

	### Intended Use

	- Research: Mechanistic interpretability studies
	- Analysis: Understanding LLM safety mechanisms
	- Development: Testing abliteration techniques

	### Out-of-Scope Use

	- ❌ Production deployments
	- ❌ User-facing applications
	- ❌ Generating harmful content for malicious purposes

	## Limitations

	- Abliteration does not guarantee complete removal of all refusals
	- May generate unsafe or harmful content
	- Model behavior may be unpredictable in edge cases
	- No explicit harm prevention mechanisms remain

	## Citation

	If you use this model in your research, please cite:

	```bibtex
	@misc{Seed-OSS-36B-Instruct-MPOA,
	author = {YanLabs},
	title = {Seed-OSS-36B-Instruct-MPOA},
	year = {2025},
	publisher = {HuggingFace},
	howpublished = {\url{https://huggingface.co/YanLabs/Seed-OSS-36B-Instruct-MPOA}},
	note = {Abliterated using norm-preserving biprojected technique}
	}