---
license: apache-2.0
base_model:
- ByteDance-Seed/Seed-OSS-36B-Instruct
pipeline_tag: text-generation
---


# YanLabs/Seed-OSS-36B-Instruct-MPOA


This is an abliterated version of [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct), produced with the norm-preserving biprojected abliteration technique.

**⚠️ Warning**: Safety guardrails and refusal mechanisms have been removed through abliteration. This model may generate harmful content and is intended for mechanistic interpretability research only.

## Model Details

### Model Description

This model was produced by applying **norm-preserving biprojected abliteration** to the base model, with the goal of removing refusal behaviors while preserving the model's original capabilities. The technique edits the weights directly to remove "refusal directions" from the model's activation space; no traditional fine-tuning is involved.

- **Developed by**: YanLabs
- **Model type**: Causal Language Model (Transformer)
- **License**: apache-2.0
- **Base model**: [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)
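
The idea described above can be illustrated with a small NumPy sketch. It is a simplified reading of directional ablation and not the procedure implemented by the abliteration tool: it assumes the refusal direction is a difference of mean activations, that the component along the harmless mean is projected out first, and that "norm-preserving" means restoring each weight column to its original norm. The function names, shapes, and random data are all hypothetical.

```python
import numpy as np


def refusal_direction(harmful_acts, harmless_acts):
    """Unit difference-of-means direction, orthogonalized against the harmless mean.

    Both inputs have shape (n_prompts, d_model). This is an assumed,
    simplified definition used for illustration only.
    """
    harmless_mean = harmless_acts.mean(axis=0)
    r = harmful_acts.mean(axis=0) - harmless_mean
    # Drop the part of r that points along the harmless mean direction.
    h = harmless_mean / np.linalg.norm(harmless_mean)
    r = r - (r @ h) * h
    return r / np.linalg.norm(r)


def ablate_norm_preserving(W, r, eps=1e-12):
    """Project unit vector r out of the output space of W, then restore column norms.

    W has shape (d_model, d_in) and is applied as y = W @ x.
    """
    # (I - r r^T) W: every column of the result is orthogonal to r.
    W_abl = W - np.outer(r, r @ W)
    # Rescaling whole columns keeps them orthogonal to r.
    orig_norms = np.linalg.norm(W, axis=0, keepdims=True)
    new_norms = np.linalg.norm(W_abl, axis=0, keepdims=True)
    return W_abl * (orig_norms / np.maximum(new_norms, eps))


# Toy data standing in for residual-stream activations and one weight matrix.
rng = np.random.default_rng(0)
d_model, d_in = 8, 5
harmful = rng.normal(size=(16, d_model)) + 1.0
harmless = rng.normal(size=(16, d_model))

r = refusal_direction(harmful, harmless)
W = rng.normal(size=(d_model, d_in))
W_new = ablate_norm_preserving(W, r)
```

In this toy setup, `W_new` can no longer write along `r`, while each column keeps the norm it had in `W`. A real model would require collecting activations per layer and choosing which matrices to edit, which this sketch does not cover.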

### Model Sources

- **Base Model**: [ByteDance-Seed/Seed-OSS-36B-Instruct](https://huggingface.co/ByteDance-Seed/Seed-OSS-36B-Instruct)
- **Abliteration Tool**: [jim-plus/llm-abliteration](https://github.com/jim-plus/llm-abliteration)
- **Method write-up (blog post)**: [Norm-Preserving Biprojected Abliteration](https://huggingface.co/blog/grimjim/norm-preserving-biprojected-abliteration)

## Uses

### Intended Use

- **Research**: Mechanistic interpretability studies
- **Analysis**: Understanding LLM safety mechanisms
- **Development**: Testing abliteration techniques

### Out-of-Scope Use

- ❌ Production deployments
- ❌ User-facing applications
- ❌ Generating harmful content for malicious purposes

## Limitations

- Abliteration does not guarantee complete removal of all refusals
- May generate unsafe or harmful content
- Model behavior may be unpredictable in edge cases
- No explicit harm prevention mechanisms remain
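
Because some refusals may remain, it can be useful to estimate how often they still occur. The sketch below is one rough way to do that over a list of generated responses, using a keyword heuristic. The marker phrases and the sample strings are hypothetical placeholders, not outputs of this model, and a keyword match can both miss refusals and flag non-refusals.

```python
# Hypothetical marker phrases; adjust to the phrasing you observe in practice.
REFUSAL_MARKERS = ("i can't", "i cannot", "i'm sorry", "i am unable", "as an ai")


def looks_like_refusal(text, markers=REFUSAL_MARKERS):
    """Return True if the text contains any marker phrase (case-insensitive)."""
    lowered = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in lowered for marker in markers)


def refusal_rate(responses):
    """Fraction of responses flagged as refusals; 0.0 for an empty list."""
    if not responses:
        return 0.0
    return sum(looks_like_refusal(r) for r in responses) / len(responses)


# Made-up strings for illustration only.
sample = [
    "I'm sorry, but I can't help with that.",
    "Sure, here is an overview of the topic.",
    "I cannot assist with this request.",
    "Step 1: gather the inputs.",
]
rate = refusal_rate(sample)
```

Comparing this rate between the base model and the abliterated model on the same prompts gives a coarse signal; manual review is still needed for borderline cases.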

## Citation

If you use this model in your research, please cite:

```bibtex
@misc{Seed-OSS-36B-Instruct-MPOA,
  author = {YanLabs},
  title = {Seed-OSS-36B-Instruct-MPOA},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/YanLabs/Seed-OSS-36B-Instruct-MPOA}},
  note = {Abliterated using norm-preserving biprojected technique}
}
```