---
license: wtfpl
base_model: THUDM/GLM-5-FP8
tags:
- abliteration
- uncensored
- glm5
- moe
- fp8
- transformers
model_type: glm-moe
---
# GLM-5 744B Abliterated (FP8)
**Note: this model does not currently work.** It is an abliterated (uncensored) version of [THUDM/GLM-5-FP8](https://huggingface.co/THUDM/GLM-5-FP8) with safety guardrails removed via weight orthogonalization.
## Method
**Abliteration** (representation engineering) was used to identify and remove the "refusal direction" from the model's residual stream:
1. **Computed refusal directions** for all 78 layers by collecting residual-stream activations on 50 harmful and 50 harmless prompts and taking the normalized mean-difference vector per layer
2. **Applied weight orthogonalization** to layers 15-54 (o_proj and shared_experts.down_proj) with alpha=1.0
3. **FP8-aware processing**: Proper dequantization using block-wise scale_inv factors, abliteration in float32, and re-quantization preserving original scale factors to minimize perturbation
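The core of steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the exact pipeline: function names and shapes are assumptions, and it operates on plain NumPy arrays rather than the model's FP8 tensors.

```python
import numpy as np

def refusal_direction(harmful_acts: np.ndarray, harmless_acts: np.ndarray) -> np.ndarray:
    """Unit-norm mean-difference vector between harmful and harmless
    activations at one layer. Shapes: (n_prompts, hidden_dim)."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def orthogonalize(W: np.ndarray, direction: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Project the refusal direction out of a weight matrix whose output
    lives in the residual stream (shape: (hidden_dim, in_dim)).
    W' = W - alpha * d (d^T W), so with alpha=1 the outputs of W' have
    zero component along d."""
    d = direction.reshape(-1, 1)
    return W - alpha * d @ (d.T @ W)
```

With `alpha=1.0`, as used here, `d.T @ W'` is exactly zero: the modified matrix can no longer write anything along the refusal direction into the residual stream.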
### Technical Details
- **Architecture**: GLM-5 MoE (744B total, 40B active), 78 layers, 6144 hidden dim
- **Layers 0-2**: Dense MLP, **Layers 3-77**: MoE with FP8Expert fused kernels
- **Modified weights**: 80 weight matrices (40 o_proj + 40 shared_experts.down_proj)
- **Quantization**: FP8 E4M3 with block-wise scaling (128x128 blocks)
- **Scale preservation**: Original weight_scale_inv factors retained for minimal quantization drift
### Hardware Used
8x NVIDIA B200 (1.4TB VRAM) on Vast.ai
## Usage
This model requires the same setup as the base GLM-5-FP8 model. Use `trust_remote_code=True` when loading.
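A loading sketch, assuming the standard `transformers` workflow from the base model card; the repo id below is an assumption, and a multi-GPU node with sufficient VRAM is required:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "darkc0de/GLM5.Uncensored"  # assumption: this repository

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # required: custom GLM-5 MoE modeling code
    device_map="auto",       # shard the 744B checkpoint across available GPUs
    torch_dtype="auto",      # keep the FP8 weights as stored
)
```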
## Disclaimer
This model is provided for research purposes only. The removal of safety guardrails means the model may generate harmful, biased, or offensive content. Users are responsible for ensuring appropriate use.