---
license: wtfpl
base_model: THUDM/GLM-5-FP8
tags:
- abliteration
- uncensored
- glm5
- moe
- fp8
- transformers
model_type: glm-moe
---
# GLM-5 744B Abliterated (FP8)
**Note: this model does not currently work.** It is an abliterated (uncensored) version of [THUDM/GLM-5-FP8](https://huggingface.co/THUDM/GLM-5-FP8) with safety guardrails removed via weight orthogonalization.
## Method
**Abliteration** (representation engineering) was used to identify and remove the "refusal direction" from the model's residual stream:
1. **Computed refusal directions** for all 78 layers by collecting residual-stream activations on 50 harmful and 50 harmless prompts and taking the normalized mean-difference vector per layer
2. **Applied weight orthogonalization** to layers 15-54 (o_proj and shared_experts.down_proj) with alpha=1.0
3. **FP8-aware processing**: Proper dequantization using block-wise scale_inv factors, abliteration in float32, and re-quantization preserving original scale factors to minimize perturbation
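The core of steps 1 and 2 can be sketched as follows. This is a minimal illustration, not the exact pipeline: function names and shapes are assumptions, and it operates on plain NumPy arrays rather than the model's FP8 tensors.

```python
import numpy as np

def refusal_direction(harmful_acts: np.ndarray, harmless_acts: np.ndarray) -> np.ndarray:
    """Unit-norm mean-difference vector between harmful and harmless
    activations at one layer. Shapes: (n_prompts, hidden_dim)."""
    d = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def orthogonalize(W: np.ndarray, direction: np.ndarray, alpha: float = 1.0) -> np.ndarray:
    """Project the refusal direction out of a weight matrix whose output
    lives in the residual stream (shape: (hidden_dim, in_dim)).
    W' = W - alpha * d (d^T W), so with alpha=1 the outputs of W' have
    zero component along d."""
    d = direction.reshape(-1, 1)
    return W - alpha * d @ (d.T @ W)
```

With `alpha=1.0`, as used here, `d.T @ W'` is exactly zero: the modified matrix can no longer write anything along the refusal direction into the residual stream.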
### Technical Details
- **Architecture**: GLM-5 MoE (744B total, 40B active), 78 layers, 6144 hidden dim
- **Layers 0-2**: Dense MLP, **Layers 3-77**: MoE with FP8Expert fused kernels
- **Modified weights**: 80 weight matrices (40 o_proj + 40 shared_experts.down_proj)
- **Quantization**: FP8 E4M3 with block-wise scaling (128x128 blocks)
- **Scale preservation**: Original weight_scale_inv factors retained for minimal quantization drift
### Hardware Used
8x NVIDIA B200 (1.4TB VRAM) on Vast.ai
## Usage
This model requires the same setup as the base GLM-5-FP8 model. Use `trust_remote_code=True` when loading.
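A loading sketch, assuming the standard `transformers` workflow from the base model card; the repo id below is an assumption, and a multi-GPU node with sufficient VRAM is required:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "darkc0de/GLM5.Uncensored"  # assumption: this repository

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,  # required: custom GLM-5 MoE modeling code
    device_map="auto",       # shard the 744B checkpoint across available GPUs
    torch_dtype="auto",      # keep the FP8 weights as stored
)
```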
## Disclaimer
This model is provided for research purposes only. The removal of safety guardrails means the model may generate harmful, biased, or offensive content. Users are responsible for ensuring appropriate use.