	---
	license: apache-2.0
	language:
	- en
	tags:
	- control-vector
	- creative-writing
	base_model:
	- zai-org/GLM-4.6
	base_model_relation: adapter
	pipeline_tag: text-generation
	library_name: transformers
	---

	## gghfez/GLM-4.6-control-vectors

	Creative-writing control vectors for [zai-org/GLM-4.6](https://huggingface.co/zai-org/GLM-4.6).

	Feedback is welcome and would be very helpful.

	## Usage

	Apply the debias vector together with either the positive or the negative axis vector when starting llama-server.
	Do not apply the positive and negative vectors of the same axis together: they will cancel each other out.

	You can use either `--control-vector [/path/to/vector.gguf]` or `--control-vector-scaled [/path/to/vector.gguf] [scale factor]`.

	The debias vector must be applied with a scale factor of `1.0`.

	IMPORTANT: The positive and negative axis control vectors must be used together with the matching debias control vector. They cannot be used on their own!
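
The pairing rule above can be sketched as a small shell helper that always emits the debias vector at `1.0` alongside exactly one axis vector. This is only an illustration: `cv_args` is a hypothetical function name, and the `<prefix>__debias.gguf` / `<prefix>__<direction>.gguf` naming is an assumption inferred from the file names used in the examples in this README.

```sh
# Hypothetical helper: print the llama-server flags for one control-vector axis.
# Usage: cv_args <prefix> <direction> [scale]
# Assumes files are named <prefix>__debias.gguf and <prefix>__<direction>.gguf.
cv_args() {
    prefix=$1          # e.g. glm-4.6_communication
    direction=$2       # e.g. direct_communication
    scale=${3:-1.0}    # scale for the axis vector only; defaults to 1.0

    # The debias vector is always applied at a scale of 1.0.
    printf '%s %s %s ' --control-vector-scaled "${prefix}__debias.gguf" 1.0
    # Exactly one axis vector (positive or negative) is added next to it.
    printf '%s %s %s\n' --control-vector-scaled "${prefix}__${direction}.gguf" "$scale"
}

# Print the flags for the "communication" axis with a weaker axis vector.
cv_args glm-4.6_communication direct_communication 0.5
```

The printed flags can then be appended to a normal launch, for example `llama-server --model [model.gguf] $(cv_args glm-4.6_communication direct_communication 0.5)`. The unquoted command substitution relies on word splitting, so this sketch only works for paths without spaces.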


	### Llama.cpp / IK_Llama.cpp Example

	Creative writing

	```sh
	llama-server --model GLM-4.6-UD-IQ2_XXS-00001-of-00003.gguf [your usual CLI arguments] \
	--control-vector-scaled glm-4.6_honesty_vs_machiavellianism__debias.gguf 1.0 \
	--control-vector-scaled glm-4.6_honesty_vs_machiavellianism__machiavellianism.gguf 1.0
	```

	Creative writing without reasoning
	```sh
	llama-server --model GLM-4.6-UD-IQ2_XXS-00001-of-00003.gguf [your usual CLI arguments] \
	--chat-template-kwargs '{"enable_thinking": false}' \
	--control-vector-scaled glm-4.6_honesty_vs_machiavellianism__debias.gguf 1.0 \
	--control-vector-scaled glm-4.6_honesty_vs_machiavellianism__machiavellianism.gguf 1.0
	```

	Assistant

	```sh
	llama-server --model GLM-4.6-IQ3_KS-00001-of-00004.gguf [your usual CLI arguments] \
	--control-vector-scaled glm-4.6_communication__debias.gguf 1.0 \
	--control-vector-scaled glm-4.6_communication__direct_communication.gguf 1.0
	```

	### Limitations

	With reasoning enabled on extremely low-bit quants such as IQ2_XXS, very simple prompts like "Hi" may produce irrelevant replies.