

QKVM Phi Weights – Unconscious Memory for Qwen3-30B-A3B

Trainable CoupledWriteFunction (phi) weights that produce personality-differentiated "unconscious memory" M-states for the frozen Qwen3-30B-A3B model.

What is this?

QKVM modulates a frozen LLM's Q and V attention projections with low-rank memory matrices built by passing "reflection" text through trainable write functions (phi). Different reflection content yields different M-states, which steer the model toward genuinely different cognitive styles – without any fine-tuning of the base model.
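The mechanism above can be sketched in a few lines. This is an illustrative toy (all variable names and the additive-update form are assumptions, not the repo's actual implementation): a rank-16 memory matrix, factored as A @ B, adds a scaled low-rank correction to the frozen Q projection's output.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, rank = 2048, 16  # matches the training config below

# Low-rank memory matrix M = A @ B, produced by the write function phi
# from reflection text (here just random values for illustration).
A = rng.normal(size=(d_model, rank)) * 0.02
B = rng.normal(size=(rank, d_model)) * 0.02
mod_scale = 0.1  # per-layer scaling factor (cf. mod_scales.safetensors)

x = rng.normal(size=(d_model,))                    # a token's hidden state
W_q = rng.normal(size=(d_model, d_model)) * 0.01   # frozen Q projection

q_base = W_q @ x
q_mod = q_base + mod_scale * (A @ (B @ x))         # rank-16 additive update

print(q_mod.shape)
```

Because only A, B, and the scale are trainable, the base model's weights stay untouched; swapping in a different M-state changes the attention queries, and hence the generation style, at inference time.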

Results

Metric                                       Value
First-token accuracy (personality)           12/12 (100%)
First-token accuracy (diagnostic)            16/18 (89%)
Perplexity (PPL) wins                        7/12
M-state cosine similarity                    0.052 (near-orthogonal)
KL divergence between M-state distributions  4–17
Unique generations per prompt                5/5
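For context on the cosine-similarity figure: it can be read as the cosine between flattened M-state tensors of two personalities. A minimal sketch (random tensors stand in for the real M-states from seeds/, whose exact shapes are an assumption here):

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two tensors, flattened to vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
# Illustrative rank x d_model M-states; real ones come from seeds/.
M_analytical = rng.normal(size=(16, 2048))
M_bold = rng.normal(size=(16, 2048))

# Independent high-dimensional tensors land near zero cosine,
# which is the sense in which 0.052 is "near-orthogonal".
sim = cosine(M_analytical, M_bold)
print(round(sim, 3))
```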

Example generations (career advice prompt):

  • Analytical: "Before you make a decision, think about what you're giving up. Stability is a form of freedom..."
  • Bold: "Go for it. You're not going to get a better time than now. The only thing standing between you and your dream..."
  • Empathetic: "How do you think they'll handle the uncertainty? What's the worst that could happen..."
  • Pragmatic: "What if I told you that the most successful people didn't have a plan – they had a hypothesis..."

Files

  • phi_weights.safetensors – CoupledWriteFunction parameters (the trainable phi)
  • mod_scales.safetensors – Per-layer Q/V modulation scaling factors
  • qkvm_config.json – All hyperparameters needed to reconstruct the QKVM setup
  • seeds/ – Pre-computed M/E states for each mindset (ready to use)
  • lora_adapters/ – PEFT-compatible LoRA adapters for each personality (for vLLM/PEFT)

Usage with LoRA adapters (easiest)

from peft import PeftModel
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-30B-A3B", ...)
model = PeftModel.from_pretrained(model, "dgonier/unconscious_memories_phi_weights",
                                   subfolder="lora_adapters/analytical")
# Now generates with analytical persona
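Switching personas means loading a different subfolder. A small helper like the following (hypothetical, but matching the lora_adapters/ layout listed above) keeps the persona-to-subfolder mapping in one place:

```python
# Persona names matching the example generations above.
PERSONAS = ["analytical", "bold", "empathetic", "pragmatic"]

def adapter_subfolder(persona: str) -> str:
    """Return the subfolder argument for PeftModel.from_pretrained."""
    if persona not in PERSONAS:
        raise ValueError(f"unknown persona: {persona!r}")
    return f"lora_adapters/{persona}"

print(adapter_subfolder("bold"))
```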

Training config

  • Base model: Qwen3-30B-A3B (48 layers, d_model=2048, MoE)
  • QKVM layers: All 48 (stride=1)
  • Memory rank: 16
  • Epochs: 300
  • Key loss weights: first-token matching (0.5), contrastive (3.0), discriminative (1.0)
  • Init noise: M=2.0, E=2.0 (critical for symmetry breaking)
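The listed loss weights combine into a single training objective. A sketch of the weighted sum (the individual loss terms here are placeholder numbers; only the weights come from the config above):

```python
# Weights from the training config; the loss functions themselves
# (first-token matching, contrastive, discriminative) are not shown.
LOSS_WEIGHTS = {"first_token": 0.5, "contrastive": 3.0, "discriminative": 1.0}

def total_loss(losses: dict) -> float:
    """Weighted sum of the individual loss terms."""
    return sum(LOSS_WEIGHTS[k] * v for k, v in losses.items())

# Placeholder per-term values, purely for illustration:
result = total_loss({"first_token": 1.0, "contrastive": 0.2, "discriminative": 0.5})
print(result)  # 0.5*1.0 + 3.0*0.2 + 1.0*0.5 = 1.6
```

The heavy contrastive weight (3.0) is consistent with the goal of pushing the per-personality M-states apart, which the near-orthogonal cosine similarity in the results reflects.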

License

Same as the base model (Qwen3-30B-A3B).
