Upload folder using huggingface_hub
- README.md +62 -0
- lora_adapters/analytical/adapter_config.json +12 -0
- lora_adapters/analytical/adapter_model.safetensors +3 -0
- lora_adapters/bold/adapter_config.json +12 -0
- lora_adapters/bold/adapter_model.safetensors +3 -0
- lora_adapters/empathetic/adapter_config.json +12 -0
- lora_adapters/empathetic/adapter_model.safetensors +3 -0
- lora_adapters/pragmatic/adapter_config.json +12 -0
- lora_adapters/pragmatic/adapter_model.safetensors +3 -0
- mod_scales.safetensors +3 -0
- phi_weights.safetensors +3 -0
- qkvm_config.json +53 -0
- seeds/analytical.safetensors +3 -0
- seeds/bold.safetensors +3 -0
- seeds/ec1_upstream.safetensors +3 -0
- seeds/ec2_reframe.safetensors +3 -0
- seeds/ec3_missing.safetensors +3 -0
- seeds/ec4_decompose.safetensors +3 -0
- seeds/ec5_triage.safetensors +3 -0
- seeds/ec6_multifactor.safetensors +3 -0
- seeds/empathetic.safetensors +3 -0
- seeds/pragmatic.safetensors +3 -0
README.md
ADDED
@@ -0,0 +1,62 @@

# QKVM Phi Weights — Unconscious Memory for Qwen3-30B-A3B

Trainable CoupledWriteFunction (phi) weights that produce personality-differentiated "unconscious memory" M-states for the frozen [Qwen3-30B-A3B](https://huggingface.co/Qwen/Qwen3-30B-A3B) model.
## What is this?

QKVM modulates a frozen LLM's Q and V attention projections using low-rank memory matrices built by processing "reflection" text through trainable write functions (phi). Different reflection content produces different M-states, which cause the model to generate text with genuinely different cognitive styles — without any fine-tuning of the base model.
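The modulation idea above can be sketched in a few lines. This is an illustrative toy in plain Python with tiny dimensions: the names (`W_q`, `A`, `B`, `mod_scale`), the dummy values, and the exact update rule `W_q_eff = W_q + mod_scale * M` are assumptions for exposition, not this repo's actual code.

```python
# Toy sketch of low-rank Q-projection modulation (names are illustrative).
# A rank-r memory M = A @ B, written by phi from reflection text, shifts the
# frozen projection without ever changing the base weights.

def matmul(X, Y):
    """Naive matrix product of two nested-list matrices."""
    cols = list(zip(*Y))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in X]

d, r = 4, 2                        # hidden size / memory rank (2048 and 16 here)
W_q = [[float(i == j) for j in range(d)] for i in range(d)]   # frozen projection
A = [[0.1] * r for _ in range(d)]  # low-rank factors produced by phi (dummy values)
B = [[0.1] * d for _ in range(r)]
M = matmul(A, B)                   # d x d "unconscious memory" M-state, rank r

mod_scale = 0.5                    # per-layer scale, cf. mod_scales.safetensors
W_q_eff = [[w + mod_scale * m for w, m in zip(wr, mr)] for wr, mr in zip(W_q, M)]
```

A different reflection text would yield different factors and hence a different effective projection, which is the whole effect: the base model stays frozen.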
## Results

| Metric | Value |
|--------|-------|
| First-token accuracy (personality) | **12/12 (100%)** |
| First-token accuracy (diagnostic) | **16/18 (89%)** |
| Perplexity (PPL) wins | 7/12 |
| M-state cosine similarity | 0.052 (near-orthogonal) |
| KL divergence between M-state distributions | 4–17 |
| Unique generations per prompt | 5/5 |
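For intuition on the cosine-similarity row: two flattened M-states are compared as vectors, and a value near 0 means they point in nearly unrelated directions. A minimal pure-Python illustration (the vectors below are dummies, not real M-states):

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

identical = cosine([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])  # same direction: 1.0
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])           # unrelated directions: 0.0
```

At 0.052 the learned M-states sit close to the orthogonal case, i.e. each mindset occupies an almost independent subspace.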
### Example generations (career-advice prompt)

- **Analytical**: "Before you make a decision, think about what you're giving up. Stability is a form of freedom..."
- **Bold**: "Go for it. You're not going to get a better time than now. The only thing standing between you and your dream..."
- **Empathetic**: "How do you think they'll handle the uncertainty? What's the worst that could happen..."
- **Pragmatic**: "What if I told you that the most successful people didn't have a plan — they had a hypothesis..."
## Files

- `phi_weights.safetensors` — CoupledWriteFunction parameters (the trainable phi)
- `mod_scales.safetensors` — Per-layer Q/V modulation scaling factors
- `qkvm_config.json` — All hyperparameters needed to reconstruct the QKVM setup
- `seeds/` — Pre-computed M/E states for each mindset (ready to use)
- `lora_adapters/` — PEFT-compatible LoRA adapters for each personality (for vLLM/PEFT)
## Usage with LoRA adapters (easiest)

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-30B-A3B", ...)
model = PeftModel.from_pretrained(
    model,
    "dgonier/unconscious_memories_phi_weights",
    subfolder="lora_adapters/analytical",
)
# Now generates with the analytical persona
```
## Training config

- **Base model**: Qwen3-30B-A3B (48 layers, d_model=2048, MoE)
- **QKVM layers**: All 48 (stride=1)
- **Memory rank**: 16
- **Epochs**: 300
- **Key loss weights**: first-token matching (0.5), contrastive (3.0), discriminative (1.0)
- **Init noise**: M=2.0, E=2.0 (critical for symmetry breaking)
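Reading the loss weights above concretely: the objective is presumably a weighted sum of the individual terms. The sketch below only takes the lambda values from this repo; the raw per-term loss values are dummies for illustration.

```python
# Weighted total loss using the lambdas listed above (raw values are dummies).
weights = {"first_token": 0.5, "contrastive": 3.0, "discriminative": 1.0}
raw     = {"first_token": 1.2, "contrastive": 0.4, "discriminative": 0.9}

# total = 0.5*1.2 + 3.0*0.4 + 1.0*0.9
total = sum(weights[k] * raw[k] for k in weights)
```

The contrastive term's large weight (3.0) is what pushes the per-mindset M-states apart, consistent with the near-orthogonal similarity reported above.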
## License

Same as the base model (Qwen3-30B-A3B).
lora_adapters/analytical/adapter_config.json
ADDED
@@ -0,0 +1,12 @@

{
  "r": 16,
  "lora_alpha": 16,
  "target_modules": [
    "q_proj",
    "v_proj"
  ],
  "bias": "none",
  "peft_type": "LORA",
  "task_type": "CAUSAL_LM",
  "base_model_name_or_path": "Qwen/Qwen3-30B-A3B"
}
lora_adapters/analytical/adapter_model.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:59ad0b72151e9bc718632c37c62c55d31b5175ec6c2edd05ac8138bd485ea14c
size 13395064
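The `*.safetensors` entries in this commit are Git LFS pointer files rather than the weights themselves: three `key value` lines naming the real blob and its size. They can be read with a few lines of stdlib Python (a sketch, not an official LFS tool; the sample text is the pointer above):

```python
# Parse a Git LFS pointer (the text stored in-repo for each *.safetensors file).
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:59ad0b72151e9bc718632c37c62c55d31b5175ec6c2edd05ac8138bd485ea14c
size 13395064"""

def parse_lfs_pointer(text):
    """Return the pointer's key/value fields as a dict."""
    return dict(line.split(" ", 1) for line in text.strip().splitlines())

info = parse_lfs_pointer(pointer_text)
# info["oid"] names the payload by SHA-256; info["size"] is its size in bytes.
```

So each adapter weighs about 13 MB on disk, while `phi_weights.safetensors` (size 6720307488) is roughly 6.7 GB.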
lora_adapters/bold/adapter_config.json
ADDED
@@ -0,0 +1,12 @@

{
  "r": 16,
  "lora_alpha": 16,
  "target_modules": [
    "q_proj",
    "v_proj"
  ],
  "bias": "none",
  "peft_type": "LORA",
  "task_type": "CAUSAL_LM",
  "base_model_name_or_path": "Qwen/Qwen3-30B-A3B"
}
lora_adapters/bold/adapter_model.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:21710036ed575f9b72c62176d6702187af5785a5da7a380936960695eebe60c3
size 13395064
lora_adapters/empathetic/adapter_config.json
ADDED
@@ -0,0 +1,12 @@

{
  "r": 16,
  "lora_alpha": 16,
  "target_modules": [
    "q_proj",
    "v_proj"
  ],
  "bias": "none",
  "peft_type": "LORA",
  "task_type": "CAUSAL_LM",
  "base_model_name_or_path": "Qwen/Qwen3-30B-A3B"
}
lora_adapters/empathetic/adapter_model.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:61684b4a6c58110a667ad4e4a792113dce058c78b61a75e57464eabd1e822ac5
size 13395064
lora_adapters/pragmatic/adapter_config.json
ADDED
@@ -0,0 +1,12 @@

{
  "r": 16,
  "lora_alpha": 16,
  "target_modules": [
    "q_proj",
    "v_proj"
  ],
  "bias": "none",
  "peft_type": "LORA",
  "task_type": "CAUSAL_LM",
  "base_model_name_or_path": "Qwen/Qwen3-30B-A3B"
}
lora_adapters/pragmatic/adapter_model.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:3eb129d19c1004a4eefc6e767c69233b41827cb664b9a09d7a2ea10f5802d65f
size 13395064
mod_scales.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:6e1049dab872ad347bc340cc75ad71a034893961db04d5e3dd1c4ea5a04f39dd
size 7232
phi_weights.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:906b862a815648e4729300ef1d5f213f65cacffc33a83d9f11cf390230df45dd
size 6720307488
qkvm_config.json
ADDED
@@ -0,0 +1,53 @@

{
  "base_model": "Qwen/Qwen3-30B-A3B",
  "base_model_local": "/home/ubuntu/models/Qwen3-30B-A3B",
  "architecture": "qkvm_v9",
  "description": "QKVM phi weights for unconscious memory modulation on Qwen3-30B-A3B. These are trainable CoupledWriteFunction parameters that produce personality-differentiated M-states (memory matrices) which modulate Q and V projections in the frozen base model.",
  "training": {
    "epochs": 300,
    "memory_rank": 16,
    "mod_scale_init": 1.0,
    "qkvm_layer_stride": 1,
    "m_init_noise": 2.0,
    "e_init_noise": 2.0,
    "global_max_norm": 5.0,
    "e_global_max_norm": 6.5,
    "lambda_first_token": 0.5,
    "lambda_disc": 1.0,
    "lambda_contrast": 3.0,
    "contrastive_warmup": 0,
    "no_think": true,
    "n_exposure": 5
  },
  "model_config": {
    "num_hidden_layers": 48,
    "hidden_size": 2048,
    "num_attention_heads": 32,
    "num_key_value_heads": 4,
    "head_dim": 128
  },
  "results": {
    "p_sim": 0.062,
    "ppl_wins": "7/12",
    "ft_accuracy_personality": "12/12",
    "ft_accuracy_diagnostic": "16/18",
    "unique_generations": "5/5",
    "kl_between_mstates": "4-17"
  },
  "mindsets": {
    "personality": [
      "analytical",
      "bold",
      "empathetic",
      "pragmatic"
    ],
    "diagnostic": [
      "ec1_upstream",
      "ec2_reframe",
      "ec3_missing",
      "ec4_decompose",
      "ec5_triage",
      "ec6_multifactor"
    ]
  }
}
seeds/analytical.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:eb458633f46ad77654fa882a78b7ad71b8ecd097a715aa5fe44b3fbfbdcab3e4
size 12598864
seeds/bold.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:0eb53002ee0e9176f7f030fefc0336027183aeae5eebc1ebdafb86b82f15dbf1
size 12598864
seeds/ec1_upstream.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:1bd28690433f76230c503f7f8f05c1bc7eb5976d8537fab869cfbeb60cd9cabe
size 12598864
seeds/ec2_reframe.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:68d8a2d125aea08a20981209d95759e19dd815ca1fe60afd2a3aafce8c285fcb
size 12598864
seeds/ec3_missing.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:36133bb99fb01af1d2991fd0b75366d5b76927944af3180527db96fa88b6a824
size 12598864
seeds/ec4_decompose.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:a9f93bea72e3071c5fd9b35cd8c756d1091e63043c1d00c7e37fa1e7c4635d9d
size 12598864
seeds/ec5_triage.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:caca0148208291d60796dfaca83496d4bffddd1ad0411cae05e14a96b9565da6
size 12598864
seeds/ec6_multifactor.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:af39ef4cf730f9fea65c0f336cb7efadb04a76b87b14e7c52232bf98aacc6c63
size 12598864
seeds/empathetic.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:2a84b4d0e176f50b5a7302f2ac4c5b5e19da51fba6502320b4890b7ca1a71058
size 12598864
seeds/pragmatic.safetensors
ADDED
@@ -0,0 +1,3 @@

version https://git-lfs.github.com/spec/v1
oid sha256:2ee15aa0255e256651c8d29ab1f2a03c61ced4103e2dff3193f139c44ab83f34
size 12598864