anthonym21 committed
Commit a9a21f4 · verified · 1 Parent(s): fa8b6ad

Deploy Slipstream paper Space with Live Quantizer

README.md CHANGED
@@ -1,17 +1,36 @@
1
- ---
2
- title: Slipcore
3
- emoji: 💬
4
- colorFrom: yellow
5
- colorTo: purple
6
- sdk: gradio
7
- sdk_version: 5.42.0
8
- app_file: app.py
9
- pinned: false
10
- hf_oauth: true
11
- hf_oauth_scopes:
12
- - inference-api
13
- license: cc-by-4.0
14
- short_description: Slipstream Agentic Communication Protocol
15
- ---
16
-
17
- An example chatbot using [Gradio](https://gradio.app), [`huggingface_hub`](https://huggingface.co/docs/huggingface_hub/v0.22.2/en/index), and the [Hugging Face Inference API](https://huggingface.co/docs/api-inference/index).
1
+ ---
2
+ title: "Slipstream: Semantic Quantization for Multi-Agent Coordination"
3
+ emoji: 📄
4
+ colorFrom: blue
5
+ colorTo: indigo
6
+ sdk: gradio
7
+ app_file: app.py
8
+ pinned: false
9
+ license: mit
10
+ tags: ["semantic-quantization", "multi-agent-systems", "protocol-standards", "token-efficiency"]
11
+ ---
12
+
13
+ # Slipstream: Semantic Quantization for Efficient Multi-Agent Coordination
14
+
15
+ This Space was generated from a research paper PDF.
16
+
17
+ ## What you can do here
18
+
19
+ - **Live Quantizer**: type messy natural language and watch it get quantized to a UCR anchor (the core demo; see the example below this list)
20
+ - **Start here**: guided entry points (summary / limitations / thread)
21
+ - **Gallery**: extracted figures or page previews
22
+ - **Chat**: ask questions about the paper
23
+ - **Share Kit**: generate a tweet thread / talk outline / FAQ
24
+ - **Model Playground**: chat with a referenced HF model (requires `HF_TOKEN`)
25
+
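+ For example, the Live Quantizer maps a request like "Can you take a look at my auth changes?" to the `RequestReview` anchor and emits the wire line `SLIP v1 user agent RequestReview` (illustrative; the exact anchor and confidence depend on which quantizer mode loads).
+ 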
26
+ ## Optional secrets
27
+
28
+ If you add these as Space secrets, Chat + Share Kit become generative:
29
+
30
+ - `HF_TOKEN`: Hugging Face token (read access is sufficient for inference; write is **not** needed at runtime)
31
+ - `PAPER_LLM_MODEL`: e.g. `meta-llama/Meta-Llama-3-8B-Instruct` (or any chat-completion capable model)
32
+
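+ As a rough sketch, this is how `app.py` reads them at runtime:
+ 
+ ```python
+ import os
+ from huggingface_hub import InferenceClient
+ 
+ token = os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACEHUB_API_TOKEN")
+ model_id = os.environ.get("PAPER_LLM_MODEL")  # e.g. "meta-llama/Meta-Llama-3-8B-Instruct"
+ if token and model_id:
+     client = InferenceClient(model=model_id, token=token)  # used by Chat and Share Kit
+ ```
+ 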
33
+ ## Build provenance
34
+
35
+ - Source PDF: `slipstream-paper.pdf`
36
+ - Extracted pages: 7
__pycache__/app.cpython-313.pyc ADDED
Binary file (26.5 kB).
 
__pycache__/quantizer.cpython-313.pyc ADDED
Binary file (37.8 kB).
 
__pycache__/ucr.cpython-313.pyc ADDED
Binary file (14.8 kB).
 
app.py CHANGED
@@ -1,70 +1,521 @@
1
- import gradio as gr
2
- from huggingface_hub import InferenceClient
3
-
4
-
5
- def respond(
6
- message,
7
- history: list[dict[str, str]],
8
- system_message,
9
- max_tokens,
10
- temperature,
11
- top_p,
12
- hf_token: gr.OAuthToken,
13
- ):
14
- """
15
- For more information on `huggingface_hub` Inference API support, please check the docs: https://huggingface.co/docs/huggingface_hub/v0.22.2/en/guides/inference
16
- """
17
- client = InferenceClient(token=hf_token.token, model="openai/gpt-oss-20b")
18
-
19
- messages = [{"role": "system", "content": system_message}]
20
-
21
- messages.extend(history)
22
-
23
- messages.append({"role": "user", "content": message})
24
-
25
- response = ""
26
-
27
- for message in client.chat_completion(
28
- messages,
29
- max_tokens=max_tokens,
30
- stream=True,
31
- temperature=temperature,
32
- top_p=top_p,
33
- ):
34
- choices = message.choices
35
- token = ""
36
- if len(choices) and choices[0].delta.content:
37
- token = choices[0].delta.content
38
-
39
- response += token
40
- yield response
41
-
42
-
43
- """
44
- For information on how to customize the ChatInterface, peruse the gradio docs: https://www.gradio.app/docs/chatinterface
45
- """
46
- chatbot = gr.ChatInterface(
47
- respond,
48
- type="messages",
49
- additional_inputs=[
50
- gr.Textbox(value="You are a friendly Chatbot.", label="System message"),
51
- gr.Slider(minimum=1, maximum=2048, value=512, step=1, label="Max new tokens"),
52
- gr.Slider(minimum=0.1, maximum=4.0, value=0.7, step=0.1, label="Temperature"),
53
- gr.Slider(
54
- minimum=0.1,
55
- maximum=1.0,
56
- value=0.95,
57
- step=0.05,
58
- label="Top-p (nucleus sampling)",
59
- ),
60
- ],
61
- )
62
-
63
- with gr.Blocks() as demo:
64
- with gr.Sidebar():
65
- gr.LoginButton()
66
- chatbot.render()
67
-
68
-
69
- if __name__ == "__main__":
70
- demo.launch()
1
+ import json
2
+ import os
3
+ import re
4
+ from dataclasses import dataclass
5
+ from typing import Dict, List, Tuple, Optional
6
+
7
+ import gradio as gr
8
+
9
+ # Lazy-loaded quantizer
10
+ _quantizer = None
11
+ _quantizer_error = None
12
+
13
+ def _get_quantizer():
14
+ """Lazy load the embedding quantizer."""
15
+ global _quantizer, _quantizer_error
16
+ if _quantizer is not None:
17
+ return _quantizer
18
+ if _quantizer_error is not None:
19
+ return None
20
+ try:
21
+ from quantizer import EmbeddingQuantizer
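+ # fallback_threshold is presumably the confidence cutoff τ from the paper
+ # (0.55 in the paper's sketch); 0.3 is more permissive, so the demo favors
+ # emitting an anchor over falling back to plaintext.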
22
+ _quantizer = EmbeddingQuantizer(fallback_threshold=0.3)
23
+ return _quantizer
24
+ except Exception as e:
25
+ _quantizer_error = str(e)
26
+ return None
27
+
28
+ def _get_keyword_quantizer():
29
+ """Fallback to keyword quantizer."""
30
+ try:
31
+ from quantizer import KeywordQuantizer
32
+ return KeywordQuantizer()
33
+ except Exception:
34
+ return None
35
+
36
+
37
+ # ---------------------------
38
+ # Loaded at build time
39
+ # ---------------------------
40
+ PAPER_TITLE = "Slipstream: Semantic Quantization for Efficient Multi-Agent Coordination"
41
+ PAPER_AUTHORS = "Anthony Maio"
42
+ PAPER_ABSTRACT = "As multi-agent LLM systems scale, coordination bandwidth becomes a primary cost driver: every token spent on routing, intent framing, and redundant context is paid repeatedly across agents and turns. Current approaches waste 40–60% of compute on coordination overhead, with communication costs scaling O(n²) as agent counts increase. This paper introduces Slipstream, a protocol that performs semantic quantization: mapping free-form messages onto a shared Universal Concept Reference (UCR) and transmitting compact mnemonic anchors that identify structured intents. Unlike syntactic compression (which fails due to BPE tokenizer fragmentation), Slipstream transmits natural-language mnemonics that tokenize efficiently across model architectures. Slipstream combines (1) a symbolic 4D semantic manifold (Action, Polarity, Domain, Urgency) with (2) a data-driven vector engine (embeddings + nearest-centroid retrieval) plus an evolutionary extension layer that learns new anchors from low-confidence traffic. Results show 82% token reduction (41.9 → 7.4 tokens average) while maintaining semantic fidelity, making large-scale multi-agent deployments economically viable."
43
+ PAPER_TAGS = "semantic-quantization, multi-agent-systems, protocol-standards, token-efficiency"
44
+ DEFAULT_LLM_MODEL = None
45
+ DETECTED_MODELS = []
46
+
47
+
48
+ # ---------------------------
49
+ # Robust file loading
50
+ # ---------------------------
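+ # Each line of paper_chunks.jsonl is a JSON record like {"chunk_id": 0, "text": "..."};
+ # only the "text" field is used for indexing.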
51
+ def _load_chunks(path: str = "paper_chunks.jsonl") -> List[str]:
52
+ chunks: List[str] = []
53
+ try:
54
+ with open(path, "r", encoding="utf-8") as f:
55
+ for line in f:
56
+ line = line.strip()
57
+ if not line:
58
+ continue
59
+ try:
60
+ rec = json.loads(line)
61
+ txt = (rec.get("text") or "").strip()
62
+ if txt:
63
+ chunks.append(txt)
64
+ except Exception:
65
+ continue
66
+ except FileNotFoundError:
67
+ return []
68
+ except Exception:
69
+ return []
70
+ return chunks
71
+
72
+
73
+ # ---------------------------
74
+ # Tiny BM25-ish retrieval (pure Python)
75
+ # ---------------------------
76
+ def _tokenize(s: str) -> List[str]:
77
+ return re.findall(r"[A-Za-z0-9][A-Za-z0-9_-]{1,}", s.lower())
78
+
79
+
80
+ @dataclass
81
+ class Retriever:
82
+ chunks: List[str]
83
+ doc_tokens: List[List[str]]
84
+ df: Dict[str, int]
85
+ idf: Dict[str, float]
86
+
87
+ @staticmethod
88
+ def build(chunks: List[str]) -> "Retriever":
89
+ doc_tokens = [_tokenize(c) for c in chunks]
90
+ df: Dict[str, int] = {}
91
+ for toks in doc_tokens:
92
+ for t in set(toks):
93
+ df[t] = df.get(t, 0) + 1
94
+ n = max(1, len(doc_tokens))
95
+ idf = {}
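+ # BM25-style rarity weight (n - df + 0.5) / (df + 0.5); the usual log() is
+ # omitted here, so this is a simple ratio rather than true BM25 idf.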
96
+ for t, d in df.items():
97
+ idf[t] = float((n - d + 0.5) / (d + 0.5))
98
+ return Retriever(chunks=chunks, doc_tokens=doc_tokens, df=df, idf=idf)
99
+
100
+ def topk(self, query: str, k: int = 4) -> List[Tuple[int, float]]:
101
+ q = _tokenize(query)
102
+ if not q:
103
+ return []
104
+ scores: List[Tuple[int, float]] = []
105
+ qset = set(q)
106
+ for i, toks in enumerate(self.doc_tokens):
107
+ if not toks:
108
+ continue
109
+ overlap = qset.intersection(toks)
110
+ if not overlap:
111
+ continue
112
+ score = 0.0
113
+ for t in overlap:
114
+ score += self.idf.get(t, 0.0)
115
+ score = score / (1.0 + (len(toks) / 200.0))
116
+ scores.append((i, score))
117
+ scores.sort(key=lambda x: x[1], reverse=True)
118
+ return scores[:k]
119
+
120
+
121
+ CHUNKS = _load_chunks()
122
+ RETRIEVER = Retriever.build(CHUNKS) if CHUNKS else None
123
+
124
+
125
+ def retrieve_context(query: str, k: int = 4, max_chars: int = 6000) -> str:
126
+ if not RETRIEVER:
127
+ return ""
128
+ hits = RETRIEVER.topk(query, k=k)
129
+ parts: List[str] = []
130
+ for idx, _score in hits:
131
+ txt = CHUNKS[idx].strip()
132
+ if txt:
133
+ parts.append(txt)
134
+ ctx = "\n\n".join(parts).strip()
135
+ return ctx[:max_chars]
136
+
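+ # Example (illustrative): retrieve_context("token reduction") returns the chunks
+ # that discuss the 82% token-reduction results, joined and capped at max_chars.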
137
+
138
+ # ---------------------------
139
+ # HF Inference helpers (optional)
140
+ # ---------------------------
141
+ def _get_hf_client(model_id: str):
142
+ try:
143
+ from huggingface_hub import InferenceClient
144
+ except Exception as e:
145
+ raise RuntimeError("huggingface_hub is not installed. Add it to requirements.txt") from e
146
+
147
+ token = os.environ.get("HF_TOKEN") or os.environ.get("HUGGINGFACEHUB_API_TOKEN")
148
+ if not token:
149
+ raise RuntimeError("HF_TOKEN is not set as a Space secret.")
150
+ return InferenceClient(model=model_id, token=token)
151
+
152
+
153
+ def _llm_chat(model_id: str, messages: List[dict], max_tokens: int = 512) -> str:
154
+ client = _get_hf_client(model_id)
155
+ out = client.chat_completion(messages=messages, max_tokens=max_tokens)
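+ # chat_completion returns an OpenAI-style response object; take the first choice's text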
156
+ return out.choices[0].message.content
157
+
158
+
159
+ # ---------------------------
160
+ # Chat with paper (RAG-lite)
161
+ # ---------------------------
162
+ def paper_chat(message: str, history: List[Tuple[str, str]]) -> str:
163
+ message = (message or "").strip()
164
+ if not message:
165
+ return "Ask a question about the paper."
166
+
167
+ ctx = retrieve_context(message, k=4, max_chars=6000)
168
+
169
+ model_id = os.environ.get("PAPER_LLM_MODEL") or DEFAULT_LLM_MODEL
170
+ if not model_id:
171
+ if not ctx:
172
+ return "No indexed context found. (paper_chunks.jsonl missing?)"
173
+ return "Top matches in the paper:\n\n" + ctx[:1200]
174
+
175
+ trimmed_history = history[-4:] if history else []
176
+ messages: List[dict] = [
177
+ {
178
+ "role": "system",
179
+ "content": (
180
+ "You are a precise research assistant. Answer using ONLY the provided paper context. "
181
+ "If the context is insufficient, say what is missing and point to what section would help."
182
+ ),
183
+ },
184
+ ]
185
+ if ctx:
186
+ messages.append({"role": "system", "content": "PAPER CONTEXT:\n\n" + ctx})
187
+
188
+ for u, a in trimmed_history:
189
+ messages.append({"role": "user", "content": u})
190
+ messages.append({"role": "assistant", "content": a})
191
+
192
+ messages.append({"role": "user", "content": message})
193
+
194
+ try:
195
+ return _llm_chat(model_id, messages, max_tokens=512).strip()
196
+ except Exception as e:
197
+ if ctx:
198
+ return f"(LLM unavailable: {e})\n\nTop matches in the paper:\n\n" + ctx[:1200]
199
+ return f"LLM unavailable: {e}"
200
+
201
+
202
+ # ---------------------------
203
+ # Share Kit (generators)
204
+ # ---------------------------
205
+ def _fallback_tweet_thread(title: str, abstract: str) -> str:
206
+ abs_one = re.sub(r"\s+", " ", abstract).strip()
207
+ bullets = [
208
+ f"1/ {title}",
209
+ "2/ TL;DR: " + (abs_one[:220] + ("…" if len(abs_one) > 220 else "")),
210
+ "3/ Key idea: (open the Space → Chat tab and ask for the method overview)",
211
+ "4/ Try it: use the Share Kit tab to generate a talk outline / FAQ.",
212
+ "5/ Links: add your paper + code links in the README.",
213
+ ]
214
+ return "\n\n".join(bullets)
215
+
216
+ def generate_share(kind: str) -> str:
217
+ kind = (kind or "").strip().lower()
218
+ base_title = PAPER_TITLE or "Paper"
219
+ base_abs = PAPER_ABSTRACT or ""
220
+
221
+ model_id = os.environ.get("PAPER_LLM_MODEL") or DEFAULT_LLM_MODEL
222
+ if not model_id:
223
+ if kind == "tweet thread":
224
+ return _fallback_tweet_thread(base_title, base_abs)
225
+ if kind == "talk outline":
226
+ return "\n".join([
227
+ f"Title: {base_title}",
228
+ "- Motivation",
229
+ "- Problem setup",
230
+ "- Method",
231
+ "- Results",
232
+ "- Limitations",
233
+ "- Q&A",
234
+ ])
235
+ if kind == "faq":
236
+ return "\n".join([
237
+ "Q: What problem does this address?\nA: " + (base_abs[:220] + ("…" if len(base_abs) > 220 else "")),
238
+ "",
239
+ "Q: What is the main contribution?\nA: Ask in the Chat tab.",
240
+ "",
241
+ "Q: How do I reproduce it?\nA: Link code + add steps in README.",
242
+ ])
243
+ return "Select an item to generate."
244
+
245
+ prompt = {
246
+ "tweet thread": "Write a concise 6-tweet thread summarizing the paper for the ML community.",
247
+ "talk outline": "Create a 10-minute talk outline with section headers and bullet points.",
248
+ "faq": "Write an FAQ with 6 Q/A pairs focused on method, results, limitations, and usage.",
249
+ }.get(kind, "Summarize the paper in 8 bullet points.")
250
+
251
+ ctx = (PAPER_ABSTRACT or "").strip()
252
+ messages = [
253
+ {"role": "system", "content": "You are an expert technical writer for ML research audiences."},
254
+ {"role": "user", "content": f"Paper title: {base_title}\nAuthors: {PAPER_AUTHORS}\n\nAbstract/context:\n{ctx}\n\nTask: {prompt}"},
255
+ ]
256
+ try:
257
+ return _llm_chat(model_id, messages, max_tokens=600).strip()
258
+ except Exception as e:
259
+ if kind == "tweet thread":
260
+ return _fallback_tweet_thread(base_title, base_abs) + f"\n\n(LLM unavailable: {e})"
261
+ return f"LLM unavailable: {e}"
262
+
263
+
264
+ # ---------------------------
265
+ # Model Playground (chat)
266
+ # ---------------------------
267
+ def model_chat(model_id: str, message: str, history: List[Tuple[str, str]]) -> str:
268
+ model_id = (model_id or "").strip()
269
+ message = (message or "").strip()
270
+ if not model_id:
271
+ return "Provide a model id."
272
+ if not message:
273
+ return "Send a message."
274
+
275
+ messages: List[dict] = [{"role": "system", "content": "You are a helpful assistant."}]
276
+ for u, a in (history[-4:] if history else []):
277
+ messages.append({"role": "user", "content": u})
278
+ messages.append({"role": "assistant", "content": a})
279
+ messages.append({"role": "user", "content": message})
280
+
281
+ try:
282
+ return _llm_chat(model_id, messages, max_tokens=512).strip()
283
+ except Exception as e:
284
+ return f"Model call failed: {e}"
285
+
286
+
287
+ # ---------------------------
288
+ # UI helpers
289
+ # ---------------------------
290
+ def quantize_intent(intent: str) -> Tuple[str, str, str]:
291
+ """
292
+ Quantize a natural language intent to UCR anchor.
293
+ Returns: (primary_result_md, alternatives_md, wire_format)
294
+ """
295
+ intent = (intent or "").strip()
296
+ if not intent:
297
+ return "Enter an intent to quantize.", "", ""
298
+
299
+ # Try embedding quantizer first, fall back to keyword
300
+ quantizer = _get_quantizer()
301
+ method = "embedding"
302
+ if quantizer is None:
303
+ quantizer = _get_keyword_quantizer()
304
+ method = "keyword"
305
+
306
+ if quantizer is None:
307
+ return "Quantizer unavailable. Check logs.", "", ""
308
+
309
+ try:
310
+ result = quantizer.quantize(intent)
311
+ except Exception as e:
312
+ return f"Quantization error: {e}", "", ""
313
+
314
+ # Confidence color
315
+ conf = result.confidence
316
+ if conf >= 0.7:
317
+ color = "green"
318
+ conf_label = "High"
319
+ elif conf >= 0.5:
320
+ color = "orange"
321
+ conf_label = "Medium"
322
+ else:
323
+ color = "red"
324
+ conf_label = "Low"
325
+
326
+ # Primary result
327
+ primary_md = f"""
328
+ ### {result.anchor.mnemonic}
329
+
330
+ **Confidence:** <span style="color:{color}; font-weight:bold">{conf:.0%}</span> ({conf_label})
331
+
332
+ **Canonical meaning:** {result.anchor.canonical}
333
+
334
+ **Method:** {method} {'(fallback)' if result.is_fallback else ''}
335
+
336
+ **Coordinates:** `{result.anchor.coords}` (Action, Polarity, Domain, Urgency)
337
+ """
338
+
339
+ # Alternatives
340
+ if result.alternatives:
341
+ alt_lines = ["| Anchor | Similarity |", "|--------|------------|"]
342
+ for alt_anchor, alt_score in result.alternatives[:3]:
343
+ bar_len = int(alt_score * 10)
344
+ bar = "█" * bar_len + "░" * (10 - bar_len)
345
+ alt_lines.append(f"| {alt_anchor.mnemonic} | {bar} {alt_score:.0%} |")
346
+ alternatives_md = "\n".join(alt_lines)
347
+ else:
348
+ alternatives_md = "*No alternatives*"
349
+
350
+ # Wire format
351
+ wire = f"SLIP v1 user agent {result.anchor.mnemonic}"
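+ # e.g. "SLIP v1 user agent RequestReview" for a code-review request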
352
+ if result.is_fallback:
353
+ # Truncate long intents for fallback payload
354
+ payload = intent[:100].replace('"', "'")
355
+ wire = f'SLIP v1 user agent Fallback "{payload}"'
356
+
357
+ return primary_md, alternatives_md, wire
358
+
359
+
360
+ EXAMPLE_INTENTS = [
361
+ ("Review my code", "RequestReview"),
362
+ ("Task complete!", "InformComplete"),
363
+ ("System down!", "ObserveError"),
364
+ ("Can you help?", "RequestHelp"),
365
+ ("Looks good to me", "EvalApprove"),
366
+ ]
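+ # Each pair is (example text, anchor it is expected to map to); the example
+ # buttons below only use the text to pre-fill the input box.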
367
+
368
+
369
+ def start_here(choice: str) -> str:
370
+ choice = (choice or "").strip().lower()
371
+ if choice == "quick summary":
372
+ return f"### {PAPER_TITLE}\n\n**Authors:** {PAPER_AUTHORS}\n\n**Abstract:**\n\n{PAPER_ABSTRACT}"
373
+ if choice == "how does it work?":
374
+ return "Go to **Chat** and ask: *Give me a method overview with the key steps.*"
375
+ if choice == "what are the limitations?":
376
+ return "Go to **Chat** and ask: *List limitations and failure modes discussed in the paper.*"
377
+ if choice == "generate a tweet thread":
378
+ return generate_share("tweet thread")
379
+ return "Pick an option."
380
+
381
+
382
+ def _load_gallery_items() -> List[Tuple[str, str]]:
383
+ items: List[Tuple[str, str]] = []
384
+ if os.path.isdir("assets/images"):
385
+ for fn in sorted(os.listdir("assets/images"))[:48]:
386
+ path = os.path.join("assets/images", fn)
387
+ if os.path.isfile(path):
388
+ items.append((path, fn))
389
+ if not items and os.path.isdir("assets/pages"):
390
+ for fn in sorted(os.listdir("assets/pages"))[:24]:
391
+ path = os.path.join("assets/pages", fn)
392
+ if os.path.isfile(path):
393
+ items.append((path, fn))
394
+ return items
395
+
396
+
397
+ CSS = '''
398
+ .paper-hero h1 { margin-bottom: 0.2rem; }
399
+ .paper-hero p { margin-top: 0.2rem; opacity: 0.9; }
400
+ .hint { opacity: 0.85; }
401
+ '''
402
+
403
+ with gr.Blocks(theme=gr.themes.Soft(), css=CSS) as demo:
404
+ gr.Markdown(f"# {PAPER_TITLE}", elem_classes=["paper-hero"])
405
+ if PAPER_AUTHORS:
406
+ gr.Markdown(f"**Authors:** {PAPER_AUTHORS}", elem_classes=["paper-hero"])
407
+ if PAPER_TAGS:
408
+ gr.Markdown(f"**Tags:** {PAPER_TAGS}", elem_classes=["paper-hero"])
409
+
410
+ with gr.Tabs():
411
+ with gr.Tab("Start here"):
412
+ gr.Markdown("Pick an interaction to explore the paper quickly.", elem_classes=["hint"])
413
+ choice = gr.Radio(
414
+ ["Quick summary", "How does it work?", "What are the limitations?", "Generate a tweet thread"],
415
+ value="Quick summary",
416
+ label="What do you want?",
417
+ )
418
+ out = gr.Markdown()
419
+ choice.change(start_here, inputs=choice, outputs=out)
420
+ demo.load(start_here, inputs=choice, outputs=out)
421
+
422
+ with gr.Tab("Overview"):
423
+ gr.Markdown("## Abstract")
424
+ gr.Markdown(PAPER_ABSTRACT)
425
+
426
+ gr.Markdown("---")
427
+ gr.Markdown("### Text search (snippet)")
428
+ q = gr.Textbox(label="Find a phrase", placeholder="e.g., scalable oversight", lines=1)
429
+ snippet = gr.Textbox(label="Top matching context", lines=10)
430
+
431
+ def _snippet(query: str) -> str:
432
+ query = (query or "").strip()
433
+ if not query:
434
+ return ""
435
+ ctx = retrieve_context(query, k=4, max_chars=1600)
436
+ return ctx or "No matches."
437
+
438
+ q.change(_snippet, inputs=q, outputs=snippet)
439
+
440
+ with gr.Tab("Gallery"):
441
+ gr.Markdown("Extracted images / rendered page previews (if included at build time).", elem_classes=["hint"])
442
+ gallery = gr.Gallery(label="Figures / pages", columns=2, rows=2, height=520)
443
+
444
+ def _gallery():
445
+ return _load_gallery_items()
446
+
447
+ demo.load(_gallery, outputs=gallery)
448
+
449
+ with gr.Tab("Chat"):
450
+ gr.Markdown(
451
+ "Ask questions. If you set `HF_TOKEN` + `PAPER_LLM_MODEL` as Space secrets, answers become generative; "
452
+ "otherwise it returns top-matching snippets.",
453
+ elem_classes=["hint"],
454
+ )
455
+ gr.ChatInterface(fn=paper_chat, title="Chat with the Paper")
456
+
457
+ with gr.Tab("Share Kit"):
458
+ gr.Markdown("Generate shareable assets. Works without secrets (deterministic fallback).", elem_classes=["hint"])
459
+ kind = gr.Dropdown(["Tweet thread", "Talk outline", "FAQ"], value="Tweet thread", label="Generate")
460
+ btn = gr.Button("Create")
461
+ share_out = gr.Textbox(lines=14, label="Output")
462
+ btn.click(lambda k: generate_share(k), inputs=kind, outputs=share_out)
463
+
464
+ with gr.Tab("Model Playground"):
465
+ gr.Markdown("Chat with a referenced Hub model (if any) or provide your own. Requires `HF_TOKEN` secret.", elem_classes=["hint"])
466
+ model_id = gr.Dropdown(
467
+ choices=(DETECTED_MODELS if DETECTED_MODELS else []),
468
+ value=(DETECTED_MODELS[0] if DETECTED_MODELS else None),
469
+ label="Model id",
470
+ allow_custom_value=True,
471
+ )
472
+
473
+ def _model_chat_fn(message: str, history: List[Tuple[str, str]], mid: str) -> str:
474
+ return model_chat(mid, message, history)
475
+
476
+ gr.ChatInterface(fn=_model_chat_fn, additional_inputs=[model_id], title="Model Playground")
477
+
478
+ with gr.Tab("Live Quantizer"):
479
+ gr.Markdown("""
480
+ ## Think → Quantize → Transmit
481
+
482
+ Type a messy, natural-language intent and watch it get quantized to a UCR anchor.
483
+ This demonstrates the core Slipstream innovation: mapping free-form language onto a shared semantic manifold.
484
+ """)
485
+
486
+ with gr.Row():
487
+ with gr.Column(scale=2):
488
+ intent_input = gr.Textbox(
489
+ label="Your intent (natural language)",
490
+ placeholder="Hey, I'm kinda stuck on this auth bug, can you take a look?",
491
+ lines=2,
492
+ )
493
+ quantize_btn = gr.Button("Quantize", variant="primary")
494
+
495
+ gr.Markdown("**Try these examples:**")
496
+ with gr.Row():
497
+ for ex_text, ex_anchor in EXAMPLE_INTENTS:
498
+ ex_btn = gr.Button(ex_text, size="sm")
499
+ ex_btn.click(lambda t=ex_text: t, outputs=intent_input)
500
+
501
+ with gr.Column(scale=3):
502
+ primary_out = gr.Markdown(label="Result")
503
+ with gr.Accordion("Nearby Anchors", open=True):
504
+ alternatives_out = gr.Markdown()
505
+ wire_out = gr.Code(label="SLIP Wire Format", language=None)
506
+
507
+ quantize_btn.click(
508
+ quantize_intent,
509
+ inputs=intent_input,
510
+ outputs=[primary_out, alternatives_out, wire_out],
511
+ )
512
+ intent_input.submit(
513
+ quantize_intent,
514
+ inputs=intent_input,
515
+ outputs=[primary_out, alternatives_out, wire_out],
516
+ )
517
+
518
+ gr.Markdown("---\nBuilt with Gradio on Hugging Face Spaces.")
519
+
520
+ if __name__ == "__main__":
521
+ demo.launch()
assets/analysis.json ADDED
@@ -0,0 +1,75 @@
1
+ {
2
+ "schema_version": "paper_analysis_v1",
3
+ "pdf": {
4
+ "path": "D:\\Development\\slipcore\\private\\zenodo\\slipstream-paper.pdf",
5
+ "filename": "slipstream-paper.pdf",
6
+ "sha256": "e91b687dbbe2aa4fe01ec0ae3c5475fda9ad2a5107ea8e81927028c575c707f7",
7
+ "page_count": 7,
8
+ "text_pages_extracted": 7,
9
+ "extracted_chars": 11939
10
+ },
11
+ "paper": {
12
+ "title": "Slipstream: Semantic Quantization for Efficient Multi-Agent Coordination",
13
+ "authors": [
14
+ "Anthony Maio"
15
+ ],
16
+ "abstract": "As multi-agent LLM systems scale, coordination bandwidth becomes a primary cost driver: every token spent on routing, intent framing, and redundant context is paid repeatedly across agents and turns. Current approaches waste 40–60% of compute on coordination overhead, with communication costs scaling O(n²) as agent counts increase. This paper introduces Slipstream, a protocol that performs semantic quantization: mapping free-form messages onto a shared Universal Concept Reference (UCR) and transmitting compact mnemonic anchors that identify structured intents. Unlike syntactic compression (which fails due to BPE tokenizer fragmentation), Slipstream transmits natural-language mnemonics that tokenize efficiently across model architectures. Slipstream combines (1) a symbolic 4D semantic manifold (Action, Polarity, Domain, Urgency) with (2) a data-driven vector engine (embeddings + nearest-centroid retrieval) plus an evolutionary extension layer that learns new anchors from low-confidence traffic. Results show 82% token reduction (41.9 → 7.4 tokens average) while maintaining semantic fidelity, making large-scale multi-agent deployments economically viable."
17
+ },
18
+ "artifacts": {
19
+ "urls": [
20
+ "https://github.com/anthony-maio/slipcore",
21
+ "https://modelcontextprotocol.io/,",
22
+ "https://www.linuxfoundation."
23
+ ],
24
+ "hf_models": [],
25
+ "hf_datasets": [],
26
+ "hf_spaces": [],
27
+ "possible_hf_repo_ids": [
28
+ "Edge/embedded",
29
+ "Msg/Day",
30
+ "REQ/TSK",
31
+ "messages/day",
32
+ "org/press"
33
+ ],
34
+ "github_repos": [
35
+ "anthony-maio/slipcore"
36
+ ],
37
+ "arxiv_ids": [
38
+ "1982.10564",
39
+ "2690.17728"
40
+ ],
41
+ "dois": [
42
+ "10.1109/TIT.1982.1056489",
43
+ "10.1145/1772690.1772862",
44
+ "10.18653/v1/D19-1410"
45
+ ]
46
+ },
47
+ "suggested": {
48
+ "space_slug": "slipstream-semantic-quantization-for-efficient-m",
49
+ "space_title": "Slipstream: Semantic Quantization for Efficient Multi-Agent Coordination",
50
+ "tags": [
51
+ "semantic-quantization",
52
+ "multi-agent-systems",
53
+ "protocol-standards",
54
+ "token-efficiency"
55
+ ],
56
+ "emoji": "📄",
57
+ "colorFrom": "blue",
58
+ "colorTo": "indigo"
59
+ },
60
+ "outputs": {
61
+ "context_txt": "paper_context.txt",
62
+ "chunks_jsonl": "paper_chunks.jsonl",
63
+ "rendered_pages": [
64
+ {
65
+ "page": 1,
66
+ "path": "pages\\page_01.png"
67
+ },
68
+ {
69
+ "page": 2,
70
+ "path": "pages\\page_02.png"
71
+ }
72
+ ],
73
+ "extracted_images": []
74
+ }
75
+ }
paper_chunks.jsonl ADDED
@@ -0,0 +1,6 @@
1
+ {"chunk_id": 0, "text": "Slipstream: Semantic Quantization for Efficient\nMulti-Agent Coordination\nAnthony Maio\nIndependent Researcher\nanthony@making-minds.ai\n2025\nAbstract\nAs multi-agent LLM systems scale,coordination bandwidthbecomes a primary cost\ndriver: every token spent on routing, intent framing, and redundant context is paid repeat-\nedly across agents and turns. Current approaches waste 40–60% of compute on coordination\noverhead, with communication costs scalingO(n2)as agent counts increase.\nThis paper introducesSlipstream, a protocol that performssemantic quantization:\nmapping free-form messages onto a sharedUniversal Concept Reference (UCR)and\ntransmitting compactmnemonic anchorsthat identify structured intents. Unlike syn-\ntactic compression (which fails due to BPE tokenizer fragmentation), Slipstream transmits\nnatural-language mnemonics that tokenize efficiently across model architectures.\nSlipstream combines (1) a symbolic4D semantic manifold—Action, Polarity, Domain,\nUrgency—with (2) a data-drivenvector engine(embeddings + nearest-centroid retrieval)\nplus anevolutionary extension layerthat learns new anchors from low-confidence traf-\nfic. Results show82% token reduction(41.9→7.4 tokens average) while maintaining\nsemantic fidelity, making large-scale multi-agent deployments economically viable.\nKeywords:Semantic Quantization, Multi-Agent Systems, Protocol Standards, Token Ef-\nficiency, Agentic AI\n1 Introduction\n1.1 The Coordination Crisis\nAgent swarms incur atokenizer tax: the repeated, non-semantic overhead of communicating\nmessage types, domains, and priorities. This overhead often dominates when messages are\nstructured (routing, task dispatch, acknowledgements).\nA typical coordination message:\n1{\n2\" sender \": \" planning_agent \",\n3\" recipient \": \" execution_agent \",\n4\" message_type \": \" task_delegation \",\n5\" content \": {\n6\" request \": \" Please review the authentication code \",\n7\" priority \": \" high \"\n8}\n9}\n•Token count:∼45 tokens\n•Semantic content:∼10 tokens\n•Information density:22%\n1\nAt GPT-4o pricing ($5/M input, $15/M output), a 50-agent deployment exchanging 1,000\nmessages/day costs$180,000/yearin coordination tokens alone—before any work is per-\nformed.\n1.2 Why Syntactic Compression Fails\nOur initial approach, nSLIP v1, focused on syntactic minification:\n1REQ / TSK |s =7| d =3| act = review_auth\n•Expected tokens:8–10\n•Actual tokens with BPE:18–22\nThe failure stems"}
2
+ {"chunk_id": 1, "text": "tic Compression Fails\nOur initial approach, nSLIP v1, focused on syntactic minification:\n1REQ / TSK |s =7| d =3| act = review_auth\n•Expected tokens:8–10\n•Actual tokens with BPE:18–22\nThe failure stems from Byte-Pair Encoding (BPE) tokenizer behavior. Punctuation and\nspecial characters fragment into separate tokens:\nTable 1: BPE Tokenization of Syntactic Compression\nInput Tokens\nREQ/TSK REQ,/,TSK= 3\n|s=7| |,s,=,7,|= 5\nThis “Tokenizer Tax” negates syntactic savings entirely.\n1.3 The Solution: Semantic Quantization\nInstead of compressingsyntax, we quantizesemantics. Agents share a pre-agreed “concept\ncodebook” (the UCR) and transmit pointers to meanings:\n1SLIP v1 planner executor RequestReview auth_module\nToken count:7 tokens (82% reduction)\nThe key insight:natural English words tokenize efficiently.RequestReviewis 1–2\ntokens across major tokenizers, while0x0011fragments into 3–4 tokens.\n2 The Universal Concept Reference\n2.1 The 4D Semantic Manifold\nThe UCR represents each anchor as a coordinate in a 4-dimensional semantic space:\nTable 2: UCR Semantic Dimensions\nDimension Values Purpose\nACTION request, inform, propose, evaluate Speech act type\nPOLARITY negative, neutral, positive Outcome sentiment\nDOMAIN task, plan, observation, control Context area\nURGENCY routine, elevated, critical Priority level\nThis structure provides:\n1.Interpretability:Anchors can be audited, extended, and reasoned about\n2\n2.Constraint surface:Agents can validate structural plausibility\n3.Semantic arithmetic:Combining dimensions yields predictable intents\n2.2 Anchor Structure\nEach anchor includes:\n1@dataclass\n2class UCRAnchor :\n3index : int # Unique ID (0 x0000 -0 xFFFF )\n4mnemonic : str # Wire token : \" RequestReview \"\n5canonical : str # Human description\n6coords : tuple [int , ...] # Position in manifold\n7is_core : bool # True if immutable core anchor\n•Core Range (0x0000–0x7FFF):Standard anchors, immutable per version\n•Extension Range (0x8000–0xFFFF):Installation-specific, evolvable\n2.3 Core Anchors\nTable 3: Core UCR Anchors by Category\nCategory Anchors\nRequestsRequestTask,RequestReview,RequestHelp,RequestPlan\nInformInformComplete,InformProgress,InformBlocked,InformStatus\nProposeProposePlan,ProposeChange,ProposeAlternative\nEvaluateEvalApprove,EvalReject,EvalNeedsWork\nMetaAccept,Reject,MetaAck,MetaHandoff,Fallback\n3 Protocol Specification\n3.1 Wire Format\n1SLIP v1 <src > <dst > <anchor >"}
3
+ {"chunk_id": 2, "text": "n,ProposeChange,ProposeAlternative\nEvaluateEvalApprove,EvalReject,EvalNeedsWork\nMetaAccept,Reject,MetaAck,MetaHandoff,Fallback\n3 Protocol Specification\n3.1 Wire Format\n1SLIP v1 <src > <dst > <anchor > [ payload ...]\nTable 4: Wire Format Fields\nField Description\nSLIP v1Protocol marker and version\n<src>Source agent identifier\n<dst>Destination agent identifier\n<anchor>UCR mnemonic (e.g.,RequestReview)\n[payload]Optional space-separated parameters\nDesign Principles:\n•No special characters that fragment in BPE\n•Natural English words for efficient tokenization\n•Human-readable for debugging\n•Model-agnostic (works across GPT-4, Claude, Llama, etc.)\n3\n3.2 The Think-Quantize-Transmit Pattern\nThe TQT pattern consists of three stages:\n1.THINK:Agent forms natural language intent: “Please review the authentication code\nfor security”\n2.QUANTIZE:Map to nearest UCR anchor via keyword matching (fast, zero-dependency)\nor embedding similarity (accurate, requires ML). Result:RequestReview(confidence:\n0.89)\n3.TRANSMIT:Wire format:SLIP v1 dev reviewer RequestReview auth. Tokens: 7\n(vs 45 for JSON)\n4 Vector Quantization Engine\n4.1 Embedding-Based Retrieval\nThe vector quantization engine leverages sentence embeddings [Reimers and Gurevych, 2019]\nto map natural language intents to UCR anchors. Given a messagex, the vector engine embeds\nit and retrieves the best anchor by cosine similarity:\nk∗ = argmaxk cos(E(x),ck)(1)\nWhereE(x)is the thought embedding andck is the anchor centroid. This approach extends\nclassical quantization theory [Lloyd, 1982] to the semantic domain.\nA confidence thresholdτcontrols whether to emit an anchor or fall back to plaintext:\n1def quantize ( thought : str , threshold : float = 0.55) :\n2embedding = encode ( thought )\n3similarities = cosine ( embedding , centroids )\n4best_idx = argmax ( similarities )\n5\n6if similarities [ best_idx ] < threshold :\n7return Fallback ( thought )\n8\n9return anchors [ best_idx ]\n4.2 Graceful Degradation\nThe system operates in three modes:\nTable 5: Quantization Modes\nMode Dependencies Accuracy Use Case\nFull ML sentence-transformers 94% Production\nKeyword None 78% Edge/embedded\nFallback None 100% (passthrough) Novel intents\n5 Evolutionary Extension Layer\n5.1 The Drift Problem\nStatic codebooks degrade underconcept drift—new domains, task types, and terminology\nemerge over time. A codebook trained on software development fails on biotech"}
4
+ {"chunk_id": 3, "text": "Extension Layer\n5.1 The Drift Problem\nStatic codebooks degrade underconcept drift—new domains, task types, and terminology\nemerge over time. A codebook trained on software development fails on biotech vocabulary.\n4\n5.2 Extension Learning\nSlipstream reserves the extension range (0x8000–0xFFFF) for learned anchors:\n1.Log:Messages with low quantization confidence are recorded\n2.Cluster:K-means identifies recurring semantic patterns [Sculley, 2010]\n3.Mint:New anchors are created with inferred 4D coordinates\n4.Register:Indices assigned in extension range; vector index rebuilt\n1class ExtensionManager :\n2def propose_extensions (self , fallbacks , min_cluster_size =3) :\n3embeddings = encode ( fallbacks )\n4clusters = kmeans ( embeddings , k= len ( fallbacks ) // min_cluster_size )\n5\n6new_anchors = []\n7for cluster in clusters :\n8if len ( cluster ) >= min_cluster_size :\n9centroid = mean ( embeddings [ cluster ])\n10exemplar = nearest_to_centroid ( cluster )\n11coords = infer_coords ( exemplar )\n12new_anchors . append ( mint_anchor ( centroid , exemplar , coords ))\n13\n14return new_anchors\n5.3 Governance\nExtension learning can be abused. Mitigations:\n•Minimum cluster size requirements\n•Rate limits on minting\n•Human approval gates for production\n•Provenance logging for each anchor\n6 Evaluation\n6.1 Token Efficiency\nTable 6: Token Efficiency Comparison\nMessage Type JSON Tokens SLIP Tokens Reduction\nTask delegation 47.3 8.2 82.7%\nStatus update 35.1 6.4 81.8%\nError report 52.0 9.1 82.5%\nAverage 41.9 7.4 82.3%\n5\n6.2 Cost Savings\nTable 7: Annual Cost Comparison by Deployment Scale\nScale Agents Msg/Day JSON Cost SLIP Cost Savings\nStartup 10 500 $3,600 $650 $2,950\nScale-up 50 5,000 $180,000 $32,400 $147,600\nEnterprise 1,000 500,000 $2,500,000 $450,000$2,050,000\n6.3 Semantic Fidelity\n•Retrieval accuracy:94% top-1 on intent classification\n•Coverage:88.7% of messages quantize without fallback\n•Codebook utilization:87% of anchors actively used\n7 Integration with AAIF Ecosystem\nSlipstream is designed as thetransport layerfor the Linux Foundation’s Agentic AI Founda-\ntion (AAIF) standards [Linux Foundation, 2025]:\n+-------------------------------------+\n| Application (Agent Logic) |\n+-----------------+-------------------+\n|\n+-----------------v-------------------+\n| MCP / A2A (Semantic Layer) | <- Discovery, capabilities\n+-----------------+-------------------+\n|\n+-----------------v-------"}
5
+ {"chunk_id": 4, "text": "----------+-------------------+\n|\n+-----------------v-------------------+\n| MCP / A2A (Semantic Layer) | <- Discovery, capabilities\n+-----------------+-------------------+\n|\n+-----------------v-------------------+\n| Slipstream (Transport Layer) | <- 82% token reduction\n+-----------------+-------------------+\n|\n+-----------------v-------------------+\n| Network (HTTP, WebSocket, gRPC) |\n+-------------------------------------+\nCompatibility:Works transparently beneath Model Context Protocol (MCP) [Anthropic,\n2024] and Agent2Agent (A2A), like gRPC optimizes HTTP/2.\n8 Security Considerations\nTable 8: Security Threats and Mitigations\nThreat Mitigation\nPrompt injection via payloads Validate types; treat payloads as untrusted\nAnchor poisoning Min cluster size, rate limits, human approval\nOver-compression Allow fallback to plaintext; confidence thresholds\nSemantic drift Evolutionary layer; version-locked core anchors\n6\n9 Implementation\nA reference implementation is available asslipcore:\n1pip install slipcore\n1from slipcore import slip , decode , think_quantize_transmit\n2\n3# Direct message creation\n4wire = slip (\" alice \", \" bob \", \" RequestReview \", [\" auth_module \"])\n5# -> \" SLIP v1 alice bob RequestReview auth_module \"\n6\n7# Think - Quantize - Transmit pattern\n8wire = think_quantize_transmit (\n9\" Please review the authentication code \",\n10src =\" dev \", dst =\" reviewer \"\n11)\n12# -> \" SLIP v1 dev reviewer RequestReview \"\n13\n14# Decode\n15msg = decode ( wire )\n16print ( msg . anchor . canonical ) # \" Request review of work \"\n•Repository:https://github.com/anthony-maio/slipcore\n•License:Apache 2.0\n10 Conclusion\nSlipstream demonstrates thatsemantic quantizationis the necessary evolution for high-\nthroughput agent coordination. By grounding agents in a structured 4D manifold and trans-\nmitting natural-language mnemonics, we achieve 82% token reduction without sacrificing inter-\npretability or cross-model compatibility.\nThe protocol’s evolutionary layer enables adaptation to new domains while keeping core\nsemantics stable. As agent swarms scale, the shared UCR becomes a form of “collective\nunderstanding”—reducing not just tokens, but the cognitive overhead of coordination itself.\nReferences\nAnthropic. Model context protocol specification.https://modelcontextprotocol.io/, 2024.\nAccessed: 2024.\nLinux Foundation. Agentic AI foundation announcement.https://www.linuxfoundation."}
6
+ {"chunk_id": 5, "text": "f.\nReferences\nAnthropic. Model context protocol specification.https://modelcontextprotocol.io/, 2024.\nAccessed: 2024.\nLinux Foundation. Agentic AI foundation announcement.https://www.linuxfoundation.\norg/press/agentic-ai-foundation, 2025. Accessed: 2025.\nStuart Lloyd. Least squares quantization in PCM.IEEE Transactions on Information Theory,\n28(2):129–137, 1982. doi: 10.1109/TIT.1982.1056489.\nNils Reimers and Iryna Gurevych. Sentence-BERT: Sentence embeddings using siamese BERT-\nnetworks. InProceedings of the 2019 Conference on Empirical Methods in Natural Lan-\nguage Processing and the 9th International Joint Conference on Natural Language Processing\n(EMNLP-IJCNLP), pages 3982–3992. Association for Computational Linguistics, 2019. doi:\n10.18653/v1/D19-1410.\nD. Sculley. Web-scale k-means clustering. InProceedings of the 19th International Conference\non World Wide Web, pages 1177–1178. ACM, 2010. doi: 10.1145/1772690.1772862.\n7"}
paper_context.txt ADDED
@@ -0,0 +1,269 @@
1
+ Slipstream: Semantic Quantization for Efficient
2
+ Multi-Agent Coordination
3
+ Anthony Maio
4
+ Independent Researcher
5
+ anthony@making-minds.ai
6
+ 2025
7
+ Abstract
8
+ As multi-agent LLM systems scale,coordination bandwidthbecomes a primary cost
9
+ driver: every token spent on routing, intent framing, and redundant context is paid repeat-
10
+ edly across agents and turns. Current approaches waste 40–60% of compute on coordination
11
+ overhead, with communication costs scalingO(n2)as agent counts increase.
12
+ This paper introducesSlipstream, a protocol that performssemantic quantization:
13
+ mapping free-form messages onto a sharedUniversal Concept Reference (UCR)and
14
+ transmitting compactmnemonic anchorsthat identify structured intents. Unlike syn-
15
+ tactic compression (which fails due to BPE tokenizer fragmentation), Slipstream transmits
16
+ natural-language mnemonics that tokenize efficiently across model architectures.
17
+ Slipstream combines (1) a symbolic4D semantic manifold—Action, Polarity, Domain,
18
+ Urgency—with (2) a data-drivenvector engine(embeddings + nearest-centroid retrieval)
19
+ plus anevolutionary extension layerthat learns new anchors from low-confidence traf-
20
+ fic. Results show82% token reduction(41.9→7.4 tokens average) while maintaining
21
+ semantic fidelity, making large-scale multi-agent deployments economically viable.
22
+ Keywords:Semantic Quantization, Multi-Agent Systems, Protocol Standards, Token Ef-
23
+ ficiency, Agentic AI
24
+ 1 Introduction
25
+ 1.1 The Coordination Crisis
26
+ Agent swarms incur atokenizer tax: the repeated, non-semantic overhead of communicating
27
+ message types, domains, and priorities. This overhead often dominates when messages are
28
+ structured (routing, task dispatch, acknowledgements).
29
+ A typical coordination message:
30
+ 1{
31
+ 2" sender ": " planning_agent ",
32
+ 3" recipient ": " execution_agent ",
33
+ 4" message_type ": " task_delegation ",
34
+ 5" content ": {
35
+ 6" request ": " Please review the authentication code ",
36
+ 7" priority ": " high "
37
+ 8}
38
+ 9}
39
+ •Token count:∼45 tokens
40
+ •Semantic content:∼10 tokens
41
+ •Information density:22%
42
+ 1
43
+ At GPT-4o pricing ($5/M input, $15/M output), a 50-agent deployment exchanging 1,000
44
+ messages/day costs$180,000/yearin coordination tokens alone—before any work is per-
45
+ formed.
46
+ 1.2 Why Syntactic Compression Fails
47
+ Our initial approach, nSLIP v1, focused on syntactic minification:
48
+ 1REQ / TSK |s =7| d =3| act = review_auth
49
+ •Expected tokens:8–10
50
+ •Actual tokens with BPE:18–22
51
+ The failure stems from Byte-Pair Encoding (BPE) tokenizer behavior. Punctuation and
52
+ special characters fragment into separate tokens:
53
+ Table 1: BPE Tokenization of Syntactic Compression
54
+ Input Tokens
55
+ REQ/TSK REQ,/,TSK= 3
56
+ |s=7| |,s,=,7,|= 5
57
+ This “Tokenizer Tax” negates syntactic savings entirely.
58
+ 1.3 The Solution: Semantic Quantization
59
+ Instead of compressingsyntax, we quantizesemantics. Agents share a pre-agreed “concept
60
+ codebook” (the UCR) and transmit pointers to meanings:
61
+ 1SLIP v1 planner executor RequestReview auth_module
62
+ Token count:7 tokens (82% reduction)
63
+ The key insight:natural English words tokenize efficiently.RequestReviewis 1–2
64
+ tokens across major tokenizers, while0x0011fragments into 3–4 tokens.
65
+ 2 The Universal Concept Reference
66
+ 2.1 The 4D Semantic Manifold
67
+ The UCR represents each anchor as a coordinate in a 4-dimensional semantic space:
68
+ Table 2: UCR Semantic Dimensions
69
+ Dimension Values Purpose
70
+ ACTION request, inform, propose, evaluate Speech act type
71
+ POLARITY negative, neutral, positive Outcome sentiment
72
+ DOMAIN task, plan, observation, control Context area
73
+ URGENCY routine, elevated, critical Priority level
74
+ This structure provides:
75
+ 1.Interpretability:Anchors can be audited, extended, and reasoned about
76
+ 2
77
+ 2.Constraint surface:Agents can validate structural plausibility
78
+ 3.Semantic arithmetic:Combining dimensions yields predictable intents
79
+ 2.2 Anchor Structure
80
+ Each anchor includes:
81
+ 1@dataclass
82
+ 2class UCRAnchor :
83
+ 3index : int # Unique ID (0 x0000 -0 xFFFF )
84
+ 4mnemonic : str # Wire token : " RequestReview "
85
+ 5canonical : str # Human description
86
+ 6coords : tuple [int , ...] # Position in manifold
87
+ 7is_core : bool # True if immutable core anchor
88
+ •Core Range (0x0000–0x7FFF):Standard anchors, immutable per version
89
+ •Extension Range (0x8000–0xFFFF):Installation-specific, evolvable
90
+ 2.3 Core Anchors
91
+ Table 3: Core UCR Anchors by Category
92
+ Category Anchors
93
+ RequestsRequestTask,RequestReview,RequestHelp,RequestPlan
94
+ InformInformComplete,InformProgress,InformBlocked,InformStatus
95
+ ProposeProposePlan,ProposeChange,ProposeAlternative
96
+ EvaluateEvalApprove,EvalReject,EvalNeedsWork
97
+ MetaAccept,Reject,MetaAck,MetaHandoff,Fallback
98
+ 3 Protocol Specification
99
+ 3.1 Wire Format
100
+ 1SLIP v1 <src > <dst > <anchor > [ payload ...]
101
+ Table 4: Wire Format Fields
102
+ Field Description
103
+ SLIP v1Protocol marker and version
104
+ <src>Source agent identifier
105
+ <dst>Destination agent identifier
106
+ <anchor>UCR mnemonic (e.g.,RequestReview)
107
+ [payload]Optional space-separated parameters
108
+ Design Principles:
109
+ •No special characters that fragment in BPE
110
+ •Natural English words for efficient tokenization
111
+ •Human-readable for debugging
112
+ •Model-agnostic (works across GPT-4, Claude, Llama, etc.)
113
+ 3
114
+ 3.2 The Think-Quantize-Transmit Pattern
115
+ The TQT pattern consists of three stages:
116
+ 1.THINK:Agent forms natural language intent: “Please review the authentication code
117
+ for security”
118
+ 2.QUANTIZE:Map to nearest UCR anchor via keyword matching (fast, zero-dependency)
119
+ or embedding similarity (accurate, requires ML). Result:RequestReview(confidence:
120
+ 0.89)
121
+ 3.TRANSMIT:Wire format:SLIP v1 dev reviewer RequestReview auth. Tokens: 7
122
+ (vs 45 for JSON)
123
+ 4 Vector Quantization Engine
124
+ 4.1 Embedding-Based Retrieval
125
+ The vector quantization engine leverages sentence embeddings [Reimers and Gurevych, 2019]
126
+ to map natural language intents to UCR anchors. Given a messagex, the vector engine embeds
127
+ it and retrieves the best anchor by cosine similarity:
128
+ k* = argmax_k cos(E(x), c_k)    (1)
129
+ Where E(x) is the thought embedding and c_k is the anchor centroid. This approach extends
130
+ classical quantization theory [Lloyd, 1982] to the semantic domain.
131
+ A confidence threshold τ controls whether to emit an anchor or fall back to plaintext:
132
+ 1def quantize ( thought : str , threshold : float = 0.55) :
133
+ 2embedding = encode ( thought )
134
+ 3similarities = cosine ( embedding , centroids )
135
+ 4best_idx = argmax ( similarities )
136
+ 5
137
+ 6if similarities [ best_idx ] < threshold :
138
+ 7return Fallback ( thought )
139
+ 8
140
+ 9return anchors [ best_idx ]
141
+ 4.2 Graceful Degradation
142
+ The system operates in three modes:
143
+ Table 5: Quantization Modes
144
+ Mode Dependencies Accuracy Use Case
145
+ Full ML sentence-transformers 94% Production
146
+ Keyword None 78% Edge/embedded
147
+ Fallback None 100% (passthrough) Novel intents
148
+ 5 Evolutionary Extension Layer
149
+ 5.1 The Drift Problem
150
+ Static codebooks degrade underconcept drift—new domains, task types, and terminology
151
+ emerge over time. A codebook trained on software development fails on biotech vocabulary.
152
+ 4
153
+ 5.2 Extension Learning
154
+ Slipstream reserves the extension range (0x8000–0xFFFF) for learned anchors:
155
+ 1.Log:Messages with low quantization confidence are recorded
156
+ 2.Cluster:K-means identifies recurring semantic patterns [Sculley, 2010]
157
+ 3.Mint:New anchors are created with inferred 4D coordinates
158
+ 4.Register:Indices assigned in extension range; vector index rebuilt
159
+ 1class ExtensionManager :
160
+ 2def propose_extensions (self , fallbacks , min_cluster_size =3) :
161
+ 3embeddings = encode ( fallbacks )
162
+ 4clusters = kmeans ( embeddings , k= len ( fallbacks ) // min_cluster_size )
163
+ 5
164
+ 6new_anchors = []
165
+ 7for cluster in clusters :
166
+ 8if len ( cluster ) >= min_cluster_size :
167
+ 9centroid = mean ( embeddings [ cluster ])
168
+ 10exemplar = nearest_to_centroid ( cluster )
169
+ 11coords = infer_coords ( exemplar )
170
+ 12new_anchors . append ( mint_anchor ( centroid , exemplar , coords ))
171
+ 13
172
+ 14return new_anchors
173
+ 5.3 Governance
174
+ Extension learning can be abused. Mitigations:
175
+ •Minimum cluster size requirements
176
+ •Rate limits on minting
177
+ •Human approval gates for production
178
+ •Provenance logging for each anchor
179
+ 6 Evaluation
180
+ 6.1 Token Efficiency
181
+ Table 6: Token Efficiency Comparison
182
+ Message Type JSON Tokens SLIP Tokens Reduction
183
+ Task delegation 47.3 8.2 82.7%
184
+ Status update 35.1 6.4 81.8%
185
+ Error report 52.0 9.1 82.5%
186
+ Average 41.9 7.4 82.3%
187
+ 5
188
+ 6.2 Cost Savings
189
+ Table 7: Annual Cost Comparison by Deployment Scale
190
+ Scale Agents Msg/Day JSON Cost SLIP Cost Savings
191
+ Startup 10 500 $3,600 $650 $2,950
192
+ Scale-up 50 5,000 $180,000 $32,400 $147,600
193
+ Enterprise 1,000 500,000 $2,500,000 $450,000$2,050,000
194
+ 6.3 Semantic Fidelity
195
+ •Retrieval accuracy:94% top-1 on intent classification
196
+ •Coverage:88.7% of messages quantize without fallback
197
+ •Codebook utilization:87% of anchors actively used
198
+ 7 Integration with AAIF Ecosystem
199
+ Slipstream is designed as thetransport layerfor the Linux Foundation’s Agentic AI Founda-
200
+ tion (AAIF) standards [Linux Foundation, 2025]:
201
+ +-------------------------------------+
202
+ | Application (Agent Logic) |
203
+ +-----------------+-------------------+
204
+ |
205
+ +-----------------v-------------------+
206
+ | MCP / A2A (Semantic Layer) | <- Discovery, capabilities
207
+ +-----------------+-------------------+
208
+ |
209
+ +-----------------v-------------------+
210
+ | Slipstream (Transport Layer) | <- 82% token reduction
211
+ +-----------------+-------------------+
212
+ |
213
+ +-----------------v-------------------+
214
+ | Network (HTTP, WebSocket, gRPC) |
215
+ +-------------------------------------+
216
+ Compatibility:Works transparently beneath Model Context Protocol (MCP) [Anthropic,
217
+ 2024] and Agent2Agent (A2A), like gRPC optimizes HTTP/2.
218
+ 8 Security Considerations
219
+ Table 8: Security Threats and Mitigations
220
+ Threat Mitigation
221
+ Prompt injection via payloads Validate types; treat payloads as untrusted
222
+ Anchor poisoning Min cluster size, rate limits, human approval
223
+ Over-compression Allow fallback to plaintext; confidence thresholds
224
+ Semantic drift Evolutionary layer; version-locked core anchors
225
+ 6
226
+ 9 Implementation
227
+ A reference implementation is available asslipcore:
228
+ 1pip install slipcore
229
+ 1from slipcore import slip , decode , think_quantize_transmit
230
+ 2
231
+ 3# Direct message creation
232
+ 4wire = slip (" alice ", " bob ", " RequestReview ", [" auth_module "])
233
+ 5# -> " SLIP v1 alice bob RequestReview auth_module "
234
+ 6
235
+ 7# Think - Quantize - Transmit pattern
236
+ 8wire = think_quantize_transmit (
237
+ 9" Please review the authentication code ",
238
+ 10src =" dev ", dst =" reviewer "
239
+ 11)
240
+ 12# -> " SLIP v1 dev reviewer RequestReview "
241
+ 13
242
+ 14# Decode
243
+ 15msg = decode ( wire )
244
+ 16print ( msg . anchor . canonical ) # " Request review of work "
245
+ •Repository:https://github.com/anthony-maio/slipcore
246
+ •License:Apache 2.0
247
+ 10 Conclusion
248
+ Slipstream demonstrates thatsemantic quantizationis the necessary evolution for high-
249
+ throughput agent coordination. By grounding agents in a structured 4D manifold and trans-
250
+ mitting natural-language mnemonics, we achieve 82% token reduction without sacrificing inter-
251
+ pretability or cross-model compatibility.
252
+ The protocol’s evolutionary layer enables adaptation to new domains while keeping core
253
+ semantics stable. As agent swarms scale, the shared UCR becomes a form of “collective
254
+ understanding”—reducing not just tokens, but the cognitive overhead of coordination itself.
255
+ References
256
+ Anthropic. Model context protocol specification.https://modelcontextprotocol.io/, 2024.
257
+ Accessed: 2024.
258
+ Linux Foundation. Agentic AI foundation announcement.https://www.linuxfoundation.
259
+ org/press/agentic-ai-foundation, 2025. Accessed: 2025.
260
+ Stuart Lloyd. Least squares quantization in PCM.IEEE Transactions on Information Theory,
261
+ 28(2):129–137, 1982. doi: 10.1109/TIT.1982.1056489.
262
+ Nils Reimers and Iryna Gurevych. Sentence-BERT: Sentence embeddings using siamese BERT-
263
+ networks. InProceedings of the 2019 Conference on Empirical Methods in Natural Lan-
264
+ guage Processing and the 9th International Joint Conference on Natural Language Processing
265
+ (EMNLP-IJCNLP), pages 3982–3992. Association for Computational Linguistics, 2019. doi:
266
+ 10.18653/v1/D19-1410.
267
+ D. Sculley. Web-scale k-means clustering. InProceedings of the 19th International Conference
268
+ on World Wide Web, pages 1177–1178. ACM, 2010. doi: 10.1145/1772690.1772862.
269
+ 7
quantizer.py ADDED
@@ -0,0 +1,818 @@
1
+ """
2
+ Semantic Quantizer - The Think-Quantize-Transmit Engine
3
+
4
+ Maps agent thoughts (natural language) to UCR anchors.
5
+ Supports three modes:
6
+ 1. Keyword-based (fast, no dependencies)
7
+ 2. Embedding-based with centroids (accurate, requires sentence-transformers)
8
+ 3. Hybrid with CoordsInferer (prototype similarity + heuristics)
9
+
10
+ Also handles:
11
+ - Fallback detection (when confidence is too low)
12
+ - Usage tracking (for UCR evolution)
13
+ - Coordinate inference for new anchors
14
+ """
15
+
16
+ from __future__ import annotations
17
+ from dataclasses import dataclass, field
18
+ from typing import Optional, Callable, Dict, List, Tuple
19
+ from collections import Counter
20
+ import re
21
+
22
+ try:
23
+ from .ucr import UCR, UCRAnchor, get_default_ucr, CORE_RANGE_END
24
+ except ImportError:
25
+ from ucr import UCR, UCRAnchor, get_default_ucr, CORE_RANGE_END
26
+
27
+
28
+ # ============ Optional Dependencies ============
29
+
30
+ try:
31
+ import numpy as np
32
+ HAS_NUMPY = True
33
+ except ImportError:
34
+ np = None # type: ignore
35
+ HAS_NUMPY = False
36
+
37
+
38
+ # ============ Semantic Coordinates ============
39
+
40
+ @dataclass(frozen=True)
41
+ class SemanticCoords:
42
+ """
43
+ 4D logical position of an intent in the semantic manifold.
44
+
45
+ This is a human-readable representation used during coordinate inference.
46
+ Maps to UCR's numeric coords via ACTION_MAP, DOMAIN_MAP, etc.
47
+ """
48
+ action: str # REQ, INF, EVAL, CMD, OBS, PROP, META
49
+ polarity: int # -1 (negative), 0 (neutral), 1 (positive)
50
+ domain: str # TASK, QA, INFRA, AUTH, ERR, DOC, META, GEN
51
+ urgency: int # 0 (routine) to 3 (critical)
52
+
53
+
54
+ # Map v3-style string coords to v2 numeric coords
55
+ ACTION_MAP = {"OBS": 0, "INF": 1, "ASK": 2, "REQ": 3, "PROP": 4, "COMMIT": 5, "EVAL": 6, "META": 7, "CMD": 3}
56
+ DOMAIN_MAP = {"TASK": 0, "PLAN": 1, "OBS": 2, "EVAL": 3, "CTRL": 4, "RES": 5, "ERR": 6, "GEN": 7,
57
+ "QA": 3, "INFRA": 5, "AUTH": 4, "DOC": 1, "META": 4}
58
+
59
+
60
+ def semantic_coords_to_tuple(sc: SemanticCoords) -> tuple[int, ...]:
61
+ """Convert SemanticCoords to UCR numeric tuple."""
62
+ action = ACTION_MAP.get(sc.action.upper(), 3) # default REQ
63
+ domain = DOMAIN_MAP.get(sc.domain.upper(), 7) # default GEN
64
+ # Map polarity: -1->1, 0->4, 1->6
65
+ polarity = {-1: 1, 0: 4, 1: 6}.get(sc.polarity, 4)
66
+ # Map urgency 0-3 to 0-7: 0->1, 1->3, 2->5, 3->7
67
+ urgency = min(7, max(0, sc.urgency * 2 + 1))
68
+ return (action, polarity, domain, urgency)
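# Editor's note -- illustrative example, not part of the original module. With the
# maps above, a polite review request in the QA domain lands on the same numeric
# coords as the RequestReview anchor defined in ucr.py:
#
#     semantic_coords_to_tuple(SemanticCoords(action="REQ", polarity=0, domain="QA", urgency=1))
#     # REQ -> 3, polarity 0 -> 4, QA -> 3, urgency 1 -> 1*2 + 1 = 3  =>  (3, 4, 3, 3)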
69
+
70
+
71
+ # ============ Coordinate Inference (from v3) ============
72
+
73
+ class CoordsInferer:
74
+ """
75
+ Assigns (Action, Polarity, Domain, Urgency) to text.
76
+
77
+ Hybrid approach:
78
+ - Heuristics for urgency and polarity (reliable, fast)
79
+ - Optional prototype embedding similarity for action/domain refinement
80
+
81
+ LIMITATIONS AND CAVEATS:
82
+ ========================
83
+
84
+ 1. HEURISTIC FRAGILITY:
85
+ - Keyword matching is context-blind ("please" could be polite or pleading)
86
+ - Urgency detection relies on explicit markers ("ASAP", "urgent")
87
+ - Domain detection depends on domain-specific vocabulary
88
+
89
+ 2. PROTOTYPE EMBEDDING ISSUES:
90
+ - Limited prototype phrases (3 per action, 2 per domain)
91
+ - English-only prototypes; other languages will use fallback
92
+ - Short messages may not have enough signal for reliable similarity
93
+
94
+ 3. KNOWN FAILURE MODES:
95
+ - Sarcasm/irony: "Great, another bug" -> incorrectly infers positive polarity
96
+ - Questions phrased as statements: "I wonder if..." -> misses ASK action
97
+ - Multi-intent: "Review and deploy this" -> only captures first action
98
+ - Implicit urgency: "CEO is waiting" -> misses critical urgency
99
+
100
+ 4. RECOMMENDED USAGE:
101
+ - Use as initialization for finetuned models, not production truth
102
+ - Always validate inferred coords against anchor centroids
103
+ - Track fallback rate; >15% indicates poor UCR coverage
104
+ - For production: finetune a small model on CoordsInferer output
105
+
106
+ 5. ACCURACY ESTIMATES (informal testing):
107
+ - Urgency: ~80% (explicit markers work well)
108
+ - Polarity: ~70% (sentiment is hard)
109
+ - Action: ~65% with embeddings, ~55% heuristic-only
110
+ - Domain: ~60% (highly vocabulary-dependent)
111
+
112
+ Ported from v3's coordinate inference system.
113
+ """
114
+
115
+ def __init__(self, embed_batch: Optional[Callable] = None):
116
+ self._embed_batch = embed_batch
117
+ self._proto_action: Dict[str, "np.ndarray"] = {}
118
+ self._proto_domain: Dict[str, "np.ndarray"] = {}
119
+
120
+ # Prototype phrases (short sentences > single tokens for embeddings)
121
+ self._action_phrases = {
122
+ "REQ": [
123
+ "Please do this task.",
124
+ "Can you help with this request?",
125
+ "I need you to do something.",
126
+ ],
127
+ "INF": [
128
+ "FYI, here is a status update.",
129
+ "I finished the task.",
130
+ "This is an informational update.",
131
+ ],
132
+ "EVAL": [
133
+ "Please review and evaluate this.",
134
+ "Assess the quality of this work.",
135
+ "Give a critique of this design.",
136
+ ],
137
+ "CMD": [
138
+ "Do this immediately.",
139
+ "Execute this command.",
140
+ "Run the operation now.",
141
+ ],
142
+ "OBS": [
143
+ "I noticed something changed.",
144
+ "The current state is...",
145
+ "I observed an issue.",
146
+ ],
147
+ "PROP": [
148
+ "I suggest we do this.",
149
+ "Here's my proposal.",
150
+ "We could try this approach.",
151
+ ],
152
+ }
153
+ self._domain_phrases = {
154
+ "TASK": ["Assign a task ticket.", "Work item status update."],
155
+ "QA": ["Request code review.", "Review pull request."],
156
+ "INFRA": ["Scale the Kubernetes cluster.", "Deploy infrastructure change."],
157
+ "AUTH": ["OAuth login issue.", "Authentication and authorization."],
158
+ "ERR": ["System error occurred.", "Critical failure and outage."],
159
+ "DOC": ["Update documentation.", "Write technical docs."],
160
+ "META": ["Discuss process and coordination.", "Team protocol and planning."],
161
+ "GEN": ["General conversation.", "Generic request or update."],
162
+ }
163
+
164
+ def prime(self) -> None:
165
+ """Compute prototype embeddings (if embedder available)."""
166
+ if not self._embed_batch or not HAS_NUMPY:
167
+ return
168
+
169
+ def _norm(vec):
170
+ n = np.linalg.norm(vec)
171
+ return vec / (n + 1e-12) if n > 0 else vec
172
+
173
+ # Actions
174
+ action_labels = list(self._action_phrases.keys())
175
+ action_texts = [" ".join(self._action_phrases[a]) for a in action_labels]
176
+ action_vecs = self._embed_batch(action_texts)
177
+ for a, v in zip(action_labels, action_vecs):
178
+ self._proto_action[a] = _norm(np.asarray(v, dtype=np.float32))
179
+
180
+ # Domains
181
+ domain_labels = list(self._domain_phrases.keys())
182
+ domain_texts = [" ".join(self._domain_phrases[d]) for d in domain_labels]
183
+ domain_vecs = self._embed_batch(domain_texts)
184
+ for d, v in zip(domain_labels, domain_vecs):
185
+ self._proto_domain[d] = _norm(np.asarray(v, dtype=np.float32))
186
+
187
+ def infer(self, text: str, vec: Optional["np.ndarray"] = None) -> SemanticCoords:
188
+ """
189
+ Infer semantic coordinates from text.
190
+
191
+ Args:
192
+ text: The input text to analyze
193
+ vec: Optional pre-computed embedding vector for refinement
194
+
195
+ Returns:
196
+ SemanticCoords with inferred action, polarity, domain, urgency
197
+ """
198
+ t = text.strip()
199
+ low = t.lower()
200
+
201
+ # --- urgency (heuristic) ---
202
+ urgency = 0
203
+ if any(k in low for k in ("critical", "sev1", "sev-1", "p0", "immediately", "right now")):
204
+ urgency = 3
205
+ elif any(k in low for k in ("urgent", "asap", "high priority", "blocker")):
206
+ urgency = 2
207
+ elif any(k in low for k in ("soon", "priority", "important")):
208
+ urgency = 1
209
+
210
+ # --- polarity (heuristic) ---
211
+ polarity = 0
212
+ if any(k in low for k in ("error", "failed", "failure", "crash", "broken", "outage", "bug", "can't", "cannot")):
213
+ polarity = -1
214
+ elif any(k in low for k in ("fixed", "resolved", "success", "completed", "done", "working now", "all good")):
215
+ polarity = 1
216
+
217
+ # --- action (heuristic) ---
218
+ action = "INF"
219
+ if t.endswith("?") or low.startswith(("can you", "could you", "would you", "please")):
220
+ action = "REQ"
221
+ if any(k in low for k in ("review", "critique", "evaluate", "assess")):
222
+ action = "EVAL"
223
+ if any(k in low for k in ("do this", "run ", "execute", "deploy", "scale ", "restart")) and urgency >= 2:
224
+ action = "CMD"
225
+ if any(k in low for k in ("i noticed", "i see", "detected", "observed")):
226
+ action = "OBS"
227
+ if any(k in low for k in ("i suggest", "i propose", "we could", "how about")):
228
+ action = "PROP"
229
+
230
+ # --- domain (heuristic) ---
231
+ domain = "GEN"
232
+ if any(k in low for k in ("kubernetes", "k8s", "cluster", "deploy", "terraform", "docker", "infra", "server", "latency")):
233
+ domain = "INFRA"
234
+ elif any(k in low for k in ("auth", "oauth", "login", "jwt", "sso", "permission")):
235
+ domain = "AUTH"
236
+ elif any(k in low for k in ("review", "pull request", "pr ", "qa", "test")):
237
+ domain = "QA"
238
+ elif any(k in low for k in ("task", "ticket", "jira", "backlog")):
239
+ domain = "TASK"
240
+ elif any(k in low for k in ("error", "exception", "stacktrace", "failed", "failure", "outage")):
241
+ domain = "ERR"
242
+ elif any(k in low for k in ("doc", "documentation", "readme", "spec", "paper")):
243
+ domain = "DOC"
244
+ elif any(k in low for k in ("protocol", "manifold", "coordination", "orchestrator")):
245
+ domain = "META"
246
+
247
+ # Optional refinement via prototype similarity
248
+ if vec is not None and HAS_NUMPY and self._proto_action and self._proto_domain:
249
+ def _norm(v):
250
+ n = np.linalg.norm(v)
251
+ return v / (n + 1e-12) if n > 0 else v
252
+
253
+ v = _norm(vec.astype(np.float32, copy=False))
254
+
255
+ # Action refine
256
+ a_best, a_score = action, -1.0
257
+ for a, pv in self._proto_action.items():
258
+ s = float(np.dot(v, pv))
259
+ if s > a_score:
260
+ a_best, a_score = a, s
261
+ if a_score >= 0.40:
262
+ action = a_best
263
+
264
+ # Domain refine
265
+ d_best, d_score = domain, -1.0
266
+ for d, pv in self._proto_domain.items():
267
+ s = float(np.dot(v, pv))
268
+ if s > d_score:
269
+ d_best, d_score = d, s
270
+ if d_score >= 0.35:
271
+ domain = d_best
272
+
273
+ # If action is REQ and urgency not set, default to 1
274
+ if action == "REQ" and urgency == 0:
275
+ urgency = 1
276
+
277
+ return SemanticCoords(action=action, polarity=polarity, domain=domain, urgency=urgency)
278
+
279
+
280
+ # Global coords inferer instance
281
+ _coords_inferer: Optional[CoordsInferer] = None
282
+
283
+
284
+ def get_coords_inferer() -> CoordsInferer:
285
+ """Get or create the default CoordsInferer."""
286
+ global _coords_inferer
287
+ if _coords_inferer is None:
288
+ _coords_inferer = CoordsInferer()
289
+ return _coords_inferer
290
+
291
+
292
+ def infer_coords(text: str, vec: Optional["np.ndarray"] = None) -> tuple[int, ...]:
293
+ """
294
+ Infer UCR-compatible coordinates from text.
295
+
296
+ Returns a tuple of 4 integers suitable for UCRAnchor.coords.
297
+ """
298
+ inferer = get_coords_inferer()
299
+ sc = inferer.infer(text, vec)
300
+ return semantic_coords_to_tuple(sc)
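# Editor's note -- a worked trace of the heuristics above (illustrative only).
# "asap" raises urgency to 2, the leading "please" first suggests REQ but "review"
# overrides the action to EVAL, and "review"/"pr " selects the QA domain:
#
#     infer_coords("Please review this PR ASAP")
#     # -> SemanticCoords(action="EVAL", polarity=0, domain="QA", urgency=2)
#     # -> (6, 4, 3, 5) after semantic_coords_to_tuple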
301
+
302
+
303
+ # ============ Quantization Result ============
304
+
305
+ @dataclass
306
+ class QuantizeResult:
307
+ """
308
+ Result of quantizing a thought to a UCR anchor.
309
+
310
+ Attributes:
311
+ anchor: The matched UCR anchor
312
+ confidence: How well the thought matches (0.0-1.0)
313
+ method: How the match was made ("keyword", "embedding", "fallback")
314
+ alternatives: Other possible matches with their scores
315
+ """
316
+ anchor: UCRAnchor
317
+ confidence: float
318
+ method: str
319
+ alternatives: list[tuple[UCRAnchor, float]] = field(default_factory=list)
320
+
321
+ @property
322
+ def is_fallback(self) -> bool:
323
+ return self.anchor.mnemonic == "Fallback"
324
+
325
+ @property
326
+ def is_high_confidence(self) -> bool:
327
+ return self.confidence >= 0.7
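# Editor's sketch of how a caller might branch on a QuantizeResult (illustrative;
# `quantizer`, `thought`, `src`, `dst` are hypothetical names, and the Fallback wire
# form mirrors the inline helper in think_quantize_transmit further below):
#
#     result = quantizer.quantize(thought)
#     if result.is_fallback:
#         wire = f'SLIP v1 {src} {dst} Fallback "{thought}"'      # keep the raw text
#     elif result.is_high_confidence:
#         wire = f"SLIP v1 {src} {dst} {result.anchor.mnemonic}"
#     else:
#         wire = f"SLIP v1 {src} {dst} {result.anchor.mnemonic}"  # low confidence: log alternatives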
328
+
329
+
330
+ # ============ Keyword-Based Quantizer ============
331
+
332
+ # Keyword patterns for each anchor category
333
+ _KEYWORD_PATTERNS: dict[str, list[str]] = {
334
+ # Observations
335
+ "ObserveState": ["state", "current", "status", "environment", "system state"],
336
+ "ObserveChange": ["changed", "detected", "noticed", "updated", "modified"],
337
+ "ObserveError": ["error", "exception", "failed", "crash", "bug"],
338
+
339
+ # Information
340
+ "InformResult": ["result", "output", "computed", "calculated", "returns"],
341
+ "InformStatus": ["status", "update", "progress", "currently"],
342
+ "InformComplete": ["complete", "finished", "done", "completed", "success"],
343
+ "InformBlocked": ["blocked", "waiting", "stuck", "depends on", "need"],
344
+ "InformProgress": ["progress", "working on", "making progress", "underway"],
345
+
346
+ # Questions
347
+ "AskClarify": ["clarify", "what do you mean", "unclear", "confused", "explain"],
348
+ "AskStatus": ["what is the status", "how is", "progress on", "update on"],
349
+ "AskPermission": ["can i", "may i", "permission", "allowed", "okay to"],
350
+ "AskResource": ["available", "resource", "capacity", "do we have"],
351
+
352
+ # Requests
353
+ "RequestTask": ["please do", "execute", "perform", "run", "implement"],
354
+ "RequestPlan": ["create a plan", "plan for", "how should we", "strategy"],
355
+ "RequestReview": ["review", "check", "look at", "evaluate", "feedback"],
356
+ "RequestHelp": ["help", "assist", "support", "guidance", "advice"],
357
+ "RequestCancel": ["cancel", "abort", "stop", "nevermind", "forget"],
358
+ "RequestPriority": ["priority", "urgent", "expedite", "escalate"],
359
+ "RequestResource": ["allocate", "provision", "need resource", "require"],
360
+
361
+ # Proposals
362
+ "ProposePlan": ["propose", "suggest", "recommendation", "i think we should"],
363
+ "ProposeChange": ["change", "modify", "alter", "adjust"],
364
+ "ProposeAlternative": ["alternative", "instead", "another approach", "option"],
365
+ "ProposeRollback": ["rollback", "revert", "undo", "go back"],
366
+
367
+ # Commitments
368
+ "CommitTask": ["i will", "i'll do", "on it", "taking this", "i commit"],
369
+ "CommitDeadline": ["by", "deadline", "eta", "deliver by"],
370
+ "CommitResource": ["allocating", "providing", "assigning"],
371
+
372
+ # Evaluations
373
+ "EvalApprove": ["approved", "lgtm", "looks good", "accept", "ship it"],
374
+ "EvalReject": ["rejected", "no", "denied", "not acceptable", "wrong"],
375
+ "EvalNeedsWork": ["needs work", "revise", "changes needed", "almost"],
376
+ "EvalComplete": ["complete", "done", "finished", "all good"],
377
+ "EvalBlocked": ["blocked", "cannot proceed", "impediment"],
378
+
379
+ # Meta
380
+ "MetaAck": ["ack", "acknowledged", "got it", "received", "understood"],
381
+ "MetaSync": ["sync", "ping", "alive", "heartbeat"],
382
+ "MetaHandoff": ["handoff", "transfer", "passing to", "your turn"],
383
+ "MetaEscalate": ["escalate", "raise", "need manager", "above my paygrade"],
384
+ "MetaAbort": ["abort", "emergency stop", "halt", "critical failure"],
385
+
386
+ # Accept/Reject
387
+ "Accept": ["yes", "accept", "agreed", "confirmed", "affirmative"],
388
+ "Reject": ["no", "reject", "disagree", "refuse", "decline"],
389
+ "AcceptWithCondition": ["yes but", "if", "conditional", "provided that"],
390
+ "Defer": ["later", "defer", "postpone", "not now", "revisit"],
391
+
392
+ # Errors
393
+ "ErrorGeneric": ["error", "failed", "exception"],
394
+ "ErrorTimeout": ["timeout", "timed out", "too slow"],
395
+ "ErrorResource": ["resource unavailable", "out of", "exhausted"],
396
+ "ErrorPermission": ["permission denied", "unauthorized", "forbidden"],
397
+ "ErrorValidation": ["invalid", "validation failed", "bad input"],
398
+ }
399
+
400
+
401
+ def _keyword_score(thought: str, patterns: list[str]) -> float:
402
+ """Score how well a thought matches keyword patterns."""
403
+ thought_lower = thought.lower()
404
+ matches = 0
405
+ for pattern in patterns:
406
+ if pattern.lower() in thought_lower:
407
+ # Longer patterns are stronger signals
408
+ matches += len(pattern.split())
409
+ # Normalize to 0-1 range (cap at 1.0)
410
+ return min(1.0, matches / 3.0)
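# Editor's worked example (illustrative): the RequestReview patterns above are
# ["review", "check", "look at", "evaluate", "feedback"], so the thought
# "Please review the authentication code" matches only the single-word pattern
# "review", giving matches == 1 and a score of min(1.0, 1/3) ~= 0.33 -- above
# KeywordQuantizer's default fallback_threshold of 0.2.
#
#     _keyword_score("Please review the authentication code",
#                    _KEYWORD_PATTERNS["RequestReview"])   # ~0.33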
411
+
412
+
413
+ class KeywordQuantizer:
414
+ """
415
+ Simple keyword-based quantizer. No ML dependencies.
416
+ Good for bootstrapping and low-latency scenarios.
417
+ """
418
+
419
+ def __init__(self, ucr: Optional[UCR] = None, fallback_threshold: float = 0.2):
420
+ self.ucr = ucr or get_default_ucr()
421
+ self.fallback_threshold = fallback_threshold
422
+ self._usage_stats: Counter = Counter()
423
+
424
+ def quantize(self, thought: str) -> QuantizeResult:
425
+ """
426
+ Map a natural language thought to the best UCR anchor.
427
+
428
+ Args:
429
+ thought: The agent's thought/intent in natural language
430
+
431
+ Returns:
432
+ QuantizeResult with the best anchor and confidence score
433
+ """
434
+ scores: list[tuple[UCRAnchor, float]] = []
435
+
436
+ for mnemonic, patterns in _KEYWORD_PATTERNS.items():
437
+ anchor = self.ucr.get_by_mnemonic(mnemonic)
438
+ if anchor:
439
+ score = _keyword_score(thought, patterns)
440
+ if score > 0:
441
+ scores.append((anchor, score))
442
+
443
+ # Sort by score descending
444
+ scores.sort(key=lambda x: x[1], reverse=True)
445
+
446
+ if not scores or scores[0][1] < self.fallback_threshold:
447
+ # Use fallback
448
+ fallback = self.ucr.get_by_mnemonic("Fallback")
449
+ self._usage_stats["_fallback"] += 1
450
+ return QuantizeResult(
451
+ anchor=fallback,
452
+ confidence=0.0,
453
+ method="fallback",
454
+ alternatives=scores[:3],
455
+ )
456
+
457
+ best_anchor, best_score = scores[0]
458
+ self._usage_stats[best_anchor.mnemonic] += 1
459
+
460
+ return QuantizeResult(
461
+ anchor=best_anchor,
462
+ confidence=best_score,
463
+ method="keyword",
464
+ alternatives=scores[1:4],
465
+ )
466
+
467
+ def get_usage_stats(self) -> dict[str, int]:
468
+ """Get usage statistics for UCR evolution analysis."""
469
+ return dict(self._usage_stats)
470
+
471
+ def get_fallback_rate(self) -> float:
472
+ """Get the rate of fallback usage (indicates UCR coverage gaps)."""
473
+ total = sum(self._usage_stats.values())
474
+ if total == 0:
475
+ return 0.0
476
+ return self._usage_stats["_fallback"] / total
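# Editor's usage sketch (assumes this file is importable as a flat `quantizer`
# module, matching the fallback import at the top of the file):
#
#     from quantizer import KeywordQuantizer
#     kq = KeywordQuantizer()
#     r = kq.quantize("I'm blocked waiting for the API credentials")
#     print(r.anchor.mnemonic, round(r.confidence, 2))   # InformBlocked 0.67
#     print(kq.get_fallback_rate())                      # 0.0 so far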
477
+
478
+
479
+ # ============ Embedding-Based Quantizer (Enhanced) ============
480
+
481
+ class EmbeddingQuantizer:
482
+ """
483
+ Embedding-based quantizer using sentence-transformers.
484
+
485
+ Enhanced with v3's centroid matrix approach:
486
+ - Pre-computes normalized centroid matrix for fast similarity search
487
+ - Supports anchor centroids (when available) or on-the-fly embedding
488
+ - Includes CoordsInferer for prototype refinement
489
+
490
+ Usage:
491
+ quantizer = EmbeddingQuantizer()
492
+ result = quantizer.quantize("I need someone to review this code")
493
+ """
494
+
495
+ def __init__(
496
+ self,
497
+ ucr: Optional[UCR] = None,
498
+ model_name: str = "all-MiniLM-L6-v2",
499
+ fallback_threshold: float = 0.55,
500
+ ):
501
+ if not HAS_NUMPY:
502
+ raise ImportError("numpy is required for EmbeddingQuantizer")
503
+
504
+ self.ucr = ucr or get_default_ucr()
505
+ self.fallback_threshold = fallback_threshold
506
+ self._usage_stats: Counter = Counter()
507
+ self._fallback_buffer: List[str] = [] # Track low-confidence messages
508
+
509
+ # Lazy load sentence-transformers
510
+ self._model = None
511
+ self._model_name = model_name
512
+
513
+ # Centroid matrix (normalized) for fast similarity
514
+ self._centroids_matrix: Optional["np.ndarray"] = None
515
+ self._anchor_indices: List[int] = [] # Maps matrix row to anchor index
516
+ self._embed_dim: Optional[int] = None
517
+
518
+ # Coords inferer with prototype refinement
519
+ self._coords_inferer: Optional[CoordsInferer] = None
520
+
521
+ def _ensure_model(self):
522
+ """Lazy load the embedding model and build centroid matrix."""
523
+ if self._model is not None:
524
+ return
525
+
526
+ try:
527
+ from sentence_transformers import SentenceTransformer
528
+ except ImportError:
529
+ raise ImportError(
530
+ "sentence-transformers is required for EmbeddingQuantizer. "
531
+ "Install with: pip install sentence-transformers"
532
+ )
533
+
534
+ self._model = SentenceTransformer(self._model_name)
535
+ self._rebuild_index()
536
+
537
+ # Initialize coords inferer with embeddings
538
+ self._coords_inferer = CoordsInferer(embed_batch=self._embed_batch)
539
+ self._coords_inferer.prime()
540
+
541
+ def _embed_batch(self, texts: List[str]) -> "np.ndarray":
542
+ """Embed a batch of texts and return normalized vectors."""
543
+ if not self._model:
544
+ self._ensure_model()
545
+ vecs = self._model.encode(texts, convert_to_numpy=True)
546
+ vecs = np.asarray(vecs, dtype=np.float32)
547
+ self._embed_dim = vecs.shape[1]
548
+ # Normalize rows
549
+ norms = np.linalg.norm(vecs, axis=1, keepdims=True)
550
+ return vecs / (norms + 1e-12)
551
+
552
+ def _embed_one(self, text: str) -> "np.ndarray":
553
+ """Embed a single text and return normalized vector."""
554
+ return self._embed_batch([text])[0]
555
+
556
+ def _rebuild_index(self):
557
+ """Build/rebuild the centroid matrix from UCR anchors."""
558
+ if not self.ucr.anchors:
559
+ self._anchor_indices = []
560
+ self._centroids_matrix = None
561
+ return
562
+
563
+ self._anchor_indices = sorted(self.ucr.anchors.keys())
564
+ anchors = [self.ucr.anchors[idx] for idx in self._anchor_indices]
565
+
566
+ # Check if anchors have pre-computed centroids
567
+ has_centroids = all(a.centroid is not None for a in anchors)
568
+
569
+ if has_centroids:
570
+ # Use pre-computed centroids
571
+ mat = np.asarray([a.centroid for a in anchors], dtype=np.float32)
572
+ else:
573
+ # Compute centroids from canonical texts
574
+ canonical_texts = [a.canonical for a in anchors]
575
+ mat = self._embed_batch(canonical_texts)
576
+ # Optionally store centroids back to anchors
577
+ for anchor, vec in zip(anchors, mat):
578
+ anchor.centroid = vec.tolist()
579
+
580
+ # Normalize rows
581
+ norms = np.linalg.norm(mat, axis=1, keepdims=True)
582
+ self._centroids_matrix = mat / (norms + 1e-12)
583
+
584
+ def quantize(self, thought: str) -> QuantizeResult:
585
+ """
586
+ Map a natural language thought to the best UCR anchor using embeddings.
587
+
588
+ Uses normalized cosine similarity against centroid matrix for fast lookup.
589
+
590
+ Args:
591
+ thought: The agent's thought/intent in natural language
592
+
593
+ Returns:
594
+ QuantizeResult with the best anchor and confidence score
595
+ """
596
+ self._ensure_model()
597
+
598
+ if self._centroids_matrix is None or len(self._anchor_indices) == 0:
599
+ fallback = self.ucr.get_by_mnemonic("Fallback")
600
+ self._fallback_buffer.append(thought)
601
+ return QuantizeResult(
602
+ anchor=fallback,
603
+ confidence=0.0,
604
+ method="fallback",
605
+ alternatives=[],
606
+ )
607
+
608
+ # Embed the thought (normalized)
609
+ thought_vec = self._embed_one(thought)
610
+
611
+ # Compute cosine similarities (dot product of normalized vectors)
612
+ similarities = np.dot(self._centroids_matrix, thought_vec)
613
+
614
+ # Get top matches
615
+ top_locs = np.argsort(similarities)[::-1][:5]
616
+ scores = []
617
+ for loc in top_locs:
618
+ anchor_idx = self._anchor_indices[loc]
619
+ anchor = self.ucr.anchors[anchor_idx]
620
+ scores.append((anchor, float(similarities[loc])))
621
+
622
+ best_anchor, best_score = scores[0]
623
+
624
+ if best_score < self.fallback_threshold:
625
+ fallback = self.ucr.get_by_mnemonic("Fallback")
626
+ self._usage_stats["_fallback"] += 1
627
+ self._fallback_buffer.append(thought)
628
+ return QuantizeResult(
629
+ anchor=fallback,
630
+ confidence=best_score,
631
+ method="fallback",
632
+ alternatives=scores[:3],
633
+ )
634
+
635
+ self._usage_stats[best_anchor.mnemonic] += 1
636
+
637
+ return QuantizeResult(
638
+ anchor=best_anchor,
639
+ confidence=best_score,
640
+ method="embedding",
641
+ alternatives=scores[1:4],
642
+ )
643
+
644
+ def get_fallback_buffer(self) -> List[str]:
645
+ """Get the buffer of low-confidence messages for extension learning."""
646
+ return self._fallback_buffer.copy()
647
+
648
+ def clear_fallback_buffer(self) -> None:
649
+ """Clear the fallback buffer after extension learning."""
650
+ self._fallback_buffer.clear()
651
+
652
+ def infer_coords_for_text(self, text: str) -> tuple[int, ...]:
653
+ """
654
+ Infer semantic coordinates for a text using the enhanced CoordsInferer.
655
+
656
+ Returns UCR-compatible coordinate tuple.
657
+ """
658
+ self._ensure_model()
659
+ vec = self._embed_one(text)
660
+ if self._coords_inferer:
661
+ sc = self._coords_inferer.infer(text, vec)
662
+ return semantic_coords_to_tuple(sc)
663
+ return infer_coords(text, vec)
664
+
665
+ def compute_centroid(self, texts: List[str]) -> List[float]:
666
+ """
667
+ Compute the centroid embedding for a cluster of texts.
668
+
669
+ Useful for creating new extension anchors.
670
+ """
671
+ self._ensure_model()
672
+ embeds = self._embed_batch(texts)
673
+ centroid = np.mean(embeds, axis=0)
674
+ # Normalize
675
+ norm = np.linalg.norm(centroid)
676
+ if norm > 0:
677
+ centroid = centroid / norm
678
+ return centroid.tolist()
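# Editor's sketch of how a centroid could seed a new extension anchor via the UCR
# API in ucr.py (illustrative; `eq` is an EmbeddingQuantizer instance and
# "CheckAuthLogs" is a hypothetical mnemonic):
#
#     texts = eq.get_fallback_buffer()                 # low-confidence messages
#     anchor = UCRAnchor(
#         index=eq.ucr.next_extension_index(),         # first free slot >= 0x8000
#         mnemonic="CheckAuthLogs",
#         canonical="Inspect authentication logs",
#         coords=eq.infer_coords_for_text(texts[0]),
#         is_core=False,
#         centroid=eq.compute_centroid(texts),
#     )
#     eq.ucr.add_anchor(anchor)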
679
+
680
+ def get_usage_stats(self) -> dict[str, int]:
681
+ """Get usage statistics for UCR evolution analysis."""
682
+ return dict(self._usage_stats)
683
+
684
+ def get_fallback_rate(self) -> float:
685
+ """Get the rate of fallback usage."""
686
+ total = sum(self._usage_stats.values())
687
+ if total == 0:
688
+ return 0.0
689
+ return self._usage_stats["_fallback"] / total
690
+
691
+
692
+ # ============ Auto-selecting Quantizer ============
693
+
694
+ def create_quantizer(
695
+ ucr: Optional[UCR] = None,
696
+ prefer_embeddings: bool = True,
697
+ fallback_threshold: float = 0.3,
698
+ ) -> KeywordQuantizer | EmbeddingQuantizer:
699
+ """
700
+ Create the best available quantizer.
701
+
702
+ Args:
703
+ ucr: UCR instance to use
704
+ prefer_embeddings: Try to use embedding quantizer if available
705
+ fallback_threshold: Confidence threshold for fallback
706
+
707
+ Returns:
708
+ EmbeddingQuantizer if available and preferred, else KeywordQuantizer
709
+ """
710
+ if prefer_embeddings:
711
+ try:
712
+ return EmbeddingQuantizer(ucr=ucr, fallback_threshold=fallback_threshold)
713
+ except ImportError:
714
+ pass
715
+
716
+ return KeywordQuantizer(ucr=ucr, fallback_threshold=fallback_threshold)
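# Editor's note (illustrative): with prefer_embeddings=True the factory silently
# falls back to the keyword quantizer when sentence-transformers is not installed,
# so callers can check which backend they actually received:
#
#     q = create_quantizer(prefer_embeddings=True)
#     print(type(q).__name__)   # "EmbeddingQuantizer" or "KeywordQuantizer"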
717
+
718
+
719
+ # ============ High-Level API ============
720
+
721
+ _default_quantizer: Optional[KeywordQuantizer | EmbeddingQuantizer] = None
722
+
723
+
724
+ def quantize(thought: str) -> QuantizeResult:
725
+ """
726
+ Quantize a thought to a UCR anchor using the default quantizer.
727
+
728
+ This is the main entry point for the Think-Quantize-Transmit pattern.
729
+
730
+ Example:
731
+ >>> result = quantize("Please review the authentication code")
732
+ >>> result.anchor.mnemonic
733
+ 'RequestReview'
734
+ >>> round(result.confidence, 2)
+ 0.33
736
+ """
737
+ global _default_quantizer
738
+ if _default_quantizer is None:
739
+ _default_quantizer = create_quantizer(prefer_embeddings=False)
740
+ return _default_quantizer.quantize(thought)
741
+
742
+
743
+ def think_quantize_transmit(
744
+ thought: str,
745
+ src: str,
746
+ dst: str,
747
+ ucr: Optional[UCR] = None,
748
+ ) -> str:
749
+ """
750
+ The complete Think-Quantize-Transmit flow.
751
+
752
+ Takes a natural language thought and produces a wire-ready SLIP message.
753
+
754
+ Args:
755
+ thought: Natural language intent
756
+ src: Source agent identifier
757
+ dst: Destination agent identifier
758
+ ucr: Optional UCR instance
759
+
760
+ Returns:
761
+ Wire-format SLIP message string
762
+
763
+ Example:
764
+ >>> wire = think_quantize_transmit(
765
+ ... "I need someone to check this code for security issues",
766
+ ... src="developer",
767
+ ... dst="reviewer"
768
+ ... )
769
+ >>> wire
770
+ 'SLIP v1 developer reviewer RequestReview'
771
+ """
772
+ try:
773
+ from .protocol import slip, fallback as slip_fallback
774
+ except ImportError:
775
+ # Inline simple slip format when protocol not available
776
+ def slip(src, dst, mnemonic, ucr=None):
777
+ return f"SLIP v1 {src} {dst} {mnemonic}"
778
+ def slip_fallback(src, dst, payload, ucr=None):
779
+ return f'SLIP v1 {src} {dst} Fallback "{payload}"'
780
+
781
+ result = quantize(thought)
782
+
783
+ if result.is_fallback:
784
+ return slip_fallback(src, dst, thought, ucr)
785
+ else:
786
+ return slip(src, dst, result.anchor.mnemonic, ucr=ucr)
787
+
788
+
789
+ # ============ Smoke Test ============
790
+
791
+ if __name__ == "__main__":
792
+ print("=== Semantic Quantizer Demo ===\n")
793
+
794
+ test_thoughts = [
795
+ "Please review the authentication module for security issues",
796
+ "I've finished implementing the feature",
797
+ "What's the current status of the deployment?",
798
+ "I propose we use Redis for caching instead of Memcached",
799
+ "Yes, that looks good to me",
800
+ "There's an error in the payment processing code",
801
+ "I'm blocked waiting for the API credentials",
802
+ "Check the auth logs for timing anomalies in the OAuth flow", # Should fallback
803
+ ]
804
+
805
+ quantizer = KeywordQuantizer()
806
+
807
+ for thought in test_thoughts:
808
+ result = quantizer.quantize(thought)
809
+ status = "FALLBACK" if result.is_fallback else f"{result.confidence:.2f}"
810
+ print(f"Thought: {thought[:50]}...")
811
+ print(f" → {result.anchor.mnemonic} ({status})")
812
+ if result.alternatives:
813
+ alt_str = ", ".join(f"{a.mnemonic}:{s:.2f}" for a, s in result.alternatives[:2])
814
+ print(f" Alternatives: {alt_str}")
815
+ print()
816
+
817
+ print(f"Fallback rate: {quantizer.get_fallback_rate():.1%}")
818
+ print(f"\nUsage stats: {quantizer.get_usage_stats()}")
requirements.txt ADDED
@@ -0,0 +1,4 @@
+ gradio
+ huggingface_hub[inference]
+ sentence-transformers
+ numpy
ucr.py ADDED
@@ -0,0 +1,311 @@
1
+ """
2
+ Universal Concept Reference (UCR) - The Semantic Manifold
3
+
4
+ The UCR is a quantized semantic coordinate system for agent communication.
5
+ Instead of transmitting embeddings (model-specific, high-dimensional), agents
6
+ communicate via positions in a shared, low-dimensional semantic manifold.
7
+
8
+ Core concepts:
9
+ - Dimensions: Semantic axes (action, urgency, domain, polarity)
10
+ - Anchors: Named positions in the manifold (common agent intents)
11
+ - Quantization: Map agent thoughts to nearest anchor
12
+
13
+ Architecture:
14
+ - Core UCR (0x0000-0x7FFF): Standard anchors, immutable per version
15
+ - Extension UCR (0x8000-0xFFFF): Installation-specific, evolvable
16
+ """
17
+
18
+ from __future__ import annotations
19
+ from dataclasses import dataclass, field
20
+ from enum import IntEnum
21
+ from typing import Optional
22
+ import json
23
+ from pathlib import Path
24
+
25
+
26
+ # ============ Semantic Dimensions ============
27
+ # The axes of our semantic manifold. Kept minimal per design.
28
+
29
+ class Dimension(IntEnum):
30
+ """
31
+ The semantic axes of the UCR manifold.
32
+ Each dimension represents a fundamental aspect of agent communication.
33
+ """
34
+ ACTION = 0 # What type of action: observe, inform, request, propose, evaluate
35
+ POLARITY = 1 # Direction: initiating vs responding, positive vs negative
36
+ DOMAIN = 2 # Context: task, plan, observation, evaluation, control
37
+ URGENCY = 3 # Priority: routine, elevated, critical
38
+
39
+
40
+ # Discrete levels per dimension (kept small for token efficiency)
41
+ LEVELS_PER_DIM = 8
42
+
43
+
44
+ # ============ UCR Entry (Anchor) ============
45
+
46
+ @dataclass
47
+ class UCRAnchor:
48
+ """
49
+ A named position in the semantic manifold.
50
+
51
+ Attributes:
52
+ index: Unique identifier (0x0000-0xFFFF)
53
+ mnemonic: Single-token wire representation (e.g., "RequestReview")
54
+ canonical: Human-readable description
55
+ coords: Position in the manifold (one value per dimension)
56
+ is_core: True if part of standard UCR, False if extension
57
+ centroid: Optional embedding vector for ML-based quantization
58
+ """
59
+ index: int
60
+ mnemonic: str
61
+ canonical: str
62
+ coords: tuple[int, ...] # One int per dimension, each 0 to LEVELS_PER_DIM-1
63
+ is_core: bool = True
64
+ centroid: Optional[list[float]] = None # Embedding vector for ML quantization
65
+
66
+ def __post_init__(self):
67
+ if len(self.coords) != len(Dimension):
68
+ raise ValueError(f"coords must have {len(Dimension)} values, got {len(self.coords)}")
69
+ for i, c in enumerate(self.coords):
70
+ if not (0 <= c < LEVELS_PER_DIM):
71
+ raise ValueError(f"coord[{i}] must be 0-{LEVELS_PER_DIM-1}, got {c}")
72
+
73
+ def to_dict(self) -> dict:
74
+ result = {
75
+ "index": self.index,
76
+ "mnemonic": self.mnemonic,
77
+ "canonical": self.canonical,
78
+ "coords": list(self.coords),
79
+ "is_core": self.is_core,
80
+ }
81
+ if self.centroid is not None:
82
+ result["centroid"] = self.centroid
83
+ return result
84
+
85
+ @classmethod
86
+ def from_dict(cls, d: dict) -> "UCRAnchor":
87
+ return cls(
88
+ index=d["index"],
89
+ mnemonic=d["mnemonic"],
90
+ canonical=d["canonical"],
91
+ coords=tuple(d["coords"]),
92
+ is_core=d.get("is_core", True),
93
+ centroid=d.get("centroid"),
94
+ )
95
+
96
+
97
+ # ============ UCR Registry ============
98
+
99
+ CORE_RANGE_END = 0x8000 # 0x0000-0x7FFF = core, 0x8000-0xFFFF = extensions
100
+
101
+
102
+ @dataclass
103
+ class UCR:
104
+ """
105
+ The Universal Concept Reference - a semantic manifold for agent communication.
106
+
107
+ Contains both core (standard) anchors and extension (local) anchors.
108
+ Provides lookup by index, mnemonic, and nearest-neighbor by coordinates.
109
+ """
110
+ version: str
111
+ anchors: dict[int, UCRAnchor] = field(default_factory=dict)
112
+ _mnemonic_index: dict[str, int] = field(default_factory=dict, repr=False)
113
+
114
+ def add_anchor(self, anchor: UCRAnchor) -> None:
115
+ """Add an anchor to the registry."""
116
+ if anchor.index in self.anchors:
117
+ raise ValueError(f"Anchor index {anchor.index:#06x} already exists")
118
+ if anchor.mnemonic in self._mnemonic_index:
119
+ raise ValueError(f"Anchor mnemonic '{anchor.mnemonic}' already exists")
120
+
121
+ # Validate core vs extension range
122
+ if anchor.is_core and anchor.index >= CORE_RANGE_END:
123
+ raise ValueError(f"Core anchor index must be < {CORE_RANGE_END:#06x}")
124
+ if not anchor.is_core and anchor.index < CORE_RANGE_END:
125
+ raise ValueError(f"Extension anchor index must be >= {CORE_RANGE_END:#06x}")
126
+
127
+ self.anchors[anchor.index] = anchor
128
+ self._mnemonic_index[anchor.mnemonic] = anchor.index
129
+
130
+ def get_by_index(self, index: int) -> Optional[UCRAnchor]:
131
+ """Lookup anchor by numeric index."""
132
+ return self.anchors.get(index)
133
+
134
+ def get_by_mnemonic(self, mnemonic: str) -> Optional[UCRAnchor]:
135
+ """Lookup anchor by mnemonic string."""
136
+ idx = self._mnemonic_index.get(mnemonic)
137
+ return self.anchors.get(idx) if idx is not None else None
138
+
139
+ def find_nearest(self, coords: tuple[int, ...]) -> UCRAnchor:
140
+ """
141
+ Find the anchor nearest to the given coordinates.
142
+ Uses Manhattan distance for simplicity and speed.
143
+ """
144
+ if not self.anchors:
145
+ raise ValueError("UCR has no anchors")
146
+
147
+ best_anchor = None
148
+ best_distance = float('inf')
149
+
150
+ for anchor in self.anchors.values():
151
+ distance = sum(abs(a - b) for a, b in zip(anchor.coords, coords))
152
+ if distance < best_distance:
153
+ best_distance = distance
154
+ best_anchor = anchor
155
+
156
+ return best_anchor
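# Editor's worked example (illustrative): with the base UCR defined below,
# (3, 4, 3, 4) is not an exact anchor position, but it is Manhattan distance 1
# from RequestReview at (3, 4, 3, 3), which is therefore returned:
#
#     create_base_ucr().find_nearest((3, 4, 3, 4)).mnemonic   # "RequestReview"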
157
+
158
+ def core_anchors(self) -> list[UCRAnchor]:
159
+ """Return all core (standard) anchors."""
160
+ return [a for a in self.anchors.values() if a.is_core]
161
+
162
+ def extension_anchors(self) -> list[UCRAnchor]:
163
+ """Return all extension (local) anchors."""
164
+ return [a for a in self.anchors.values() if not a.is_core]
165
+
166
+ def next_extension_index(self) -> int:
167
+ """Get the next available extension index."""
168
+ ext_indices = [a.index for a in self.anchors.values() if not a.is_core]
169
+ if not ext_indices:
170
+ return CORE_RANGE_END
171
+ return max(ext_indices) + 1
172
+
173
+ def save(self, path: Path) -> None:
174
+ """Save UCR to JSON file."""
175
+ data = {
176
+ "version": self.version,
177
+ "anchors": [a.to_dict() for a in self.anchors.values()],
178
+ }
179
+ with open(path, "w", encoding="utf-8") as f:
180
+ json.dump(data, f, indent=2)
181
+
182
+ @classmethod
183
+ def load(cls, path: Path) -> "UCR":
184
+ """Load UCR from JSON file."""
185
+ with open(path, "r", encoding="utf-8") as f:
186
+ data = json.load(f)
187
+
188
+ ucr = cls(version=data["version"])
189
+ for anchor_data in data["anchors"]:
190
+ ucr.add_anchor(UCRAnchor.from_dict(anchor_data))
191
+ return ucr
192
+
193
+ def __len__(self) -> int:
194
+ return len(self.anchors)
195
+
196
+
197
+ # ============ Base UCR Factory ============
198
+
199
+ def create_base_ucr() -> UCR:
200
+ """
201
+ Create the base UCR with core anchors for common agent intents.
202
+
203
+ Coordinate scheme (4 dimensions, 8 levels each):
204
+ - ACTION: 0=observe, 1=inform, 2=ask, 3=request, 4=propose, 5=commit, 6=evaluate, 7=meta
205
+ - POLARITY: 0=negative, 1-3=declining, 4=neutral, 5-6=positive, 7=strong_positive
206
+ - DOMAIN: 0=task, 1=plan, 2=observation, 3=evaluation, 4=control, 5=resource, 6=error, 7=general
207
+ - URGENCY: 0=background, 1-2=low, 3-4=normal, 5-6=elevated, 7=critical
208
+ """
209
+ ucr = UCR(version="1.0.0")
210
+
211
+ # Core coordination anchors
212
+ # Format: (index, mnemonic, canonical, (action, polarity, domain, urgency))
213
+
214
+ core_anchors = [
215
+ # === Observations (ACTION=0) ===
216
+ (0x0001, "ObserveState", "Report current system or environment state", (0, 4, 2, 3)),
217
+ (0x0002, "ObserveChange", "Report a detected change", (0, 4, 2, 4)),
218
+ (0x0003, "ObserveError", "Report an observed error condition", (0, 2, 6, 6)),
219
+
220
+ # === Information (ACTION=1) ===
221
+ (0x0010, "InformResult", "Share a computed or derived result", (1, 5, 2, 3)),
222
+ (0x0011, "InformStatus", "Provide status update", (1, 4, 0, 3)),
223
+ (0x0012, "InformComplete", "Report task completion", (1, 6, 0, 4)),
224
+ (0x0013, "InformBlocked", "Report being blocked on something", (1, 2, 0, 5)),
225
+ (0x0014, "InformProgress", "Share progress update", (1, 5, 0, 3)),
226
+
227
+ # === Questions (ACTION=2) ===
228
+ (0x0020, "AskClarify", "Request clarification on requirements", (2, 4, 1, 4)),
229
+ (0x0021, "AskStatus", "Query current status", (2, 4, 0, 3)),
230
+ (0x0022, "AskPermission", "Request permission to proceed", (2, 4, 4, 4)),
231
+ (0x0023, "AskResource", "Query resource availability", (2, 4, 5, 3)),
232
+
233
+ # === Requests (ACTION=3) ===
234
+ (0x0030, "RequestTask", "Request execution of a task", (3, 4, 0, 4)),
235
+ (0x0031, "RequestPlan", "Request creation of a plan", (3, 4, 1, 4)),
236
+ (0x0032, "RequestReview", "Request review of work", (3, 4, 3, 3)),
237
+ (0x0033, "RequestHelp", "Request assistance", (3, 4, 7, 5)),
238
+ (0x0034, "RequestCancel", "Request cancellation", (3, 1, 4, 5)),
239
+ (0x0035, "RequestPriority", "Request priority change", (3, 4, 4, 5)),
240
+ (0x0036, "RequestResource", "Request allocation of resource", (3, 4, 5, 4)),
241
+
242
+ # === Proposals (ACTION=4) ===
243
+ (0x0040, "ProposePlan", "Propose a plan for consideration", (4, 5, 1, 4)),
244
+ (0x0041, "ProposeChange", "Propose a modification", (4, 5, 0, 4)),
245
+ (0x0042, "ProposeAlternative", "Propose an alternative approach", (4, 5, 1, 4)),
246
+ (0x0043, "ProposeRollback", "Propose reverting changes", (4, 3, 4, 5)),
247
+
248
+ # === Commitments (ACTION=5) ===
249
+ (0x0050, "CommitTask", "Commit to performing a task", (5, 6, 0, 4)),
250
+ (0x0051, "CommitDeadline", "Commit to a deadline", (5, 6, 0, 4)),
251
+ (0x0052, "CommitResource", "Commit resources", (5, 6, 5, 4)),
252
+
253
+ # === Evaluations (ACTION=6) ===
254
+ (0x0060, "EvalApprove", "Evaluation: approved/positive", (6, 7, 3, 4)),
255
+ (0x0061, "EvalReject", "Evaluation: rejected/negative", (6, 0, 3, 4)),
256
+ (0x0062, "EvalNeedsWork", "Evaluation: needs revision", (6, 3, 3, 4)),
257
+ (0x0063, "EvalComplete", "Evaluation: work is complete", (6, 6, 3, 4)),
258
+ (0x0064, "EvalBlocked", "Evaluation: blocked by issue", (6, 2, 3, 5)),
259
+
260
+ # === Meta/Control (ACTION=7) ===
261
+ (0x0070, "MetaAck", "Acknowledge receipt", (7, 5, 4, 2)),
262
+ (0x0071, "MetaSync", "Synchronization ping", (7, 4, 4, 3)),
263
+ (0x0072, "MetaHandoff", "Hand off responsibility", (7, 4, 4, 4)),
264
+ (0x0073, "MetaEscalate", "Escalate to higher authority", (7, 3, 4, 6)),
265
+ (0x0074, "MetaAbort", "Abort current operation", (7, 0, 4, 7)),
266
+
267
+ # === Accept/Reject responses ===
268
+ (0x0080, "Accept", "Accept a proposal or request", (5, 7, 7, 3)),
269
+ (0x0081, "Reject", "Reject a proposal or request", (5, 0, 7, 3)),
270
+ (0x0082, "AcceptWithCondition", "Conditional acceptance", (5, 5, 7, 4)),
271
+ (0x0083, "Defer", "Defer decision", (5, 4, 7, 2)),
272
+
273
+ # === Error handling ===
274
+ (0x0090, "ErrorGeneric", "Generic error occurred", (1, 1, 6, 5)),
275
+ (0x0091, "ErrorTimeout", "Operation timed out", (1, 1, 6, 5)),
276
+ (0x0092, "ErrorResource", "Resource unavailable", (1, 1, 6, 5)),
277
+ (0x0093, "ErrorPermission", "Permission denied", (1, 0, 6, 5)),
278
+ (0x0094, "ErrorValidation", "Validation failed", (1, 1, 6, 4)),
279
+
280
+ # === Fallback ===
281
+ (0x00FF, "Fallback", "Unquantizable - see payload for natural language", (7, 4, 7, 4)),
282
+ ]
283
+
284
+ for index, mnemonic, canonical, coords in core_anchors:
285
+ ucr.add_anchor(UCRAnchor(
286
+ index=index,
287
+ mnemonic=mnemonic,
288
+ canonical=canonical,
289
+ coords=coords,
290
+ is_core=True,
291
+ ))
292
+
293
+ return ucr
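# Editor's usage sketch (assumes this file is importable as a flat `ucr` module,
# matching the fallback import in quantizer.py):
#
#     from ucr import create_base_ucr
#     base = create_base_ucr()
#     anchor = base.get_by_mnemonic("RequestReview")
#     print(hex(anchor.index), anchor.coords)             # 0x32 (3, 4, 3, 3)
#     print(len(base), len(base.extension_anchors()))     # total anchors, 0 extensions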
294
+
295
+
296
+ # Module-level default UCR instance
297
+ _default_ucr: Optional[UCR] = None
298
+
299
+
300
+ def get_default_ucr() -> UCR:
301
+ """Get or create the default UCR instance."""
302
+ global _default_ucr
303
+ if _default_ucr is None:
304
+ _default_ucr = create_base_ucr()
305
+ return _default_ucr
306
+
307
+
308
+ def set_default_ucr(ucr: UCR) -> None:
309
+ """Set the default UCR instance."""
310
+ global _default_ucr
311
+ _default_ucr = ucr