anriltine committed 12 days ago

Commit 5406f45 (verified) · 1 Parent(s): 32907e2

Deploy TinyModel1Space from GitHub Actions

Files changed (4)

README.md +12 -1
scripts/google_cse_client.py +88 -0
scripts/nl_controls.py +1281 -0
scripts/universal_brain_chat.py +112 -8

README.md CHANGED

@@ -11,7 +11,17 @@ pinned: false
 # TinyModel1Space
-**Universal Brain chat** — natural-language routing, generative replies, FAQ RAG, SQLite memory, and TinyModel classification. Uses encoder **`HyperlinksSpace/TinyModel1`** (see also [model card](https://huggingface.co/HyperlinksSpace/TinyModel1)).
+**Universal Brain chat** — natural-language routing, a **small generative instruct model** (default in code: **SmolLM2-360M-Instruct**, override with **`HORIZON2_MODEL`**), **TinyModel1** encoder (**HyperlinksSpace/TinyModel1**), **FAQ hybrid RAG**, **SQLite memory**, optional **Google web search**, and **embedded prompt signals** (one-turn reply shape from long natural wording — see in-app **Testing embedded prompt signals**). See the [model card](https://huggingface.co/HyperlinksSpace/TinyModel1) for classifier weights.
+| Capability | Notes |
+| --- | --- |
+| **Generation** | Chat + composed answers from the causal instruct LM (quality/size depends on **`HORIZON2_MODEL`** and CPU/GPU). |
+| **Encoder** | 4-way **AG News–style** labels; **embeddings** for FAQ retrieval, **`/similarity`**, **`/nearest`**. |
+| **FAQ RAG** | Bundled corpus; toggle injection from chat; grounding phrases in `/help`. |
+| **Memory** | Scoped SQLite notes; use *Start a new private session* / *Switch to scope …* on shared demos. |
+| **Routing** | JSON router for summarize / reformulate / grounded / retrieve / **web_search** / classify / similarity / embedding / nearest / memory / status. |
+| **Web** | Requires **`GOOGLE_CSE_API_KEY`** + **`GOOGLE_CSE_CX`**; optional **auto-web** (disable with **`NO_AUTO_WEB`**). |
+| **NL + embedded** | Short **session control** phrases + long-message **`prompt_signals:`** style overlays (see Gradio copy under the chat). |
 - **App URL:** [https://hyperlinksspace-tinymodel1space.hf.space](https://hyperlinksspace-tinymodel1space.hf.space)
 - **Source:** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
@@ -27,6 +37,7 @@ pinned: false
 - `GOOGLE_CSE_CX` — Programmable Search Engine id (`cx` from the [PSE control panel](https://programmablesearchengine.google.com/controlpanel/all)); required together with `GOOGLE_CSE_API_KEY` for web search.
 - `GOOGLE_CSE_NUM` — results per request, 1–10 (default 5).
 - `GOOGLE_CSE_SAFE` — optional safe-search level for Google `cse.list` (e.g. `off`, `active`).
+- `NO_AUTO_WEB` — set to `1` / `true` / `on` to disable **automatic** chat→web upgrades (router + `/web` unchanged).
 ### Note
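
The `NO_AUTO_WEB` and `GOOGLE_CSE_NUM` settings described in the README could be read along the following lines. This is a minimal sketch, not the Space's actual code: the helper names `env_flag` and `cse_num_results` are hypothetical, and both the case-insensitive matching of `1` / `true` / `on` and the clamping of out-of-range values to 1 to 10 are assumptions drawn from the README wording.

```python
import os

# Assumption: only the values the README lists (1 / true / on) count as "set".
_TRUTHY = {"1", "true", "on"}


def env_flag(name: str) -> bool:
    """Return True when the environment variable holds one of the listed truthy values."""
    return os.environ.get(name, "").strip().lower() in _TRUTHY


def cse_num_results(default: int = 5) -> int:
    """Read GOOGLE_CSE_NUM, fall back to the default, and clamp the result to 1..10."""
    raw = os.environ.get("GOOGLE_CSE_NUM", "").strip()
    try:
        n = int(raw) if raw else default
    except ValueError:
        # Non-numeric values fall back to the documented default (assumption).
        n = default
    return max(1, min(10, n))
```

A caller would check `env_flag("NO_AUTO_WEB")` before attempting an automatic chat-to-web upgrade; per the README, the router and the `/web` command are unaffected by the flag.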

scripts/google_cse_client.py CHANGED

@@ -11,6 +11,7 @@ from __future__ import annotations
 import json
 import os
+import re
 import urllib.error
 import urllib.parse
 import urllib.request
@@ -114,3 +115,90 @@ def format_cse_hits_markdown(hits: list[CSEHit], *, for_chat: bool) -> str:
             f"- **Snippet:** {h.snippet}\n"
         )
     return "\n".join(lines).strip()
+def heuristic_suggests_web_search(msg: str) -> bool:
+    """True if ``msg`` likely needs live web results (used when the router returns ``chat``).
+    Conservative: skips code-like text, slash commands, short lines, and in-app / FAQ phrasing.
+    """
+    m = (msg or "").strip().lower()
+    if len(m) < 12:
+        return False
+    if m.startswith("/"):
+        return False
+    if "```" in m or m.startswith("def "):
+        return False
+    if any(
+        x in m
+        for x in (
+            "/retrieve",
+            "faq excerpt",
+            "this space",
+            "this app",
+            "your refund",
+            "your policy",
+            "your shipping",
+            "your terms",
+            "according to your faq",
+            "in your documentation",
+        )
+    ):
+        return False
+    phrases = (
+        "latest ",
+        "breaking ",
+        "breaking news",
+        " as of ",
+        "right now",
+        "today ",
+        "today's",
+        "tonight",
+        "yesterday",
+        "this week",
+        "this month",
+        "current president",
+        "current ceo",
+        "current prime minister",
+        "who won ",
+        "who won the",
+        "election results",
+        "stock price",
+        "share price",
+        "market cap",
+        "exchange rate",
+        "crypto price",
+        "weather in",
+        "forecast for",
+        "when is the next",
+        "still alive",
+        "world cup",
+        "olympics",
+        "super bowl",
+        "score of",
+        "official announcement",
+        "press release",
+        "release date",
+        "when did ",
+        "when does ",
+        "google ",
+        "search online",
+        "look up online",
+        "on wikipedia",
+        "according to the news",
+        "news about",
+        "headlines",
+        "rumor is",
+        "rumour is",
+        "is it true that",
+        "fact check",
+        "verify online",
+    )
+    if any(p in m for p in phrases):
+        return True
+    if re.search(r"\b20(2[4-9]|[3-9][0-9])\b", m) and re.search(
+        r"\b(who|what|when|where|why|how|did|does|do|is|are|was|were|will|has|have)\b", m
+    ):
+        return True
+    return False
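
The final fallback in `heuristic_suggests_web_search` fires only when a message names a year from 2024 to 2099 and also contains a question or auxiliary word. The sketch below isolates that rule using the same two patterns; the function name `year_question_rule` is hypothetical and the sample messages in the note are illustrative.

```python
import re

# Same two patterns as the fallback rule in the diff.
_RECENT_YEAR = re.compile(r"\b20(2[4-9]|[3-9][0-9])\b")
_QUESTION_WORD = re.compile(
    r"\b(who|what|when|where|why|how|did|does|do|is|are|was|were|will|has|have)\b"
)


def year_question_rule(msg: str) -> bool:
    """True when the lowercased message has both a 2024-2099 year and a question word."""
    m = (msg or "").strip().lower()
    return bool(_RECENT_YEAR.search(m) and _QUESTION_WORD.search(m))
```

Under this rule, "Who hosted the 2026 games" qualifies, while "What happened in 2019" (year out of range) and "Budget draft for 2025" (no question word) do not.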

scripts/nl_controls.py CHANGED

@@ -650,3 +650,1284 @@ def parse_control_action(message: str) -> ControlAction | None:
     return None
+# Tokens for "write the reply in …" detection (allowlist avoids "in Python" / "in 24 hours").
+_REPLY_LANG_TOKENS: dict[str, str] = {
+    "arabic": "Arabic",
+    "chinese": "Chinese (Simplified)",
+    "czech": "Czech",
+    "danish": "Danish",
+    "dutch": "Dutch",
+    "english": "English",
+    "finnish": "Finnish",
+    "french": "French",
+    "german": "German",
+    "greek": "Greek",
+    "hebrew": "Hebrew",
+    "hindi": "Hindi",
+    "hungarian": "Hungarian",
+    "indonesian": "Indonesian",
+    "italian": "Italian",
+    "japanese": "Japanese",
+    "korean": "Korean",
+    "norwegian": "Norwegian",
+    "polish": "Polish",
+    "portuguese": "Portuguese",
+    "romanian": "Romanian",
+    "russian": "Russian",
+    "spanish": "Spanish",
+    "swedish": "Swedish",
+    "thai": "Thai",
+    "turkish": "Turkish",
+    "ukrainian": "Ukrainian",
+    "vietnamese": "Vietnamese",
+}
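
The function that consumes `_REPLY_LANG_TOKENS` is not visible in this part of the diff, so the following is only a sketch of how an allowlist lookup can reject phrases such as "in Python" or "in 24 hours". The name `detect_reply_language` and the `in <word>` trigger pattern are assumptions, and the table is a three-entry subset.

```python
import re
from typing import Optional

# Small subset of the allowlist from the diff; the real table has more entries.
REPLY_LANG_TOKENS = {"french": "French", "german": "German", "spanish": "Spanish"}

# Hypothetical trigger: the word "in" followed by a single alphabetic token.
_IN_WORD = re.compile(r"\bin\s+([a-z]+)\b")


def detect_reply_language(message: str) -> Optional[str]:
    """Return the display name of the first allowlisted language after 'in', else None."""
    m = message.strip().lower()
    for mo in _IN_WORD.finditer(m):
        label = REPLY_LANG_TOKENS.get(mo.group(1))
        if label is not None:
            return label
    return None
```

Because the lookup only accepts known language tokens, "in python" is skipped, and "in 24 hours" never matches the alphabetic capture group in the first place.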
+_LEN_CAP_UNIT_MAX: dict[str, int] = {
+    "word": 2500,
+    "words": 2500,
+    "sentence": 80,
+    "sentences": 80,
+    "paragraph": 30,
+    "paragraphs": 30,
+    "line": 120,
+    "lines": 120,
+}
+def _length_cap_instruction(m: str) -> tuple[str, str] | None:
+    """If the user asked for a max length, return (system instruction, compact trace token)."""
+    if len(m) < 24:
+        return None
+    patterns = [
+        re.compile(
+            r"\b(?:in under|at most|no more than|under|within|no longer than)\s+(\d{1,4})\s+"
+            r"(words?|sentences?|paragraphs?|lines?)\b"
+        ),
+        re.compile(r"\b(?:max|maximum)\s+(\d{1,4})\s+(words?|sentences?|paragraphs?|lines?)\b"),
+        re.compile(r"\b(\d{1,4})\s+(words?)\s+(?:max|maximum|only|at most|or less)\b"),
+    ]
+    for rx in patterns:
+        mo = rx.search(m)
+        if not mo:
+            continue
+        n_raw, unit = mo.group(1), mo.group(2).lower()
+        try:
+            n = int(n_raw)
+        except ValueError:
+            continue
+        cap = _LEN_CAP_UNIT_MAX.get(unit)
+        if cap is None or n < 1 or n > cap:
+            continue
+        if unit.startswith("word"):
+            plural, short = "words", "w"
+        elif unit.startswith("sentence"):
+            plural, short = "sentences", "s"
+        elif unit.startswith("paragraph"):
+            plural, short = "paragraphs", "p"
+        else:
+            plural, short = "lines", "ln"
+        trace_tok = f"len_cap={n}{short}"
+        instr = (
+            f"The user requested a **tight length cap** of about **{n} {plural}** for the full assistant answer "
+            f"(including lists or headings). Stay at or under this cap; if it is impossible, say so in one short sentence "
+            "then give the closest fit."
+        )
+        return instr, trace_tok
+    return None
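
To make the parse-then-validate flow of `_length_cap_instruction` concrete, here is a reduced sketch that returns only the trace token. It uses just the first pattern from the diff, omits the minimum-length guard, and folds singular units into their plural form; the name `length_cap_token` and the `UNIT_MAX` / `SHORT` tables are illustrative assumptions.

```python
import re
from typing import Optional

# Per-unit ceilings taken from _LEN_CAP_UNIT_MAX (singular keys folded into plural).
UNIT_MAX = {"words": 2500, "sentences": 80, "paragraphs": 30, "lines": 120}
SHORT = {"words": "w", "sentences": "s", "paragraphs": "p", "lines": "ln"}

# First pattern from the diff only; the real function tries several in order.
_CAP_RX = re.compile(
    r"\b(?:in under|at most|no more than|under|within|no longer than)\s+(\d{1,4})\s+"
    r"(words?|sentences?|paragraphs?|lines?)\b"
)


def length_cap_token(message: str) -> Optional[str]:
    """Return a trace token such as 'len_cap=200w', or None when no valid cap is found."""
    mo = _CAP_RX.search(message.lower())
    if not mo:
        return None
    n = int(mo.group(1))
    unit = mo.group(2)
    plural = unit if unit.endswith("s") else unit + "s"
    # Reject zero and anything above the per-unit ceiling.
    if not 1 <= n <= UNIT_MAX[plural]:
        return None
    return f"len_cap={n}{SHORT[plural]}"
```

Every pattern fed through this flow needs two capture groups (number and unit), since the caller reads both `group(1)` and `group(2)`.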
+def _code_only_instruction(m: str) -> str | None:
+    """Detect requests for code-heavy output with almost no prose."""
+    if len(m) < 18:
+        return None
+    if re.search(
+        r"\b(just the code|code only|only code|no prose,?\s*just code|no explanation,?\s*just (?:the )?code|"
+        r"skip (?:the )?explanation|omit (?:the )?explanation|(?:give|send|return)\s+me\s+only\s+the\s+code|"
+        r"output\s+only\s+(?:the\s+)?code)\b",
+        m,
+    ):
+        return (
+            "The user asked for **code-first output**: put the working solution in **one fenced markdown code block** "
+            "when the answer is code; keep any non-code text to **at most one short sentence** or omit it if the code "
+            "is self-explanatory."
+        )
+    return None
+def _guided_discovery_instruction(m: str) -> tuple[str, str] | None:
+    """User wants hints, nudges, or questions instead of a fully worked answer on the first reply."""
+    if len(m) < 36:
+        return None
+    if not re.search(
+        r"\b(don'?t (give|spell|hand) (me )?(the )?full (answer|solution)|don'?t spoil the solution|"
+        r"hints? only|only hints|guide me with (hints|questions)|nudge me (in the right direction|toward)|"
+        r"i want to (figure|work) it out myself|socratic(\s+method)?|"
+        r"lead me to (the )?answer|questions first instead of answering|"
+        r"without (giving|spelling) (out )?(the )?(whole )?solution)\b",
+        m,
+    ):
+        return None
+    # Require a problem-seeking cue so casual chat ("no spoilers for the movie") does not flip modes.
+    if not re.search(
+        r"\b(why|how|explain|prove|derive|solve|puzzle|homework|problem|exercise|bug|code|implement|"
+        r"design|compare|understand|learn|teach|practice|algorithm|proof|debug|refactor)\b",
+        m,
+    ):
+        return None
+    instr = (
+        "The user asked for **guided discovery** (Socratic / hint-first): prefer short **questions**, "
+        "**nudges**, and **partial hints** over a complete solution in this turn. "
+        "If one concrete step is essential, show **at most one** move, then check whether they want to continue. "
+        "Offer the full worked answer if they say they are stuck or ask you to finish."
+    )
+    return instr, "guided"
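
`_guided_discovery_instruction` is a two-stage gate: a hint-style phrase must match, and a separate problem-seeking cue must also be present. The sketch below shows the same AND structure with heavily trimmed patterns; `wants_guided_discovery` is a hypothetical name, the pattern subsets are assumptions, and the minimum-length guard is left out.

```python
import re

# Reduced versions of the two patterns in the diff (trimmed for brevity).
_HINT_INTENT = re.compile(r"\b(hints? only|only hints|socratic|don'?t spoil the solution)\b")
_PROBLEM_CUE = re.compile(r"\b(why|how|solve|puzzle|homework|problem|bug|code|proof|debug)\b")


def wants_guided_discovery(message: str) -> bool:
    """True only when a hint-style request AND a problem-seeking cue both appear."""
    m = message.strip().lower()
    return bool(_HINT_INTENT.search(m) and _PROBLEM_CUE.search(m))
```

The second stage is what keeps casual phrasing such as "hints only about the movie ending" from switching the reply into hint-first mode.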
+def _ephemeral_privacy_instruction(m: str) -> tuple[str, str] | None:
+    """User asked not to treat this turn as content to persist (memory / logging)."""
+    if len(m) < 22:
+        return None
+    if re.search(
+        r"\b(off the record|no memory for this|nothing persisted|ephemeral question|ephemeral chat|"
+        r"don'?t log this|don'?t persist this|"
+        r"don'?t (?:remember|store) (?:this|that|it|anything)|"
+        r"do not (?:remember|store) (?:this|that|it)|"
+        r"please don'?t (?:remember|store) (?:this|that|it)|"
+        r"forget this after|don'?t save (?:this|that)\s+to\s+memory)\b",
+        m,
+    ):
+        instr = (
+            "The user signaled **ephemeral intent** for this reply: do **not** invite `/remember`, `/session`, or "
+            "long-term note-taking for this content; avoid urging them to store secrets, API keys, or passwords. "
+            "Still answer helpfully within normal safety and deployment limits."
+        )
+        return instr, "ephemeral"
+    return None
+def _accessibility_sr_instruction(m: str) -> tuple[str, str] | None:
+    """User wants screen-reader / WCAG-minded answer structure (linear, semantic headings)."""
+    if len(m) < 44:
+        return None
+    if not re.search(
+        r"\b(screen[- ]?reader|screenreader|nvda|jaws|voiceover|talkback|orca|"
+        r"wcag(?:\s+[0-9]{1,2}(?:\.[0-9])?)?|\ba11y\b|accessibility|accessible to|"
+        r"blind users?|low vision|visually impaired)\b",
+        m,
+    ):
+        return None
+    audience = re.search(
+        r"\bfor\s+(?:blind|low-vision|screen[- ]?reader|a11y)\s+(?:users?|readers?|audiences?|visitors?)?\b",
+        m,
+    )
+    format_rq = re.search(
+        r"\b(friendly|friendlier|structure|structured|layout|linear|heading|headings|semantic|"
+        r"readable|reformat|format this|annotate|describe (?:the\s+)?(?:chart|diagram|figure|image)|"
+        r"please (?:reply|answer|write|help|summarize|reformat|structure)|"
+        r"how (?:should|can) i (?:write|format|publish))\b",
+        m,
+    )
+    if not audience and not format_rq:
+        return None
+    instr = (
+        "The user asked for **screen-reader–friendly / accessibility-aware** formatting: prefer a **clear linear reading order**; "
+        "use real markdown heading lines for sections when the answer is long; do **not** rely on a table as the **only** "
+        "place critical facts appear—repeat key facts in prose if you use a table; briefly describe any chart or diagram "
+        "in words; keep emoji sparse and never the sole carrier of meaning."
+    )
+    return instr, "a11y"
+def _embedded_simple_audience(m: str) -> bool:
+    """True if a longer prompt asks for child-level / lay explanations (ELI5-style) in prose."""
+    if len(m) < 40:
+        return False
+    if re.search(
+        r"\b(expert mode|technical audience|assume i'?m technical|phd level|for experts|deep technical)\b",
+        m,
+    ):
+        return False
+    if not re.search(
+        r"\b(eli5|explain like i'?m(?:\s+a)? five|like i'?m(?:\s+a)? five\b|"
+        r"for (?:my )?kids to understand|total beginner|i'?m\s+a\s+beginner\b|beginner\s+here\b|"
+        r"non-technical (?:parent|reader|manager|audience)|"
+        r"lay audience|no technical background|zero prior knowledge)\b",
+        m,
+    ):
+        return False
+    return bool(
+        re.search(
+            r"\b(why|how|what|when|where|explain|describe|tell me|help me (?:to )?understand|walk me through|"
+            r"learn about|new to)\b",
+            m,
+        )
+    )
+def _embedded_technical_audience(m: str) -> bool:
+    """True if a longer prompt asks for expert-depth explanations (not short *Expert mode* controls)."""
+    if len(m) < 40:
+        return False
+    if re.search(
+        r"\b(eli5|explain like i'?m(?:\s+a)? five|total beginner|i'?m\s+a\s+beginner\b|beginner\s+here\b|"
+        r"lay audience|no technical background|zero prior knowledge|explain simply)\b",
+        m,
+    ):
+        return False
+    if not re.search(
+        r"\b(expert mode|technical audience|assume i'?m technical|phd level|for experts|deep technical|"
+        r"staff engineer audience|senior (?:sre|eng|engineer) audience|"
+        r"use jargon freely|skip the basics|don'?t dumb (?:it )?down|"
+        r"peer[- ]level technical|internals[- ]focused|implementation[- ]heavy)\b",
+        m,
+    ):
+        return False
+    return bool(
+        re.search(
+            r"\b(why|how|what|when|where|explain|describe|tell me|walk me through|"
+            r"compare|design|architect|debug|troubleshoot|analyze|review|implement)\b",
+            m,
+        )
+    )
+def _embedded_register_tone(m: str) -> str | None:
+    """One-shot formal vs casual register when prose names an audience (not the short *Formal tone* control)."""
+    if len(m) < 48:
+        return None
+    formal = re.search(
+        r"\b(board-ready|for regulators|regulatory filing|formal memo|audit[- ]friendly|"
+        r"client-facing|for (?:the\s+)?board(?:\s+of\s+directors)?\b|for leadership review|"
+        r"executive summary for|c[- ]suite|for executives|board presentation|investor[- ]ready|"
+        r"sec filing tone)\b",
+        m,
+    )
+    casual = re.search(
+        r"\b(slack message|teams message to the team|keep it casual|casual tone|friendly teammate|"
+        r"like you(?:'re|\s+are)\s+my coworker|water cooler|informal note|keep it light|"
+        r"pub chat|chatty tone)\b",
+        m,
+    )
+    if formal and not casual:
+        return "formal"
+    if casual and not formal:
+        return "casual"
+    return None
+def _embedded_json_output(m: str) -> bool:
+    """True if a longer prompt asks for JSON-shaped output (not the short *Answer in JSON* control line)."""
+    if len(m) < 40:
+        return False
+    if re.search(
+        r"\b(no json|not json|avoid json|skip json|plain text only|no structured output|"
+        r"don'?t use json|without json)\b",
+        m,
+    ):
+        return False
+    return bool(
+        re.search(
+            r"\b(valid json|return json|reply in json|answer in json|json output|structured json|"
+            r"json object|json array|as json\b|as a json|machine[- ]readable json|emit json|"
+            r"serialize (?:to|as) json|output as json|respond with json)\b",
+            m,
+        )
+    )
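
`_embedded_json_output` checks a veto pattern before the positive pattern, so an explicit "no JSON" wins even when the message also mentions JSON output. A minimal sketch of that ordering follows; `wants_json` is a hypothetical name, the pattern subsets are assumptions, and the length guard is omitted.

```python
import re

# Trimmed subsets of the veto and request patterns from the diff.
_JSON_VETO = re.compile(r"\b(no json|not json|avoid json|without json|plain text only)\b")
_JSON_ASK = re.compile(
    r"\b(valid json|return json|reply in json|json object|as json|respond with json)\b"
)


def wants_json(message: str) -> bool:
    """Veto first: any opt-out phrase overrides a JSON request in the same message."""
    m = message.strip().lower()
    if _JSON_VETO.search(m):
        return False
    return bool(_JSON_ASK.search(m))
```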
+def _embedded_speculation(m: str) -> str | None:
+    """``strict`` vs ``creative`` speculation level (not short *No speculation* / *Brainstorm freely* controls)."""
+    if len(m) < 44:
+        return None
+    creative = bool(
+        re.search(
+            r"\b(brainstorm freely|speculate freely|wild ideas(?:\s+ok)?|creative speculation|"
+            r"go ahead and guess|reasonable guesses welcome|speculate a bit|"
+            r"blue[- ]sky (?:thinking|ideas)|throw out (?:some )?possibilities|"
+            r"explore hypotheticals|what[- ]if scenarios (?:are )?welcome|"
+            r"ideation (?:mode|session)|open[- ]ended brainstorming)\b",
+            m,
+        )
+    )
+    strict = bool(
+        re.search(
+            r"\b(don'?t guess|no guessing|avoid guessing|only high confidence|stick to (?:the\s+)?facts|"
+            r"avoid halluc|no hallucinations|don'?t hallucinate|if you don'?t know say|"
+            r"if unsure say|say when you(?:'re|\s+are)\s+unsure|no speculation|avoid speculation|"
+            r"don'?t speculate|fact[- ]checked|grounded only|evidence[- ]based only|"
+            r"only if (?:you(?:'re|\s+are)\s+)?(?:certain|sure)|do not invent (?:facts|numbers))\b",
+            m,
+        )
+    )
+    if creative and strict:
+        return None
+    if strict:
+        return "strict"
+    if creative:
+        return "creative"
+    return None
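
`_embedded_speculation` and the detectors that follow it share one shape: test two opposing patterns, return nothing when both fire, otherwise return the label of the single side that matched. The helper below is a sketch of that shared shape, not code from the commit; `resolve_opposing` and the reduced `STRICT` / `CREATIVE` patterns are illustrative assumptions.

```python
import re
from typing import Optional


def resolve_opposing(
    m: str,
    first: re.Pattern,
    first_label: str,
    second: re.Pattern,
    second_label: str,
) -> Optional[str]:
    """Return the label of the single matching side; None if neither or both match."""
    hit_first = bool(first.search(m))
    hit_second = bool(second.search(m))
    if hit_first and hit_second:
        return None  # contradictory request: leave the default behaviour untouched
    if hit_first:
        return first_label
    if hit_second:
        return second_label
    return None


# Reduced patterns for illustration (not the full lists from the diff).
STRICT = re.compile(r"\b(don'?t guess|no speculation|stick to (?:the\s+)?facts)\b")
CREATIVE = re.compile(r"\b(brainstorm freely|speculate a bit|wild ideas)\b")
```

Returning `None` on a conflict means a message that asks for both behaviours changes nothing, which is the conservative outcome for a one-turn overlay.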
+def _embedded_answer_lead(m: str) -> str | None:
+    """``tldr_first`` vs ``direct`` answer opening (not short *TLDR first* / *Answer directly* controls)."""
+    if len(m) < 44:
+        return None
+    direct = bool(
+        re.search(
+            r"\b(no tldr|skip (?:the )?summary|answer directly|without a (?:summary|tldr)|"
+            r"no executive summary|don'?t (?:add|give) a tldr|direct answer only|"
+            r"jump straight to the answer|no summary (?:upfront|at the top)|"
+            r"get straight to the (?:answer|point)|omit (?:the )?(?:opening )?summary)\b",
+            m,
+        )
+    )
+    tldr = bool(
+        re.search(
+            r"\b(tl;?dr first|tldr first|lead with (?:a\s+)?(?:one[- ]line\s+)?summary|summary first|"
+            r"executive summary first|bottom line up front|bluf|"
+            r"start with (?:a\s+)?(?:short\s+)?summary|headline first|"
+            r"give me the (?:key\s+)?takeaway first)\b",
+            m,
+        )
+    )
+    if direct and tldr:
+        return None
+    if direct:
+        return "direct"
+    if tldr:
+        return "tldr_first"
+    return None
+def _embedded_actionability(m: str) -> str | None:
+    """``commands`` vs ``conceptual`` actionability (not short *Make it actionable* / *Conceptual only* controls)."""
+    if len(m) < 44:
+        return None
+    conceptual = bool(
+        re.search(
+            r"\b(conceptual only|high level only|no commands|without commands|no shell commands|"
+            r"avoid command dumps|don'?t include (?:bash|shell|terminal) commands|"
+            r"focus on concepts(?:\s+and\s+rationale)?|strategic discussion only|"
+            r"architecture overview only|no runnable snippets|theory and tradeoffs only)\b",
+            m,
+        )
+    )
+    commands = bool(
+        re.search(
+            r"\b(include (?:a\s+)?(?:bash|sh|zsh|powershell) snippet|run(?:nable)? commands?|"
+            r"copy[- ]paste (?:into )?(?:the\s+)?(?:terminal|shell)|"
+            r"curl (?:one[- ]?liner|example)|one[- ]liner (?:for|to)|"
+            r"bash one[- ]liner|powershell command|terminal commands?|"
+            r"give me (?:the\s+)?(?:exact\s+)?commands?|include kubectl|"
+            r"docker (?:run|compose) (?:example|snippet)|(?:pip|npm|pnpm|yarn) install (?:line|command)|"
+            r"ready[- ]to[- ]run (?:script|snippet)|paste(?:able)? commands?)\b",
+            m,
+        )
+    )
+    if conceptual and commands:
+        return None
+    if conceptual:
+        return "conceptual"
+    if commands:
+        return "commands"
+    return None
+def _embedded_confidence_tone(m: str) -> str | None:
+    """``transparent`` vs ``assertive`` confidence (not short *Flag assumptions* / *Be decisive* controls)."""
+    if len(m) < 44:
+        return None
+    skip_transparent = bool(
+        re.search(
+            r"\b(no assumptions? section|skip (?:the\s+)?assumptions?|don'?t list assumptions|"
+            r"without caveats|no caveats|omit limitations)\b",
+            m,
+        )
+    )
+    transparent = (
+        not skip_transparent
+        and (
+            bool(
+                re.search(
+                    r"\b(state|list|spell out|call out|identify|enumerate|label)\s+"
+                    r"(?:your\s+|the\s+|our\s+|key\s+|main\s+)?(?:key\s+|main\s+)?assumptions?\b",
+                    m,
+                )
+            )
+            or bool(
+                re.search(
+                    r"\b(assumptions?\s+and\s+limitations?|limitations?\s+and\s+caveats?|"
+                    r"limitations?\s+section|caveats?\s+(?:first|upfront|at\s+the\s+top)|"
+                    r"upfront\s+caveats?|scope\s+and\s+assumptions?|boundary\s+conditions?|"
+                    r"what\s+(?:we\s+)?(?:are\s+)?assuming\b|"
+                    r"explicit(?:ly)?\s+about\s+(?:limitations?|uncertainty|what\s+we\s+don'?t\s+know)|"
+                    r"where\s+this\s+(?:breaks?\s+down|stops?\s+working|doesn'?t\s+apply))\b",
+                    m,
+                )
+            )
+            or bool(
+                re.search(
+                    r"\b(flag|surface|highlight)\s+(?:key\s+)?(?:uncertainties|unknowns|gaps|risk\s+factors)\b",
+                    m,
+                )
+            )
+        )
+    )
+    assertive = bool(
+        re.search(
+            r"\b(be decisive|don'?t hedge|give firm answers?|minimal hedging|"
+            r"sound\s+confident|avoid disclaimers|confident (?:recommendation|tone)|"
+            r"take a clear stance|no throat[- ]clearing|decisive recommendation)\b",
+            m,
+        )
+    )
+    if transparent and assertive:
+        return None
+    if transparent:
+        return "transparent"
+    if assertive:
+        return "assertive"
+    return None
+def _embedded_example_density(m: str) -> str | None:
+    """``rich`` or ``sparse`` from prose (not the short *Include examples* / *Skip examples* control lines)."""
+    if len(m) < 44:
+        return None
+    sparse = bool(
+        re.search(
+            r"\b(skip examples?|don'?t add examples?|don'?t include examples?|"
+            r"without examples?|keep (?:it\s+)?abstract|theory[- ]only|abstract only|"
+            r"example[- ]free|no examples? (?:please|in your (?:answer|reply))|"
+            r"avoid illustrative examples?)\b",
+            m,
+        )
+    )
+    rich = bool(
+        re.search(
+            r"\b(include (?:at\s+least\s+)?(?:one|two|a few)\s+concrete examples?|"
+            r"at least one (?:short\s+)?concrete example|"
+            r"illustrate (?:this|that|it)\s+with (?:a\s+)?(?:concrete\s+|real[- ]world\s+)?example|"
+            r"worked example|walk(?:\s+me)? through (?:a\s+)?(?:small|tiny|toy|minimal)\s+example|"
+            r"\b(?:a\s+)?toy example\b|miniature scenario|"
+            r"ground (?:this|it|your answer) in (?:a\s+)?(?:concrete\s+)?example|"
+            r"give (?:me\s+)?a (?:concrete\s+)?example|"
+            r"show (?:me\s+)?(?:this\s+)?with (?:a\s+)?(?:concrete\s+)?example)\b",
+            m,
+        )
+    )
+    if sparse and rich:
+        return None
+    if sparse:
+        return "sparse"
+    if rich:
+        return "rich"
+    return None
+def _embedded_exposition_order(m: str) -> str | None:
+    """``definitions_first`` vs ``intuition_first`` from prose (not short *Definitions first* controls)."""
+    if len(m) < 48:
+        return None
+    if re.search(
+        r"\b(skip definitions first|don'?t start with definitions|"
+        r"no formal definitions upfront)\b",
+        m,
+    ):
+        return None
+    if re.search(
+        r"\b(skip the intuition|cut the warm[- ]?up|no hand[- ]?wavy intro)\b",
+        m,
+    ):
+        return None
+    defn = bool(
+        re.search(
+            r"\b(define (?:the\s+)?(?:key\s+)?terms? (?:first|before|upfront)|"
+            r"definitions?\s+(?:first|before|upfront)|"
+            r"start with (?:a\s+)?(?:brief\s+)?definition|"
+            r"formal definitions? (?:first|before)|"
+            r"precise definitions? before|"
+            r"terminology (?:first|upfront)|"
+            r"establish definitions before|"
+            r"glossary[- ]style (?:intro|opening)|"
+            r"define jargon before)\b",
+            m,
+        )
+    )
+    intu = bool(
+        re.search(
+            r"\b(intuition (?:first|before (?:the\s+)?(?:math|formal|proof|details?))|"
+            r"big[- ]picture (?:first|before|then)|"
+            r"high[- ]level intuition (?:first|before)|"
+            r"motivation before (?:the\s+)?(?:formal|proof|math)|"
+            r"informal (?:picture|overview) before|"
+            r"start with (?:the\s+)?(?:big\s+picture|intuition|high[- ]level sketch)|"
+            r"warm(?:\s+up)? with (?:an?\s+)?intuitive)\b",
+            m,
+        )
+    )
+    if defn and intu:
+        return None
+    if defn:
+        return "definitions_first"
+    if intu:
+        return "intuition_first"
+    return None
+def _embedded_followup_close(m: str) -> str | None:
+    """``minimal`` vs ``suggest`` from prose (not short *No follow-up questions* controls)."""
+    if len(m) < 48:
+        return None
+    minimal = bool(
+        re.search(
+            r"\b(no questions? at the end|don'?t (?:ask|end) with (?:a\s+)?questions?|"
+            r"don'?t ask if i need (?:anything|more) else|don'?t ask whether i need more|"
+            r"skip (?:the\s+)?(?:stock\s+)?closer|no follow[- ]up questions (?:at\s+the\s+)?(?:end|please)?|"
+            r"don'?t prompt for follow[- ]ups?|finish crisply|stop after the core answer|"
+            r"avoid rhetorical closers?|no offers? to help further|"
+            r"don'?t (?:close|end) with (?:an?\s+)?(?:offer|invitation) to continue)\b",
+            m,
+        )
+    )
+    suggest = bool(
+        re.search(
+            r"\b(suggest next steps|optional next steps at the end|"
+            r"end with (?:brief\s+)?(?:actionable\s+)?next steps|"
+            r"close with suggested next actions|what should we do next|"
+            r"offer ways to go deeper|give me follow[- ]ups? i can take|"
+            r"recommend what to do next|include (?:optional\s+)?next steps)\b",
+            m,
+        )
+    )
+    if minimal and suggest:
+        return None
+    if minimal:
+        return "minimal"
+    if suggest:
+        return "suggest"
+    return None
+def _embedded_clarify_first(m: str) -> str | None:
+    """``on`` vs ``off`` from prose (not short *Clarify first* / *No clarifying questions* controls)."""
+    if len(m) < 48:
+        return None
+    off = bool(
+        re.search(
+            r"\b(no clarifying questions (?:first|please)?|don'?t ask clarifying questions|"
+            r"skip clarifying questions|answer without asking questions first|"
+            r"don'?t (?:pause to\s+)?ask questions first|"
+            r"give (?:your\s+)?best answer without asking|"
+            r"don'?t interrogate me first|skip the q&a preamble|"
+            r"answer immediately (?:even\s+)?if (?:the\s+)?(?:spec|specs) (?:is|are) incomplete)\b",
+            m,
+        )
+    )
+    on = bool(
+        re.search(
+            r"\b(ask clarifying questions before (?:you\s+)?answer|"
+            r"clarify (?:any\s+)?ambiguities before|"
+            r"if anything is unclear ask me first|"
+            r"before you (?:answer|dive in) ask (?:me\s+)?(?:what\s+you\s+need|any questions)|"
+            r"pause and ask (?:me\s+)?(?:short\s+)?questions if|"
+            r"confirm my (?:constraints|requirements) before|"
+            r"ask what you need (?:to know )?first|"
+            r"i may have left details out[-—]\s*ask|"
+            r"feel free to ask (?:me\s+)?(?:1[-–]3\s+)?clarifying questions first)\b",
+            m,
+        )
+    )
+    if on and off:
+        return None
+    if off:
+        return "off"
+    if on:
+        return "on"
+    return None
+def _embedded_section_headings(m: str) -> str | None:
+    """``prefer`` vs ``avoid`` for markdown ##/### structure (not short *Use section headings* controls)."""
+    if len(m) < 48:
+        return None
+    avoid = bool(
+        re.search(
+            r"\b(flat answer|no section headings|avoid markdown headings|"
+            r"no (?:##|hash)\s*style headings|without (?:##|markdown) title lines|"
+            r"continuous prose (?:only|without headings)|"
+            r"don'?t use (?:leading\s+)?#+\s*headings?|"
+            r"skip (?:the\s+)?##\s*headers?)\b",
+            m,
+        )
+    )
+    prefer = bool(
+        re.search(
+            r"(?<!\w)(use (?:markdown\s+)?(?:section\s+)?headings|organize with (?:markdown\s+)?headings|"
+            r"structure (?:the\s+)?answer with (?:clear\s+)?headings|"
+            r"break (?:it|this|the answer) into (?:titled\s+)?sections|"
+            r"(?:clear\s+)?markdown headings for each|"
+            r"##\s*(?:or|/)\s*###\s*headings|"
+            r"top[- ]level headings for each (?:major\s+)?(?:topic|section))\b",
+            m,
+        )
+    )
+    if avoid and prefer:
+        return None
+    if avoid:
+        return "avoid"
+    if prefer:
+        return "prefer"
+    return None
+def _embedded_analogy_use(m: str) -> str | None:
+    """``prefer`` vs ``avoid`` for analogies/metaphors (not short *Use analogies* / *No analogies* controls)."""
+    if len(m) < 48:
+        return None
+    avoid = bool(
+        re.search(
+            r"\b(no analogies|skip metaphors|avoid metaphors|skip the analogies|"
+            r"without analogies or metaphors|literal (?:explanations?|wording) only|"
+            r"don'?t use analogies|don'?t use metaphors|no cute comparisons|"
+            r"stick to literal (?:technical\s+)?(?:language|description|wording)|"
+            r"keep (?:it\s+)?strictly literal)\b",
+            m,
+        )
+    )
+    prefer = bool(
+        re.search(
+            r"\b(use (?:a\s+)?(?:helpful\s+|tight\s+)?analogy|"
+            r"explain (?:it\s+)?with (?:a\s+)?(?:simple\s+)?(?:real[- ]world\s+)?analogy|"
+            r"include (?:a\s+)?(?:brief\s+)?(?:metaphor|analogy)|"
+            r"liken (?:this|it) to (?:something|a\s+familiar)|"
+            r"compare (?:this|it)\s+to (?:a\s+)?(?:real[- ]world|everyday)|"
+            r"map (?:this|it) to an everyday example|"
+            r"metaphor that helps|ground (?:the\s+)?idea in (?:an?\s+)?analogy)\b",
+            m,
+        )
+    )
+    if avoid and prefer:
+        return None
+    if avoid:
+        return "avoid"
+    if prefer:
+        return "prefer"
+    return None
+def _embedded_term_emphasis(m: str) -> str | None:
+    """``highlight`` vs ``minimal`` inline bold (not short *Bold key terms* controls)."""
+    if len(m) < 48:
+        return None
+    minimal = bool(
+        re.search(
+            r"\b(minimal bold|don'?t overuse bold|avoid excessive bold|"
+            r"sparse bold|keep bold (?:to a )?minimum|"
+            r"no bold except (?:for )?code|plain text without bold|"
+            r"don'?t bold every|avoid bolding (?:whole|entire) sentences)\b",
+            m,
+        )
+    )
+    highlight = bool(
+        re.search(
+            r"\b(bold (?:the\s+)?(?:key\s+)?terms|highlight (?:the\s+)?(?:key\s+)?(?:terms|phrases)|"
+            r"emphasize (?:the\s+)?(?:key\s+)?(?:terms|keywords)|"
+            r"make (?:the\s+)?key terms stand out|"
+            r"use bold (?:on|for) (?:a\s+)?(?:few\s+)?(?:key\s+)?(?:terms|phrases|keywords)|"
+            r"so (?:execs|leadership|managers) can scan.{0,50}bold)\b",
+            m,
+        )
+    )
+    if minimal and highlight:
+        return None
+    if minimal:
+        return "minimal"
+    if highlight:
+        return "highlight"
+    return None
+def _embedded_acronym_style(m: str) -> str | None:
+    """``spell_out`` vs ``terse`` acronym handling (not short *Spell out acronyms* controls)."""
+    if len(m) < 48:
+        return None
+    terse = bool(
+        re.search(
+            r"\b(assume (?:i|we) know acronyms|don'?t expand acronyms|"
+            r"keep acronyms as[- ]is|skip acronym expansion|"
+            r"no need to spell out acronyms|acronym[- ]literate (?:audience|readers?)|"
+            r"terse acronyms only)\b",
+            m,
+        )
+    )
+    spell = bool(
+        re.search(
+            r"\b((?<!no need to )spell out acronyms|expand acronyms (?:on|at) first use|"
+            r"define acronyms when you (?:use|introduce)|"
+            r"write out acronyms (?:on|at) first mention|"
+            r"full form (?:once|on first mention).{0,40}(?:acronym|initialism)|"
+            r"expand (?:each\s+)?(?:api|sla|sso|gdpr|hipaa|pci)[- ]style (?:term|acronym)|"
+            r"for (?:auditors|compliance|non-technical).{0,50}spell out)\b",
+            m,
+        )
+    )
+    if spell and terse:
+        return None
+    if terse:
+        return "terse"
+    if spell:
+        return "spell_out"
+    return None
+def _embedded_risk_posture(m: str) -> str | None:
+    """``conservative`` vs ``pragmatic`` recommendation tone (not short *Be risk averse* controls)."""
+    if len(m) < 48:
+        return None
+    pragmatic = bool(
+        re.search(
+            r"\b(optimize for speed|good enough is fine|be pragmatic about|"
+            r"avoid over[- ]engineering|ship (?:it )?fast|move fast (?:and|&)|"
+            r"time[- ]efficient (?:fix|approach|recommendation)|"
+            r"practical trade[- ]offs over perfection|"
+            r"don'?t gold[- ]plate|bias toward shipping)\b",
+            m,
+        )
+    )
+    conservative = bool(
+        re.search(
+            r"\b(err on the side of safety|be risk[- ]averse|"
+            r"risk[- ]averse (?:recommendation|approach)|"
+            r"choose the (?:safest|lower[- ]risk) option|"
+            r"minimize (?:downside|blast radius)|"
+            r"prefer (?:safer|low[- ]risk) (?:options?|paths?)|"
+            r"conservative (?:recommendation|rollout|approach)|"
+            r"safety[- ]first (?:for|on) (?:this|the) (?:rollout|migration|change))\b",
+            m,
+        )
+    )
+    if conservative and pragmatic:
+        return None
+    if conservative:
+        return "conservative"
+    if pragmatic:
+        return "pragmatic"
+    return None
+def _embedded_quote_style(m: str) -> str | None:
+    """``quote`` vs ``paraphrase`` when relying on supplied FAQ excerpts (not short *Quote the FAQ* controls)."""
+    if len(m) < 48:
+        return None
+    src = r"(?:faq|excerpt|policy|knowledge base|kb article|documentation|provided (?:text|docs))"
+    paraphrase = bool(
+        re.search(
+            rf"\b(paraphrase (?:the )?{src}|paraphrase only|"
+            rf"(?:don'?t|do not) quote (?:the )?{src}|no direct quotes? from (?:the )?{src}|"
+            rf"summarize (?:the )?{src} in your own words|"
+            rf"avoid quoting (?:the )?{src}|in your own words.{{0,40}}(?:faq|excerpt))\b",
+            m,
+        )
+    )
+    quote = bool(
+        re.search(
+            rf"\b((?<!not )(?<!don't )(?<!dont )quote (?:the )?{src}|(?<!no )direct quotes? from (?:the )?{src}|"
+            rf"cite with (?:direct )?quotes? when (?:you )?(?:use|reference) (?:the )?{src}|"
+            rf"verbatim (?:quotes?|passages?) from (?:the )?{src}|"
+            rf"include (?:a )?(?:short )?verbatim quote.{{0,50}}(?:faq|excerpt)|"
+            rf"when you rely on (?:the )?{src}.{{0,50}}quote)\b",
+            m,
+        )
+    )
+    if quote and paraphrase:
+        return None
+    if quote:
+        return "quote"
+    if paraphrase:
+        return "paraphrase"
+    return None
+def _embedded_emoji_style(m: str) -> str | None:
+    """``include`` vs ``avoid`` emoji in replies (not short *Use emoji* / *No emoji* controls)."""
+    if len(m) < 48:
+        return None
+    avoid = bool(
+        re.search(
+            r"\b(no emojis? in (?:your|the) reply|avoid emoji|emoji[- ]free (?:reply|tone)|"
+            r"don'?t use emoji|do not use emoji|keep (?:it\s+)?(?:strictly\s+)?professional.{0,40}no emoji|"
+            r"without emoji|skip (?:the\s+)?emoji|no cute emoji|"
+            r"plain text only.{0,30}no emoji)\b",
+            m,
+        )
+    )
+    include = bool(
+        re.search(
+            r"\b((?<!don't )(?<!dont )(?<!not )use (?:a few\s+)?(?:tasteful\s+)?emoji|include emoji|emoji (?:are|is) ok|"
+            r"emoji welcome|feel free to use emoji|sprinkle (?:in\s+)?emoji|"
+            r"a few emoji (?:are|is) fine|light emoji (?:are|is) ok|"
+            r"you may use emoji|add (?:a few\s+)?emoji (?:if|when) (?:helpful|appropriate))\b",
+            m,
+        )
+    )
+    if avoid and include:
+        return None
+    if avoid:
+        return "avoid"
+    if include:
+        return "include"
+    return None
+def _embedded_counterpoint_tone(m: str) -> str | None:
+    """``challenge`` vs ``supportive`` pushback on plans (not short *Challenge my assumptions* controls)."""
+    if len(m) < 52:
+        return None
+    ctx = (
+        r"\b(plan|plans|design|approach|idea|ideas|architecture|proposal|strategy|"
+        r"implementation|rollout|pitch|deck|draft|thesis|launch|release|migration|schema|"
+        r"deployment|code|system)\b"
+    )
+    gentle = bool(
+        re.search(r"\b(don'?t challenge|be gentle|go easy on me|no criticism|don'?t be harsh)\b", m)
+    )
+    challenge = bool(
+        not gentle
+        and re.search(
+            r"\b(red team|red-team|stress[- ]?test|pick apart|tear down|what am i missing|sanity check|"
+            r"challenge my|poke holes|find (?:weaknesses|gaps|flaws)|critique (?:my|this|our)|"
+            r"devil'?s advocate)\b",
+            m,
+        )
+        and re.search(ctx, m)
+    )
+    supportive = bool(
+        re.search(
+            r"\b(be supportive (?:of|about|with)|assume good intent|encourage my (?:idea|plan|proposal)|"
+            r"constructive and supportive|help me build on (?:this|my) (?:idea|plan)|"
+            r"coach me through (?:this|my) (?:idea|plan|pitch)|"
+            r"gentle (?:feedback|pushback) on (?:my|this|our)|"
+            r"avoid harsh criticism|frame improvements as next steps|"
+            r"lean supportive and (?:specific|actionable))\b",
+            m,
+        )
+        and re.search(ctx, m)
+    )
+    if challenge and supportive:
+        return None
+    if challenge:
+        return "challenge"
+    if supportive:
+        return "supportive"
+    return None
+def _embedded_math_detail(m: str) -> str | None:
+    """``show_work`` vs ``final_only`` for math-like answers (not short *Show your work* controls)."""
+    if len(m) < 44:
+        return None
+    show = bool(
+        re.search(
+            r"\b((?<!don't )(?<!dont )(?<!not )show your work|show (?:all )?(?:the )?steps|with (?:a )?derivation|prove (that|it)|rigorously|"
+            r"walk through (?:the )?derivation|show intermediate steps|step[- ]by[- ]step derivation|"
+            r"derive (?:it|the result) (?:step by step|explicitly))\b",
+            m,
+        )
+    )
+    final = bool(
+        re.search(
+            r"\b(final answer only|no derivation|skip (?:the )?steps|just (?:give )?(?:me )?the (?:final )?result|"
+            r"don'?t show your work|do not show your work|answer without (?:showing )?steps|"
+            r"no intermediate steps|closed[- ]form (?:answer|result) only|"
+            r"(?:numerical |numeric )?answer only.{0,30}no steps)\b",
+            m,
+        )
+    )
+    if show and final:
+        return None
+    if show:
+        return "show_work"
+    if final and re.search(
+        r"\b(math|equation|integral|derivative|probability|calculate|calculus|algebra|"
+        r"proof|formula|theorem|matrix|solve|statistics|bayes|variance|expected value)\b",
+        m,
+    ):
+        return "final_only"
+    return None
+def _embedded_faq_grounding(m: str) -> str | None:
+    """``strict`` vs ``relaxed`` FAQ/RAG grounding (not short *Strict FAQ* / *Relaxed FAQ* controls)."""
+    if len(m) < 48:
+        return None
+    src = r"(?:faq|excerpt|policy|knowledge base|kb article|documentation|provided excerpts|retrieved passages)"
+    strict = bool(
+        re.search(
+            rf"\b(stick to (?:the )?{src}|only use (?:the )?{src}|"
+            rf"only trust (?:the )?{src}|faq[- ]only (?:for|on) (?:this|the)|"
+            rf"strict faq (?:grounding|only)|if (?:it(?:'s| is) )?not in (?:the )?{src}.{{0,40}}(?:say|admit)|"
+            rf"don'?t go beyond (?:the )?{src}|must be supported by (?:the )?{src}|"
+            rf"policy claims must come from (?:the )?{src}|"
+            rf"grounded strictly in (?:the )?{src})\b",
+            m,
+        )
+    )
+    relaxed = bool(
+        re.search(
+            rf"\b(faq plus general knowledge|mix (?:the )?{src} with general knowledge|"
+            rf"relaxed faq (?:grounding|mode)|"
+            rf"general knowledge (?:is )?ok.{{0,50}}(?:faq|excerpt|policy|documentation)|"
+            rf"(?:faq|excerpt|policy|documentation).{{0,50}}general knowledge (?:is )?ok|"
+            rf"supplement (?:the )?{src} with (?:brief )?general[- ]knowledge|"
+            rf"beyond (?:the )?{src} you may add (?:brief )?general context)\b",
+            m,
+        )
+    )
+    if strict and relaxed:
+        return None
+    if strict:
+        return "strict"
+    if relaxed:
+        return "relaxed"
+    return None
+def _embedded_code_block_style(m: str) -> str | None:
+    """``fenced`` vs ``inline`` code layout (not short *Use code fences* / *Inline code only* controls)."""
+    if len(m) < 48:
+        return None
+    code_ctx = (
+        r"\b(code|snippet|command|script|bash|shell|python|curl|kubectl|docker|sql|"
+        r"regex|yaml|terraform|powershell|config|api call|terminal)\b"
+    )
+    fenced = bool(
+        re.search(
+            r"\b(use code fences|(?<!no )fenced code blocks?|markdown code fences?|"
+            r"triple[- ]backtick fences?|put (?:the )?(?:code|commands?|script) in (?:a )?fenced block|"
+            r"use markdown fenced code blocks?|wrap (?:the )?(?:code|snippet) in (?:triple )?backticks)\b",
+            m,
+        )
+    )
+    inline = bool(
+        re.search(
+            r"\b(inline code only|no triple backticks?|no fenced code blocks?|"
+            r"avoid code fences|single backticks? only|don'?t use fenced blocks?|"
+            r"keep (?:code|snippets?) inline|inline backticks? only)\b",
+            m,
+        )
+    )
+    if not fenced and not inline:
+        return None
+    if not re.search(code_ctx, m):
+        return None
+    if fenced and inline:
+        return None
+    if fenced:
+        return "fenced"
+    return "inline"
+def _embedded_reply_format(m: str) -> str | None:
+    """``bullets`` vs ``prose`` list layout (not short *Use bullet points* / *No bullets* controls)."""
+    if len(m) < 48:
+        return None
+    prose = bool(
+        re.search(
+            r"\b(no bullets?|plain paragraphs?|prose only|stop using lists|"
+            r"continuous prose only|avoid bullet lists?|write in paragraphs|"
+            r"paragraph form only|don'?t use bullet points?|"
+            r"keep (?:it\s+)?in (?:flowing )?prose|not as a bulleted list)\b",
+            m,
+        )
+    )
+    bullets = bool(
+        re.search(
+            r"\b((?<!don't use )(?<!dont use )(?<!do not use )bullet points?|(?<!not as a )bulleted list|use bullets|format as bullets|"
+            r"list (?:the\s+)?key points in bullets|markdown bullets?|"
+            r"give me a bulleted list|bullet(?:ed)? format)\b",
+            m,
+        )
+    )
+    if prose and bullets:
+        return None
+    if prose:
+        return "prose"
+    if bullets:
+        return "bullets"
+    return None
+def _embedded_comparison_frame(m: str) -> str | None:
+    """``pros_cons`` vs ``narrative`` comparison layout (not short *Use pros and cons* controls)."""
+    if len(m) < 48:
+        return None
+    if not re.search(r"\b(compare|comparing|comparison|contrasted?|contrast|trade-?offs?)\b", m):
+        return None
+    narrative = bool(
+        re.search(
+            r"\b(no pros|without pros|avoid pros|no pros\/cons|no pros and cons sections?)\b",
+            m,
+        )
+        or re.search(
+            r"\b(flowing prose|continuous prose|narrative comparison|prose comparison only|"
+            r"compare in flowing prose)\b",
+            m,
+        )
+    )
+    pros = bool(
+        re.search(
+            r"\b(trade-?offs?|(?<!no )(?<!without )(?<!avoid )pros and cons|advantages and disadvantages)\b",
+            m,
+        )
+        or re.search(r"\bdifference between\b.+\band\b", m)
+        or re.search(r"\b(compare|comparing|comparison|contrasted?|contrast)\b.+\b(vs\.?|versus)\b", m)
+        or (
+            re.search(r"\b(compare|comparing|comparison)\b", m)
+            and re.search(r"\b(and|with)\b", m)
+            and len(m) >= 72
+            and re.search(
+                r"\b(versus|vs\.?|option|approach|tool|stack|framework|language|model|database|db|cloud)\b",
+                m,
+            )
+        )
+    )
+    if narrative and pros:
+        return None
+    if narrative:
+        return "narrative"
+    if pros:
+        return "pros_cons"
+    return None
+def _embedded_step_style(m: str) -> str | None:
+    """``numbered`` vs ``continuous`` procedure layout (not short *Step by step* controls)."""
+    if len(m) < 48:
+        return None
+    continuous = bool(
+        re.search(
+            r"\b(no numbered steps|don'?t number steps|skip step numbers|"
+            r"prose without steps|avoid numbered step lists?|"
+            r"continuous prose (?:only|instead)|not as numbered steps|"
+            r"connected paragraphs?(?:\s+only)?|explain as (?:flowing )?prose)\b",
+            m,
+        )
+    )
+    numbered = bool(
+        re.search(r"\b(step by step|step-by-step)\b", m)
+        or re.search(r"\b(walk me through|show me how)\b", m)
+        or re.search(r"\b(?:(?<!no )(?<!not as )(?:use )?numbered steps|break it into steps)\b", m)
+        or (
+            re.search(r"\b(how do i|how can i|how should i|how to)\b", m)
+            and re.search(
+                r"\b(install|set up|setup|configure|enable|deploy|migrate|upgrade|fix|debug|troubleshoot)\b",
+                m,
+            )
+        )
+    )
+    if continuous and numbered:
+        return None
+    if continuous:
+        return "continuous"
+    if numbered:
+        return "numbered"
+    return None
+def _reply_lang_phrase(m: str) -> str | None:
+    """Return display name (e.g. 'French') if the user asked for a reply in a known language."""
+    for mo in re.finditer(
+        r"\b(respond|answer|reply|write|explain)\s+(?:in|using)\s+([a-z]{3,20})\b(?:\s*[.?!]|$|,|\s+please|\s+thanks)?",
+        m,
+    ):
+        tok = mo.group(2)
+        if tok in _REPLY_LANG_TOKENS:
+            return _REPLY_LANG_TOKENS[tok]
+    mo = re.search(
+        r"\b(translate|translating)\s+(?:this|that|it|your answer|the above|my text)\s+(?:to|into)\s+([a-z]{3,20})\b",
+        m,
+    )
+    if mo and mo.group(2) in _REPLY_LANG_TOKENS:
+        return _REPLY_LANG_TOKENS[mo.group(2)]
+    mo = re.search(r"\b(entire reply|whole answer|full answer)\s+(?:in|using)\s+([a-z]{3,20})\b", m)
+    if mo and mo.group(2) in _REPLY_LANG_TOKENS:
+        return _REPLY_LANG_TOKENS[mo.group(2)]
+    # Trailing clause: "... in french" / "... in spanish, please"
+    tail = m[-100:] if len(m) > 100 else m
+    mo = re.search(r"\b(in|into)\s+([a-z]{3,20})\s*(?:[,.]|please|thanks)?\s*$", tail)
+    if mo and mo.group(2) in _REPLY_LANG_TOKENS:
+        return _REPLY_LANG_TOKENS[mo.group(2)]
+    return None
+def analyze_embedded_prompt_signals(message: str) -> tuple[dict[str, str], list[str], list[str]]:
+    """Infer reply-style preferences from wording inside longer questions (one-shot overlays).
+    Used only when ``parse_control_action`` does not treat the line as a dedicated control
+    command. Conservative patterns avoid hijacking short chit-chat.
+    Returns:
+        (field_overrides, extra_system_paragraphs, trace_tags) — overrides use the same keys/values as
+        ``ub_session`` reply-style fields; extra paragraphs are appended as separate system sections;
+        ``trace_tags`` are short tokens for the brain-trace ``prompt_signals:`` line (e.g. ``language``,
+        ``code_only``, ``len_cap=80w``, ``guided``, ``ephemeral``, ``a11y``). Session-style overrides
+        (e.g. ``confidence_tone=transparent``) appear as ``key=value`` tokens in the same line.
+    """
+    m = _norm(message)
+    overrides: dict[str, str] = {}
+    extras: list[str] = []
+    trace_tags: list[str] = []
+    if len(m) >= 24:
+        lang = _reply_lang_phrase(m)
+        if lang:
+            trace_tags.append("language")
+            extras.append(
+                f"The user asked (via natural wording) for the assistant reply in **{lang}**. "
+                f"Write the **entire** answer in {lang}, including headings and lists, unless a quoted passage must stay "
+                "verbatim in another language."
+            )
+    co = _code_only_instruction(m)
+    if co:
+        trace_tags.append("code_only")
+        extras.append(co)
+    lc = _length_cap_instruction(m)
+    if lc:
+        extras.append(lc[0])
+        trace_tags.append(lc[1])
+    gd = _guided_discovery_instruction(m)
+    if gd:
+        extras.append(gd[0])
+        trace_tags.append(gd[1])
+    ep = _ephemeral_privacy_instruction(m)
+    if ep:
+        extras.append(ep[0])
+        trace_tags.append(ep[1])
+    ax = _accessibility_sr_instruction(m)
+    if ax:
+        extras.append(ax[0])
+        trace_tags.append(ax[1])
+    if _embedded_simple_audience(m):
+        overrides["audience"] = "simple"
+    elif _embedded_technical_audience(m):
+        overrides["audience"] = "technical"
+    ert = _embedded_register_tone(m)
+    if ert:
+        overrides["register_tone"] = ert
+    if _embedded_json_output(m):
+        overrides["output_format"] = "json"
+    spc = _embedded_speculation(m)
+    if spc:
+        overrides["speculation"] = spc
+    ald = _embedded_answer_lead(m)
+    if ald:
+        overrides["answer_lead"] = ald
+    act = _embedded_actionability(m)
+    if act:
+        overrides["actionability"] = act
+    cft = _embedded_confidence_tone(m)
+    if cft:
+        overrides["confidence_tone"] = cft
+    exd = _embedded_example_density(m)
+    if exd:
+        overrides["example_density"] = exd
+    if len(m) < 48:
+        return overrides, extras, trace_tags
+    cmf = _embedded_comparison_frame(m)
+    if cmf:
+        overrides["comparison_frame"] = cmf
+    stl = _embedded_step_style(m)
+    if stl:
+        overrides["step_style"] = stl
+    # Tables when the user names the shape they want.
+    if re.search(r"\b(no tables?|without a table|avoid tables?)\b", m):
+        overrides["table_style"] = "avoid"
+    elif re.search(
+        r"\b(in a table|as a table|markdown table|tabular format|two-?column|rows and columns)\b",
+        m,
+    ):
+        overrides["table_style"] = "prefer"
+    rpf = _embedded_reply_format(m)
+    if rpf:
+        overrides["reply_format"] = rpf
+    mth = _embedded_math_detail(m)
+    if mth:
+        overrides["math_detail"] = mth
+    cpt = _embedded_counterpoint_tone(m)
+    if cpt:
+        overrides["counterpoint_tone"] = cpt
+    eord = _embedded_exposition_order(m)
+    if eord:
+        overrides["exposition_order"] = eord
+    fuc = _embedded_followup_close(m)
+    if fuc:
+        overrides["followup_close"] = fuc
+    clf = _embedded_clarify_first(m)
+    if clf:
+        overrides["clarify_first"] = clf
+    shd = _embedded_section_headings(m)
+    if shd:
+        overrides["section_headings"] = shd
+    anu = _embedded_analogy_use(m)
+    if anu:
+        overrides["analogy_use"] = anu
+    tem = _embedded_term_emphasis(m)
+    if tem:
+        overrides["term_emphasis"] = tem
+    acs = _embedded_acronym_style(m)
+    if acs:
+        overrides["acronym_style"] = acs
+    rsk = _embedded_risk_posture(m)
+    if rsk:
+        overrides["risk_posture"] = rsk
+    qst = _embedded_quote_style(m)
+    if qst:
+        overrides["quote_style"] = qst
+    emj = _embedded_emoji_style(m)
+    if emj:
+        overrides["emoji_style"] = emj
+    fgr = _embedded_faq_grounding(m)
+    if fgr:
+        overrides["faq_grounding"] = fgr
+    cbs = _embedded_code_block_style(m)
+    if cbs:
+        overrides["code_block_style"] = cbs
+    return overrides, extras, trace_tags
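
The `_embedded_*` detectors in `scripts/nl_controls.py` share one shape: a minimum-length gate, two opposing pattern sets, and a `None` result when both sides fire. The sketch below isolates that shape. `resolve_polarity` and its parameters are hypothetical names introduced here for illustration (they are not part of the commit), and the two table patterns are shortened stand-ins rather than the production regexes.

```python
import re
from typing import Optional


def resolve_polarity(
    text: str,
    positive: str,
    negative: str,
    positive_label: str,
    negative_label: str,
    min_len: int = 48,
) -> Optional[str]:
    """Return one label, or None when the text is short, silent, or contradictory."""
    if len(text) < min_len:
        return None  # short messages are left to the dedicated control parser
    has_pos = re.search(positive, text) is not None
    has_neg = re.search(negative, text) is not None
    if has_pos == has_neg:
        return None  # neither side matched, or both matched (ambiguous request)
    return positive_label if has_pos else negative_label


# Shortened stand-in patterns (assumptions, not the full production regexes).
PREFER_TABLE = r"\b(in a table|as a table|markdown table)\b"
AVOID_TABLE = r"\b(no tables?|without a table|avoid tables?)\b"

msg = "please compare the two database options for our team as a table"
print(resolve_polarity(msg, PREFER_TABLE, AVOID_TABLE, "prefer", "avoid"))
```

Returning `None` on a conflict keeps the session default in force instead of guessing. Because of that, a negative phrase that contains the positive phrase (for example "no fenced code blocks") has to be excluded from the positive pattern, or the two sides cancel each other out.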

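
Several patterns in `scripts/nl_controls.py` are written as `rf"..."` strings so that a shared `src` fragment can be interpolated. Inside an f-string, a counted quantifier such as `{0,40}` is itself parsed as a replacement field, so it needs doubled braces to reach the regex engine intact. The snippet below is a small illustrative check with a shortened `src`; the variable names are assumptions made for this example.

```python
import re

src = r"(?:faq|excerpt)"  # shortened stand-in for the shared source fragment

# Single braces: the f-string evaluates the tuple (0, 40) and inserts its string form.
unescaped = rf"in your own words.{0,40}{src}"
# Doubled braces survive formatting as a literal {0,40} quantifier.
escaped = rf"in your own words.{{0,40}}{src}"

text = "summarize it in your own words, based on the faq"
print(unescaped)
print(escaped)
print(bool(re.search(unescaped, text)), bool(re.search(escaped, text)))
```

The unescaped form still compiles as a valid regex, which is why the mistake is easy to miss: it only shows up as a pattern that never matches.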
scripts/universal_brain_chat.py CHANGED Viewed

@@ -104,14 +104,15 @@ from horizon3_store import (  # noqa: E402
 from google_cse_client import (  # noqa: E402
     format_cse_hits_markdown,
     google_cse_search,
     read_google_cse_settings,
 )
-from nl_controls import parse_control_action  # noqa: E402
 from rag_faq_smoke import _pick_model, hybrid_retrieve, load_chunks  # noqa: E402
 from tinymodel_runtime import TinyModelRuntime  # noqa: E402
 HELP_TEXT = """**How to use**
-- **Normal language:** ask in plain English (or mixed); the app **infers** what you want (summarize, search FAQ, save a note, etc.).
 - **Session controls (say it in chat, no slash command):**
   - *What is my current scope?*, *Show my session settings* -> prints scope + toggles (FAQ context, routing, trace)
   - *Start a new private session*, *Begin a fresh scope* -> generates a **new memory scope key** so notes are isolated from the shared default demo scope
@@ -157,6 +158,7 @@ HELP_TEXT = """**How to use**
 - **Rewrite** professionally / rephrase
 - **Answer using only** these facts — include both facts and question
 - **Search** the FAQ / **find** in the knowledge base
 - **Classify** (topic model) this paragraph
 - **Similarity:** are these two snippets close in meaning? (encoder cosine)
 - **Embedding** stats for a passage (dimension, norm, preview)
@@ -172,7 +174,18 @@ If routing misfires, try rephrasing or use a slash command; **`--no-smart-route`
 # Shown under the chat + controls in the Gradio UI (Hugging Face Space and local).
 GRADIO_INSTRUCTIONS_MARKDOWN = """### About this Space
-**Universal Brain** pairs a small **generative** model with the **TinyModel1** encoder (AG News–style topics: World, Business, Sports, Sci/Tech), **FAQ retrieval** over a bundled corpus, **SQLite memory** scoped per session, and **natural-language routing** so many tasks work without slash commands. First CPU startup can take a few minutes while weights download.
 ---
@@ -184,6 +197,55 @@ GRADIO_INSTRUCTIONS_MARKDOWN = """### About this Space
 ---
 ### What to try (step-by-step)
 | Goal | What to type |
@@ -248,6 +310,7 @@ If you see an error about HTTP 403 or “API key not valid”, fix the key or en
 - Ensure **smart routing** is on (say *Turn on smart routing* if you turned it off).
 - Ask in plain language for **live web** / **Google** / **today’s** information, e.g. *Search the web for the latest SpaceX launch summary* or *What does the web say about …?*
 - The router uses intent **`web_search`**: the app fetches snippets, injects them into the model context, then the assistant replies **using those sources** (cite **[Web n]** when using a snippet).
 - If the model stays in FAQ-only mode, use **`/web …`** first to confirm the API works, then try clearer web phrasing.
 **6) Brain trace**
@@ -328,6 +391,7 @@ Rules:
 - Default to "chat" when unsure; copy the entire user message into "text".
 - Do not invent facts for "grounded": if no clear facts/context, use "chat" instead.
 - Use **retrieve** for bundled FAQ / help-base search; use **web_search** when the user clearly needs the **public web** (today, external site, breaking news, "google this", etc.).
 - Extract minimal "text" for tool intents (do not repeat system chatter)."""
 VALID_INTENTS = frozenset(
@@ -1224,6 +1288,10 @@ def _append_reply_style_hints(extras: list[str], session: dict[str, Any]) -> Non
         lines.append("Prefer fuller, well-structured explanations when they help the user.")
     if rformat == "bullets":
         lines.append("When listing multiple points, use markdown bullet or numbered lists.")
     audience = str(session.get("audience") or "normal").lower()
     if audience not in ("simple", "normal", "technical"):
         audience = "normal"
@@ -1773,6 +1841,11 @@ def parse_args() -> argparse.Namespace:
         action="store_true",
         help="Disable NL intent routing (plain chat only; slash commands still work).",
     )
     p.add_argument(
         "--router-max-new-tokens",
         type=int,
@@ -1786,6 +1859,8 @@ def parse_args() -> argparse.Namespace:
 def main() -> None:
     args = parse_args()
     _load_dotenv_if_present(_REPO)
     _gk, _gc, _, _ = read_google_cse_settings()
     cse_on = bool(_gk and _gc)
     _ensure_gradio_can_reach_localhost()
@@ -1968,10 +2043,28 @@ def main() -> None:
             except Exception:
                 route = {"intent": "chat", "text": msg, "question": "", "context": ""}
             if route["intent"] == "web_search":
                 g_key, g_cx, g_num, g_safe = read_google_cse_settings()
                 q_web = (route["text"] or msg).strip()
-                web_trace = "web:CSE:cfg"
                 if g_key and g_cx and q_web:
                     try:
                         hits = google_cse_search(
@@ -1982,16 +2075,16 @@ def main() -> None:
                             safe=g_safe,
                         )
                         web_block = format_cse_hits_markdown(hits, for_chat=True)
-                        web_trace = f"web:CSE:{len(hits)}"
                     except Exception as ex:
                         web_block = (
                             f"(Google web search failed: {_clip(str(ex), 500)})\n\n"
                             "Answer from general knowledge where appropriate; do not invent URLs or page titles."
                         )
-                        web_trace = "web:CSE:err"
                 elif not q_web:
                     web_block = "(Empty web search query. Ask again with a concrete search topic.)"
-                    web_trace = "web:CSE:empty"
                 else:
                     web_block = (
                         "(Web search is not configured: set **GOOGLE_CSE_API_KEY** and **GOOGLE_CSE_CX** "
@@ -2024,9 +2117,19 @@ def main() -> None:
             chat_line = route["text"] or msg
         trace: list[str] = []
         extras: list[str] = []
-        _append_reply_style_hints(extras, ub_session)
         if web_trace:
             trace.append(web_trace)
@@ -2096,6 +2199,7 @@ def main() -> None:
                 or mem_conn is not None
                 or effective_rag is not None
                 or bool(web_trace)
             )
         )
         if show_trace_footer and trace:

 from google_cse_client import (  # noqa: E402
     format_cse_hits_markdown,
     google_cse_search,
+    heuristic_suggests_web_search,
     read_google_cse_settings,
 )
+from nl_controls import analyze_embedded_prompt_signals, parse_control_action  # noqa: E402
 from rag_faq_smoke import _pick_model, hybrid_retrieve, load_chunks  # noqa: E402
 from tinymodel_runtime import TinyModelRuntime  # noqa: E402
 HELP_TEXT = """**How to use**
+- **Normal language:** ask in plain English (or mixed); the app **infers** what you want (summarize, search FAQ, save a note, etc.). Longer prompts may also **imply** reply shape for that turn only (for example trade-off questions → Pros/Cons layout or flowing prose comparison, “in a table” → markdown table preference, “answer in Spanish” → reply language, **code only** → code-first output, **in under N words** → length cap, **hints only / don’t give the full solution** → guided discovery, **red team / sanity check my plan** → challenge-style pushback, **be supportive / assume good intent on my plan** → supportive coaching, **don’t remember this / off the record** → ephemeral hint, **screen reader friendly / WCAG** → accessibility layout hint, **ELI5 / lay audience in a long question** → beginner audience, **assume I'm technical / expert depth in prose** → technical audience, **board-ready / Slack-casual wording** → formal or casual register, **valid JSON / return JSON in prose** → JSON output mode, **don’t guess / stick to facts in prose** → strict speculation, **brainstorm freely / wild ideas in prose** → creative speculation, **TLDR first / BLUF in prose** → summary-first open, **answer directly / skip the summary in prose** → direct opening, **FAQ direct quotes vs paraphrase-only in prose** → quote style for excerpts, **emoji ok vs no emoji in prose** → emoji style, **FAQ-only vs FAQ-plus-general-knowledge in prose** → FAQ grounding, **show work vs final-answer-only in prose** → math detailing, **state assumptions / limitations / caveats** in prose → transparent confidence tone, **be decisive / don’t hedge in prose** → assertive confidence tone, **curl/bash/kubectl in prose** → runnable commands, **conceptual only / no commands in prose** → conceptual actionability, **bullet points vs plain paragraphs in prose** → reply format, **step-by-step vs continuous procedure prose in long prompts** → step style, **concrete / worked / toy example in prose** → richer 
examples, **example-free / skip examples in prose** → sparser examples, **define terms first / intuition or big-picture first in prose** → explanation order, **no questions at the end / suggest next steps in prose** → closing style, **ask questions before answering / answer without clarifiers in prose** → clarify-first mode, **markdown section headings vs flat prose in long prompts** → section layout, **analogy vs literal-only in long prompts** → analogy style, **bold key terms vs minimal bold in long prompts** → term emphasis, **spell out acronyms vs terse acronyms in long prompts** → acronym style, **err on the side of safety vs ship-fast pragmatism in long prompts** → risk posture, **fenced code blocks vs inline-only snippets in long prompts** → code block style) — see *Brain trace* **`prompt_signals:`** when detected.
 - **Session controls (say it in chat, no slash command):**
   - *What is my current scope?*, *Show my session settings* -> prints scope + toggles (FAQ context, routing, trace)
   - *Start a new private session*, *Begin a fresh scope* -> generates a **new memory scope key** so notes are isolated from the shared default demo scope
 - **Rewrite** professionally / rephrase
 - **Answer using only** these facts — include both facts and question
 - **Search** the FAQ / **find** in the knowledge base
+- **Live web** (news, prices, “latest …”, fact-checking) — router uses **web_search**; with Google CSE configured, the server may also **auto-run** web search when your wording implies it (see brain trace **`+auto`**). Disable with **`--no-auto-web`** or env **`NO_AUTO_WEB=1`** on your own deployment.
 - **Classify** (topic model) this paragraph
 - **Similarity:** are these two snippets close in meaning? (encoder cosine)
 - **Embedding** stats for a passage (dimension, norm, preview)
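
The **Similarity** and **Embedding** items above amount to a cosine between two encoder vectors and a few summary statistics for one vector. A minimal sketch follows, assuming plain Python lists in place of real TinyModel1 encoder outputs; the function names `cosine_similarity` and `embedding_stats` and the toy vectors are hypothetical and do not come from this repository.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch only
    return dot / (norm_a * norm_b)


def embedding_stats(vec: list[float], preview: int = 3) -> dict:
    """Dimension, L2 norm, and the first few components of a vector."""
    return {
        "dim": len(vec),
        "norm": math.sqrt(sum(x * x for x in vec)),
        "preview": vec[:preview],
    }


# Toy vectors standing in for encoder outputs (hypothetical values).
emb_a = [3.0, 4.0, 0.0]
emb_b = [6.0, 8.0, 0.0]  # same direction as emb_a
emb_c = [0.0, 0.0, 5.0]  # orthogonal to emb_a

sim_ab = cosine_similarity(emb_a, emb_b)
sim_ac = cosine_similarity(emb_a, emb_c)
stats_a = embedding_stats(emb_a)
```

Cosine ignores vector length, so `emb_a` and `emb_b` count as identical in direction even though their norms differ; the norm reported by `embedding_stats` is where that length information shows up.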
 # Shown under the chat + controls in the Gradio UI (Hugging Face Space and local).
 GRADIO_INSTRUCTIONS_MARKDOWN = """### About this Space
+**Universal Brain** is a **text** assistant built from: (1) a **small generative instruct model** (default in code: **SmolLM2-360M-Instruct**, overridable via **`HORIZON2_MODEL`**), (2) the **TinyModel1** encoder for **4-way topics** (World, Business, Sports, Sci/Tech) plus **embeddings**, (3) **FAQ hybrid retrieval** over a bundled corpus, (4) **SQLite memory** per session scope, (5) a **JSON intent router** so many tasks work in plain language, (6) optional **Google web search** when `GOOGLE_CSE_API_KEY` + `GOOGLE_CSE_CX` are set (`/web`, smart-route **web search**, optional **auto-web**). First CPU startup can take a few minutes while weights download.
+#### What it can do (summary)
+| Area | Capacity |
+| --- | --- |
+| **Chat & tools** | Summarize, rewrite, grounded Q&A (`|||` facts), FAQ search, **live web** (if configured), classify, similarity, embeddings, nearest-option, **/status**, memory CRUD — via natural language or **`/…`** shortcuts. |
+| **Encoder** | Soft **topic hint** + trace line **`classify:…`**; **`/classify`** for full label probabilities. |
+| **RAG** | Injects top FAQ **chunks**; tune strictness with phrases like *Strict FAQ* (see `/help`). |
+| **Memory** | Long-term + session notes; **scope** isolation phrases for demos; export / forget from chat. |
+| **Style (long prompts)** | **Embedded prompt signals** adjust **one turn** from wording (tradeoffs, tables, steps, language, code-only, caps, hints-only, red-team, ephemeral, a11y, ELI5-style, register, JSON, strict facts, BLUF, shell commands, assumptions, examples, teach-order, closers, clarify-first, headings, analogies) — see **`prompt_signals:`** in the *Brain trace* and the table **Testing embedded prompt signals** below. |
+| **Limits** | Small models can **hallucinate** or miss nuance; FAQ and web results **constrain** answers only when relevant snippets exist. This Space is **not multimodal**. The shared default **memory scope** provides no privacy or authentication. |
 ---
 ---
+### Testing embedded prompt signals (this Space)
+These behaviors apply when your line is handled as **normal chat** (not a short dedicated control like *Be brief*). The app scans your wording and adds **one-turn** system hints. Say **Show the brain trace** first, send a message, then scroll to the bottom of the assistant reply.
+| Goal | What to type (examples) | What to look for |
+| --- | --- | --- |
+| Comparison: pros/cons | In a **long** message, ask for **tradeoffs**, **pros and cons**, **compare X vs Y**, or **advantages and disadvantages** between concrete options (avoid mixing with **no pros and cons** / **flowing prose comparison** in the same line). | **`comparison_frame=pros_cons`** in **`prompt_signals:`**; reply should use **Pros** / **Cons** sections |
+| Comparison: narrative prose | In a **long** comparison question, ask for **flowing prose**, **narrative comparison**, **prose comparison only**, or **no pros and cons sections** (avoid mixing with **pros and cons** / **tradeoffs** layout cues in the same line). | **`comparison_frame=narrative`** in **`prompt_signals:`**; reply should weave the comparison in continuous prose |
+| Length cap | End your question with **in under 80 words** or **at most 3 sentences**. | **`len_cap=80w`** or **`len_cap=3s`** in **`prompt_signals:`**; the model should stay near that cap |
+| Code-only | Ask for a tiny snippet and add **code only, no explanation** (or **just the code**). | **`code_only`** in **`prompt_signals:`**; reply should be mostly a fenced code block |
+| Tables / language | Ask for a summary **in a markdown table**, or **answer in spanish** in the same line as your question. | **`table_style=prefer`** and/or **`language`** in **`prompt_signals:`** |
+| Numbered steps vs continuous prose | In a **long** how-to message, ask **step by step**, **walk me through**, **numbered steps**, or a **how to install/configure** style question. **Or** say **no numbered steps**, **continuous prose only**, **prose without steps**, **explain as connected paragraphs** (avoid mixing both in one line). | **`step_style=numbered`** or **`step_style=continuous`** in **`prompt_signals:`**; reply should use numbered steps or flowing prose accordingly |
+| Bullets vs prose | In a **long** message, ask for **bullet points**, **use bullets**, **bulleted list**, **format as bullets**, etc. **Or** say **no bullets**, **plain paragraphs**, **prose only**, **avoid bullet lists** (avoid mixing both in one line). | **`reply_format=bullets`** or **`reply_format=prose`** in **`prompt_signals:`**; reply should list points or stay in paragraphs accordingly |
+| Guided discovery (hints / Socratic) | Ask a **how / why** question and say you want **hints only** or **don’t give me the full solution yet** (keep the message substantive, not a one-liner control). | **`guided`** in **`prompt_signals:`**; first reply should skew toward questions and nudges |
+| Red-team / critique | In one paragraph, describe a **plan or design** and ask for a **red team**, **sanity check**, **what am I missing**, or **devil’s advocate** review (not a one-line control). | **`counterpoint_tone=challenge`** inside **`prompt_signals:`**; reply should stress-test assumptions |
+| Supportive coaching | In one paragraph, describe a **plan, pitch, or idea** and ask to **be supportive**, **assume good intent**, **encourage my proposal**, **gentle feedback**, or **avoid harsh criticism** (not a one-line control; avoid mixing with red-team wording in the same line). | **`counterpoint_tone=supportive`** in **`prompt_signals:`**; reply should coach with constructive next steps, not harsh critique |
+| Ephemeral / no memory | Say **off the record**, **don’t remember this**, **no memory for this**, or **don’t log this** in the same message as your question (this is a demo; shared Space scopes do not provide real secrecy). | **`ephemeral`** in **`prompt_signals:`**; the assistant should avoid suggesting `/remember` for that content |
+| Accessibility / screen readers | Ask for a **screen reader friendly** or **WCAG-aware** answer, or say the write-up is **for blind readers** / **for NVDA users** in a full sentence (not a one-word ping). | **`a11y`** in **`prompt_signals:`**; reply should favor linear structure, headings, and non-table-only facts |
+| Beginner / ELI5 in context | In a **longer** question (not a one-line control), ask for **ELI5**, **explain like I'm five**, **total beginner**, **lay audience**, **no technical background**, etc., plus a normal **what/why/how** ask. | **`audience=simple`** in **`prompt_signals:`**; reply should use plain language and minimal jargon |
+| Technical / expert audience | In a **longer** question (not a one-line control), say you're a **technical audience**, **assume I'm technical**, want a **deep technical** or **internals-focused** explanation, **skip the basics**, **staff-engineer level**, etc., plus a normal **what/why/how** ask (avoid mixing with ELI5/beginner wording in the same line). | **`audience=technical`** in **`prompt_signals:`**; reply may use domain jargon and skip hand-holding |
+| Formal vs casual register | Ask for a **board-ready** / **client-facing** / **formal memo** / **for regulators** write-up, **or** say you want a **Slack message**, **keep it casual**, **water cooler** tone (one dominant style per message). | **`register_tone=formal`** or **`register_tone=casual`** in **`prompt_signals:`** |
+| JSON / structured output | In a **long** message, ask for **valid JSON**, **return JSON**, **as a JSON object**, **machine-readable JSON**, etc. (say **no json** / **plain text only** if you want to opt out). | **`output_format=json`** in **`prompt_signals:`**; reply should be parseable JSON when practical |
+| Strict facts / low speculation | In a **long** message, ask to **not guess**, **avoid hallucinations**, **only high confidence**, **stick to facts**, **if unsure say so**, etc. (avoid mixing with **brainstorm freely** in the same line). | **`speculation=strict`** in **`prompt_signals:`**; reply should label uncertainty clearly |
+| Creative brainstorming | In a **long** message, ask to **brainstorm freely**, **speculate freely**, welcome **wild ideas**, do **blue-sky thinking**, or **explore hypotheticals** (avoid mixing with **don’t guess** / **stick to facts** in the same line). | **`speculation=creative`** in **`prompt_signals:`**; reply may propose speculative ideas with clear assumption labels |
+| Summary / BLUF first | In a **long** message, ask to **TLDR first**, **lead with a one-line summary**, **bottom line up front**, **BLUF**, **executive summary first**, etc. (avoid mixing with **answer directly** / **skip the summary** in the same line). | **`answer_lead=tldr_first`** in **`prompt_signals:`**; reply should open with a short summary line |
+| Direct answer (no TL;DR) | In a **long** message, ask to **answer directly**, **skip the summary**, **no TL;DR**, **jump straight to the answer**, or **omit the opening summary** (avoid mixing with **BLUF** / **summary first** in the same line). | **`answer_lead=direct`** in **`prompt_signals:`**; reply should start in-flow without a standalone TL;DR prelude |
+| Runnable commands | In a **long** message, ask for **curl one-liner**, **bash snippet**, **kubectl**, **copy-paste into terminal**, **docker run example**, etc. (avoid mixing with **conceptual only** / **no commands** in the same line). | **`actionability=commands`** in **`prompt_signals:`**; reply should include concrete commands where sensible |
+| Conceptual only (no commands) | In a **long** message, ask for **conceptual only**, **high level only**, **no shell commands**, **focus on concepts and rationale**, or an **architecture overview without command dumps** (avoid mixing with **kubectl** / **copy-paste into terminal** in the same line). | **`actionability=conceptual`** in **`prompt_signals:`**; reply should avoid runnable command dumps |
+| Assumptions / limitations | In a **long** message, ask to **state your assumptions**, **assumptions and limitations**, **caveats upfront**, **scope and assumptions**, **what we are assuming**, or to **flag key uncertainties** (say **skip assumptions** to opt out; avoid mixing with **be decisive** in the same line). | **`confidence_tone=transparent`** in **`prompt_signals:`**; reply should surface assumptions, limits, and uncertainty clearly |
+| Decisive / confident tone | In a **long** message, ask to **be decisive**, **don’t hedge**, **give firm answers**, **sound confident**, or **avoid disclaimers** (avoid mixing with **state your assumptions** / **caveats upfront** in the same line). | **`confidence_tone=assertive`** in **`prompt_signals:`**; reply should be direct with minimal hedging |
+| Concrete examples vs example-free | In a **long** message, ask for a **worked example**, **walk me through a toy example**, **illustrate with a concrete example**, **ground your answer in an example**, etc. **Or** ask to **skip examples**, **theory only**, **keep it abstract**, **example-free** (avoid mixing both in one line). | **`example_density=rich`** or **`example_density=sparse`** in **`prompt_signals:`**; reply should include or omit short illustrative examples accordingly |
+| Explanation order | In a **long** message, ask to **define terms first**, **definitions before details**, **formal definitions upfront**, **terminology first**, etc. **Or** ask for **intuition before math**, **big picture first**, **motivation before the formal proof**, **start with the high-level sketch** (avoid asking for both orders in one line). | **`exposition_order=definitions_first`** or **`exposition_order=intuition_first`** in **`prompt_signals:`**; reply should lead with definitions or with intuition accordingly |
+| Closing / follow-ups | In a **long** message, ask for **no questions at the end**, **don’t ask if I need anything else**, **finish crisply**, **skip the stock closer**, etc. **Or** ask to **suggest next steps**, **end with actionable next steps**, **what should we do next**, **offer ways to go deeper** (avoid mixing both in one line). | **`followup_close=minimal`** or **`followup_close=suggest`** in **`prompt_signals:`**; reply should omit or include a light optional follow-up line accordingly |
+| Clarify-first vs answer-first | In a **long** message, ask to **ask clarifying questions before you answer**, **if anything is unclear ask me first**, **confirm my constraints before**, etc. **Or** say **no clarifying questions**, **answer without asking questions first**, **don’t interrogate me first**, **give your best answer without asking** (avoid mixing both in one line). | **`clarify_first=on`** or **`clarify_first=off`** in **`prompt_signals:`**; first reply should ask brief questions first or answer directly |
+| Section headings vs flat | In a **long** message, ask to **use markdown headings**, **organize with headings**, **structure the answer with clear headings**, **h2 or h3 headings for each topic**, etc. **Or** ask for a **flat answer**, **no section headings**, **avoid markdown headings**, **continuous prose only** (avoid mixing both in one line). | **`section_headings=prefer`** or **`section_headings=avoid`** in **`prompt_signals:`**; reply should use or avoid `##` / `###` title lines accordingly |
+| Analogies vs literal | In a **long** message, ask to **use a helpful analogy**, **explain with a simple analogy**, **liken this to something familiar**, **map it to an everyday example**, etc. **Or** say **no analogies**, **skip metaphors**, **literal explanations only**, **stick to literal technical description** (avoid mixing both in one line). | **`analogy_use=prefer`** or **`analogy_use=avoid`** in **`prompt_signals:`**; reply may include one tight analogy or stay metaphor-free accordingly |
+| Bold key terms vs minimal bold | In a **long** message, ask to **bold the key terms**, **highlight important phrases**, **make key terms stand out** for scanning, etc. **Or** say **minimal bold**, **don’t overuse bold**, **avoid excessive bold**, **sparse bold** (avoid mixing both in one line). | **`term_emphasis=highlight`** or **`term_emphasis=minimal`** in **`prompt_signals:`**; reply should use selective **bold** on keywords or keep bold sparse |
+| Acronym expansion vs terse | In a **long** message, ask to **spell out acronyms**, **expand acronyms on first use**, **define acronyms when you introduce them** (e.g. for compliance readers). **Or** say **assume I know acronyms**, **don’t expand acronyms**, **keep acronyms as-is**, **acronym-literate audience** (avoid mixing both in one line). | **`acronym_style=spell_out`** or **`acronym_style=terse`** in **`prompt_signals:`**; reply should expand once as `Long Form (ACRONYM)` or reuse acronyms without expansion |
+| Risk posture (safe vs pragmatic) | In a **long** message, ask to **err on the side of safety**, **minimize downside**, **prefer low-risk options**, **safety-first rollout**, etc. **Or** say **optimize for speed**, **be pragmatic**, **avoid over-engineering**, **good enough is fine**, **ship fast** (avoid mixing both in one line). | **`risk_posture=conservative`** or **`risk_posture=pragmatic`** in **`prompt_signals:`**; recommendations should favor safety or practical speed accordingly |
+| FAQ quote vs paraphrase | In a **long** message about **FAQ / policy / excerpt** text, ask to **quote the FAQ excerpts**, **include direct quotes from the policy**, **verbatim passages from the excerpt**, etc. **Or** say **paraphrase the FAQ**, **paraphrase only**, **don’t quote the excerpts**, **summarize the policy in your own words** (avoid mixing both in one line). | **`quote_style=quote`** or **`quote_style=paraphrase`** in **`prompt_signals:`**; reply should quote or paraphrase injected excerpts accordingly |
+| Emoji in replies | In a **long** message, ask to **use a few tasteful emoji**, **include emoji when helpful**, **emoji are ok**, **sprinkle emoji**, etc. **Or** say **no emoji in your reply**, **avoid emoji**, **emoji-free tone**, **don’t use emoji** (avoid mixing both in one line). | **`emoji_style=include`** or **`emoji_style=avoid`** in **`prompt_signals:`**; reply may use sparse emoji or stay emoji-free accordingly |
+| FAQ grounding (strict vs relaxed) | In a **long** message about **FAQ / policy / excerpt** retrieval, ask to **stick to the FAQ**, **only use the FAQ excerpts**, **if it’s not in the FAQ say so**, **strict FAQ grounding**, etc. **Or** say **FAQ plus general knowledge**, **mix the FAQ with general knowledge**, **supplement the excerpts with brief general context** (avoid mixing both in one line). | **`faq_grounding=strict`** or **`faq_grounding=relaxed`** in **`prompt_signals:`**; reply should stay FAQ-only or allow separated general context accordingly |
+| Math steps vs final only | In a **long** math-style question, ask to **show your work**, **walk through the derivation**, **prove it step by step**, **show intermediate steps**, etc. **Or** say **final answer only**, **no derivation**, **skip the steps**, **just the result** for the equation (avoid mixing both in one line). | **`math_detail=show_work`** or **`math_detail=final_only`** in **`prompt_signals:`**; reply should include or omit intermediate math steps accordingly |
+| Code fences vs inline | In a **long** message that includes **code / commands / scripts**, ask for **fenced code blocks**, **markdown code fences**, **triple-backtick fences**, etc. **Or** say **inline code only**, **no triple backticks**, **no fenced code blocks**, **keep snippets inline** (avoid mixing both in one line). | **`code_block_style=fenced`** or **`code_block_style=inline`** in **`prompt_signals:`**; reply should use ``` fences or inline backticks accordingly |
+If there is no footer, either brain trace is off for that session, or this deployment has **no** encoder / FAQ / memory / web layers and no prompt signal has fired yet. **Prompt signals alone** still turn the footer on once one is detected.
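
To make the table above concrete, here is a rough sketch of how two of the signals (length cap and code-only) could be detected and rendered as a `prompt_signals:` trace line. The regular expressions and the names `detect_signals` and `format_trace` are hypothetical. The split into key/value overrides and bare tags is an assumption modelled on the trace strings shown in the table, not the app's actual detector.

```python
import re

# Hypothetical patterns; the real detector is not shown here.
_WORD_CAP = re.compile(r"\bin under (\d+) words\b", re.IGNORECASE)
_SENT_CAP = re.compile(r"\bat most (\d+) sentences?\b", re.IGNORECASE)
_CODE_ONLY = re.compile(r"\b(?:code only|just the code)\b", re.IGNORECASE)


def detect_signals(message: str) -> tuple[dict[str, str], list[str]]:
    """Return (key/value overrides, bare tags) found in one message."""
    overrides: dict[str, str] = {}
    tags: list[str] = []
    word_cap = _WORD_CAP.search(message)
    sent_cap = _SENT_CAP.search(message)
    if word_cap:
        overrides["len_cap"] = f"{word_cap.group(1)}w"
    elif sent_cap:
        overrides["len_cap"] = f"{sent_cap.group(1)}s"
    if _CODE_ONLY.search(message):
        tags.append("code_only")
    return overrides, tags


def format_trace(overrides: dict[str, str], tags: list[str]) -> str:
    """Join sorted key=value pairs and tags with '+', or return '' if empty."""
    bits = [f"{k}={v}" for k, v in sorted(overrides.items())]
    bits.extend(tags)
    if not bits:
        return ""
    return "prompt_signals:" + "+".join(bits)


trace_words = format_trace(*detect_signals("Explain DNS caching in under 80 words"))
trace_mixed = format_trace(
    *detect_signals("Write a quicksort, code only, no explanation, at most 3 sentences")
)
trace_none = format_trace(*detect_signals("Hello there"))
```

Because the hints are recomputed from each message, nothing persists between turns in this sketch, which matches the one-turn behaviour described above.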
+---
 ### What to try (step-by-step)
 | Goal | What to type |
 - Ensure **smart routing** is on (say *Turn on smart routing* if you turned it off).
 - Ask in plain language for **live web** / **Google** / **today’s** information, e.g. *Search the web for the latest SpaceX launch summary* or *What does the web say about …?*
 - The router uses intent **`web_search`**: the app fetches snippets, injects them into the model context, then the assistant replies **using those sources** (cite **[Web n]** when using a snippet).
+- **Automatic web:** if Google CSE is configured, the app may also run a web search when your message **implies** fresh public facts (e.g. *latest*, *today*, *who won*, *stock price*, a recent year + question) even if you do not say “search the web”. On a self-hosted Space you can disable that with **`--no-auto-web`** or env **`NO_AUTO_WEB=1`**. Brain trace may show **`+auto`** on the web line when the upgrade came from this layer rather than the router alone.
 - If the model stays in FAQ-only mode, use **`/web …`** first to confirm the API works, then try clearer web phrasing.
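
The automatic-web behaviour described above depends on a wording heuristic whose implementation is not shown here. The sketch below is one possible shape, assuming only the cues named in the text (*latest*, *today*, *who won*, *stock price*, and a recent year inside a question). The name `looks_like_fresh_fact_query`, the explicit `current_year` parameter, and the "this year or last year" window are all hypothetical choices for illustration.

```python
import re

# Cue words taken from the examples in the text above; the list is illustrative.
_FRESHNESS_CUES = re.compile(
    r"\b(?:latest|today|who won|stock price)\b", re.IGNORECASE
)
_YEAR = re.compile(r"\b(20\d{2})\b")


def looks_like_fresh_fact_query(message: str, current_year: int) -> bool:
    """Guess whether a message implies a need for fresh public facts."""
    text = message.strip()
    if not text:
        return False
    if _FRESHNESS_CUES.search(text):
        return True
    # "Recent year + question": a question that mentions this year or last year.
    is_question = text.endswith("?")
    years = [int(y) for y in _YEAR.findall(text)]
    return is_question and any(
        current_year - 1 <= y <= current_year for y in years
    )


fresh = looks_like_fresh_fact_query("What is the latest SpaceX launch?", 2025)
timeless = looks_like_fresh_fact_query("Explain how recursion works", 2025)
recent_year = looks_like_fresh_fact_query("What changed in the 2025 tax rules?", 2025)
old_year = looks_like_fresh_fact_query("What changed in the 2010 tax rules?", 2025)
```

Passing `current_year` in explicitly keeps the function deterministic and easy to test; a caller could supply `datetime.date.today().year`.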
 **6) Brain trace**
 - Default to "chat" when unsure; copy the entire user message into "text".
 - Do not invent facts for "grounded": if no clear facts/context, use "chat" instead.
 - Use **retrieve** for bundled FAQ / help-base search; use **web_search** when the user clearly needs the **public web** (today, external site, breaking news, "google this", etc.).
+- **web_search vs chat (critical):** choose **web_search** when a good answer depends on **recent events**, **live or site-specific data** (prices, sports scores, releases after your knowledge cutoff, "what happened today", laws/regulations that change), **verifying a claim against the public web**, or **finding an official URL**. Choose **chat** for timeless explanations, coding how-to without needing today's docs, brainstorming, role-play, or personal opinion where web snippets would not change the answer.
 - Extract minimal "text" for tool intents (do not repeat system chatter)."""
 VALID_INTENTS = frozenset(
         lines.append("Prefer fuller, well-structured explanations when they help the user.")
     if rformat == "bullets":
         lines.append("When listing multiple points, use markdown bullet or numbered lists.")
+    elif rformat == "prose":
+        lines.append(
+            "Prefer continuous paragraphs over bullet lists unless a very short list is clearer."
+        )
     audience = str(session.get("audience") or "normal").lower()
     if audience not in ("simple", "normal", "technical"):
         audience = "normal"
         action="store_true",
         help="Disable NL intent routing (plain chat only; slash commands still work).",
     )
+    p.add_argument(
+        "--no-auto-web",
+        action="store_true",
+        help="Disable chat→web_search heuristic (only explicit router web_search or /web uses Google CSE).",
+    )
     p.add_argument(
         "--router-max-new-tokens",
         type=int,
 def main() -> None:
     args = parse_args()
     _load_dotenv_if_present(_REPO)
+    if os.environ.get("NO_AUTO_WEB", "").strip().lower() in ("1", "true", "yes", "on"):
+        args.no_auto_web = True
     _gk, _gc, _, _ = read_google_cse_settings()
     cse_on = bool(_gk and _gc)
     _ensure_gradio_can_reach_localhost()
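
The `--no-auto-web` flag and the `NO_AUTO_WEB` override in the hunk above can be exercised in isolation. This is a minimal sketch under that reading; `build_parser` and `resolve_no_auto_web` are hypothetical helper names, and the environment is passed in as a plain mapping so the behaviour is easy to check.

```python
import argparse
from collections.abc import Mapping

# Same truthy spellings as the environment check in the hunk above.
_TRUTHY = ("1", "true", "yes", "on")


def build_parser() -> argparse.ArgumentParser:
    """Parser with only the flag relevant to this sketch."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--no-auto-web", action="store_true")
    return parser


def resolve_no_auto_web(argv: list[str], environ: Mapping[str, str]) -> bool:
    """True when either the CLI flag or the environment disables auto-web."""
    args = build_parser().parse_args(argv)
    if environ.get("NO_AUTO_WEB", "").strip().lower() in _TRUTHY:
        args.no_auto_web = True
    return args.no_auto_web


default_off = resolve_no_auto_web([], {})
via_flag = resolve_no_auto_web(["--no-auto-web"], {})
via_env = resolve_no_auto_web([], {"NO_AUTO_WEB": " Yes "})
env_zero = resolve_no_auto_web([], {"NO_AUTO_WEB": "0"})
```

In a real entry point the call would likely be `resolve_no_auto_web(sys.argv[1:], os.environ)`. Note that the environment variable can only turn the feature off; it never re-enables auto-web once the flag is given.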
             except Exception:
                 route = {"intent": "chat", "text": msg, "question": "", "context": ""}
+            g_key, g_cx, _, _ = read_google_cse_settings()
+            web_from_auto = False
+            if (
+                not args.no_auto_web
+                and route["intent"] == "chat"
+                and g_key
+                and g_cx
+                and heuristic_suggests_web_search(msg)
+            ):
+                route = {
+                    "intent": "web_search",
+                    "text": msg,
+                    "question": "",
+                    "context": "",
+                }
+                web_from_auto = True
             if route["intent"] == "web_search":
                 g_key, g_cx, g_num, g_safe = read_google_cse_settings()
                 q_web = (route["text"] or msg).strip()
+                _as = "+auto" if web_from_auto else ""
+                web_trace = f"web:CSE:cfg{_as}"
                 if g_key and g_cx and q_web:
                     try:
                         hits = google_cse_search(
                             safe=g_safe,
                         )
                         web_block = format_cse_hits_markdown(hits, for_chat=True)
+                        web_trace = f"web:CSE:{len(hits)}{_as}"
                     except Exception as ex:
                         web_block = (
                             f"(Google web search failed: {_clip(str(ex), 500)})\n\n"
                             "Answer from general knowledge where appropriate; do not invent URLs or page titles."
                         )
+                        web_trace = f"web:CSE:err{_as}"
                 elif not q_web:
                     web_block = "(Empty web search query. Ask again with a concrete search topic.)"
+                    web_trace = f"web:CSE:empty{_as}"
                 else:
                     web_block = (
                         "(Web search is not configured: set **GOOGLE_CSE_API_KEY** and **GOOGLE_CSE_CX** "
             chat_line = route["text"] or msg
+        sig_overrides, sig_extras, sig_trace_tags = analyze_embedded_prompt_signals(msg)
+        eff_session = dict(ub_session)
+        eff_session.update(sig_overrides)
         trace: list[str] = []
+        prompt_sig_active = bool(sig_overrides or sig_extras or sig_trace_tags)
+        if prompt_sig_active:
+            bits = [f"{k}={v}" for k, v in sorted(sig_overrides.items())]
+            bits.extend(sig_trace_tags)
+            trace.append("prompt_signals:" + "+".join(bits))
         extras: list[str] = []
+        _append_reply_style_hints(extras, eff_session)
+        extras.extend(sig_extras)
         if web_trace:
             trace.append(web_trace)
                 or mem_conn is not None
                 or effective_rag is not None
                 or bool(web_trace)
+                or prompt_sig_active
             )
         )
         if show_trace_footer and trace: