anriltine committed on
Commit 5406f45 · verified · 1 Parent(s): 32907e2

Deploy TinyModel1Space from GitHub Actions

README.md CHANGED
@@ -11,7 +11,17 @@ pinned: false
11
 
12
  # TinyModel1Space
13
 
14
- **Universal Brain chat** — natural-language routing, generative replies, FAQ RAG, SQLite memory, and TinyModel classification. Uses encoder **`HyperlinksSpace/TinyModel1`** (see also [model card](https://huggingface.co/HyperlinksSpace/TinyModel1)).
15
 
16
  - **App URL:** [https://hyperlinksspace-tinymodel1space.hf.space](https://hyperlinksspace-tinymodel1space.hf.space)
17
  - **Source:** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
@@ -27,6 +37,7 @@ pinned: false
27
  - `GOOGLE_CSE_CX` — Programmable Search Engine id (`cx` from the [PSE control panel](https://programmablesearchengine.google.com/controlpanel/all)); required together with `GOOGLE_CSE_API_KEY` for web search.
28
  - `GOOGLE_CSE_NUM` — results per request, 1–10 (default 5).
29
  - `GOOGLE_CSE_SAFE` — optional safe-search level for Google `cse.list` (e.g. `off`, `active`).
 
30
 
31
  ### Note
32
 
 
11
 
12
  # TinyModel1Space
13
 
14
+ **Universal Brain chat** — natural-language routing, a **small generative instruct model** (default in code: **SmolLM2-360M-Instruct**, override with **`HORIZON2_MODEL`**), **TinyModel1** encoder (**HyperlinksSpace/TinyModel1**), **FAQ hybrid RAG**, **SQLite memory**, optional **Google web search**, and **embedded prompt signals** (one-turn reply shape from long natural wording — see in-app **Testing embedded prompt signals**). See the [model card](https://huggingface.co/HyperlinksSpace/TinyModel1) for classifier weights.
15
+
16
+ | Capability | Notes |
17
+ | --- | --- |
18
+ | **Generation** | Chat + composed answers from the causal instruct LM (quality/size depends on **`HORIZON2_MODEL`** and CPU/GPU). |
19
+ | **Encoder** | 4-way **AG News–style** labels; **embeddings** for FAQ retrieval, **`/similarity`**, **`/nearest`**. |
20
+ | **FAQ RAG** | Bundled corpus; toggle injection from chat; grounding phrases in `/help`. |
21
+ | **Memory** | Scoped SQLite notes; use *Start a new private session* / *Switch to scope …* on shared demos. |
22
+ | **Routing** | JSON router for summarize / reformulate / grounded / retrieve / **web_search** / classify / similarity / embedding / nearest / memory / status. |
23
+ | **Web** | Requires **`GOOGLE_CSE_API_KEY`** + **`GOOGLE_CSE_CX`**; optional **auto-web** (disable with **`NO_AUTO_WEB`**). |
24
+ | **NL + embedded** | Short **session control** phrases + long-message **`prompt_signals:`** style overlays (see Gradio copy under the chat). |
25
 
26
  - **App URL:** [https://hyperlinksspace-tinymodel1space.hf.space](https://hyperlinksspace-tinymodel1space.hf.space)
27
  - **Source:** [https://github.com/HyperlinksSpace/TinyModel](https://github.com/HyperlinksSpace/TinyModel)
 
37
  - `GOOGLE_CSE_CX` — Programmable Search Engine id (`cx` from the [PSE control panel](https://programmablesearchengine.google.com/controlpanel/all)); required together with `GOOGLE_CSE_API_KEY` for web search.
38
  - `GOOGLE_CSE_NUM` — results per request, 1–10 (default 5).
39
  - `GOOGLE_CSE_SAFE` — optional safe-search level for Google `cse.list` (e.g. `off`, `active`).
40
+ - `NO_AUTO_WEB` — set to `1` / `true` / `on` to disable **automatic** chat→web upgrades (router + `/web` unchanged).
41
 
42
  ### Note
43
 
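The README change above says `NO_AUTO_WEB` accepts `1` / `true` / `on`, and that web search needs both `GOOGLE_CSE_API_KEY` and `GOOGLE_CSE_CX`. A minimal sketch of how such settings might be read follows. The helper names `env_flag` and `auto_web_enabled` are hypothetical, and the case-insensitive, whitespace-tolerant matching is an assumption the README does not state.

```python
import os

_TRUTHY = {"1", "true", "on"}  # the values listed in the README


def env_flag(name, environ=None):
    """Hypothetical helper: True when the variable holds one of the truthy values."""
    env = os.environ if environ is None else environ
    # Assumption: values are compared case-insensitively after trimming whitespace.
    return env.get(name, "").strip().lower() in _TRUTHY


def auto_web_enabled(environ=None):
    """Hypothetical check: auto-web needs both CSE settings and no opt-out flag."""
    env = os.environ if environ is None else environ
    has_keys = bool(env.get("GOOGLE_CSE_API_KEY")) and bool(env.get("GOOGLE_CSE_CX"))
    return has_keys and not env_flag("NO_AUTO_WEB", env)
```

Passing a plain dict as `environ` keeps the check testable without touching the process environment.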
scripts/google_cse_client.py CHANGED
@@ -11,6 +11,7 @@ from __future__ import annotations
11
 
12
  import json
13
  import os
 
14
  import urllib.error
15
  import urllib.parse
16
  import urllib.request
@@ -114,3 +115,90 @@ def format_cse_hits_markdown(hits: list[CSEHit], *, for_chat: bool) -> str:
114
  f"- **Snippet:** {h.snippet}\n"
115
  )
116
  return "\n".join(lines).strip()
11
 
12
  import json
13
  import os
14
+ import re
15
  import urllib.error
16
  import urllib.parse
17
  import urllib.request
 
115
  f"- **Snippet:** {h.snippet}\n"
116
  )
117
  return "\n".join(lines).strip()
118
+
119
+
120
+ def heuristic_suggests_web_search(msg: str) -> bool:
121
+ """True if ``msg`` likely needs live web results (used when the router returns ``chat``).
122
+
123
+ Conservative: skips code-like text, slash commands, short lines, and in-app / FAQ phrasing.
124
+ """
125
+ m = (msg or "").strip().lower()
126
+ if len(m) < 12:
127
+ return False
128
+ if m.startswith("/"):
129
+ return False
130
+ if "```" in m or m.startswith("def "):
131
+ return False
132
+ if any(
133
+ x in m
134
+ for x in (
135
+ "/retrieve",
136
+ "faq excerpt",
137
+ "this space",
138
+ "this app",
139
+ "your refund",
140
+ "your policy",
141
+ "your shipping",
142
+ "your terms",
143
+ "according to your faq",
144
+ "in your documentation",
145
+ )
146
+ ):
147
+ return False
148
+
149
+ phrases = (
150
+ "latest ",
151
+ "breaking ",
152
+ "breaking news",
153
+ " as of ",
154
+ "right now",
155
+ "today ",
156
+ "today's",
157
+ "tonight",
158
+ "yesterday",
159
+ "this week",
160
+ "this month",
161
+ "current president",
162
+ "current ceo",
163
+ "current prime minister",
164
+ "who won ",
165
+ "who won the",
166
+ "election results",
167
+ "stock price",
168
+ "share price",
169
+ "market cap",
170
+ "exchange rate",
171
+ "crypto price",
172
+ "weather in",
173
+ "forecast for",
174
+ "when is the next",
175
+ "still alive",
176
+ "world cup",
177
+ "olympics",
178
+ "super bowl",
179
+ "score of",
180
+ "official announcement",
181
+ "press release",
182
+ "release date",
183
+ "when did ",
184
+ "when does ",
185
+ "google ",
186
+ "search online",
187
+ "look up online",
188
+ "on wikipedia",
189
+ "according to the news",
190
+ "news about",
191
+ "headlines",
192
+ "rumor is",
193
+ "rumour is",
194
+ "is it true that",
195
+ "fact check",
196
+ "verify online",
197
+ )
198
+ if any(p in m for p in phrases):
199
+ return True
200
+ if re.search(r"\b20(2[4-9]|[3-9][0-9])\b", m) and re.search(
201
+ r"\b(who|what|when|where|why|how|did|does|do|is|are|was|were|will|has|have)\b", m
202
+ ):
203
+ return True
204
+ return False
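The docstring of `heuristic_suggests_web_search` says it is used when the router returns `chat`. A minimal sketch of that gating follows. `maybe_upgrade_to_web` is a hypothetical wrapper, and `looks_time_sensitive` is a cut-down stand-in for the heuristic with a much shorter phrase list, so treat it as an illustration rather than the Space's actual wiring.

```python
import re


def looks_time_sensitive(msg):
    """Cut-down stand-in for heuristic_suggests_web_search (illustration only)."""
    m = (msg or "").strip().lower()
    # Skip short lines, slash commands, and code-like text.
    if len(m) < 12 or m.startswith("/") or "```" in m:
        return False
    if any(p in m for p in ("latest ", "stock price", "weather in", "who won ")):
        return True
    # A year from 2024 onward plus a question-like word also counts.
    return bool(
        re.search(r"\b20(2[4-9]|[3-9][0-9])\b", m)
        and re.search(r"\b(who|what|when|where|why|how|did|does|is|are|was|were|will)\b", m)
    )


def maybe_upgrade_to_web(route, msg, *, auto_web=True):
    """Hypothetical wrapper: only a plain ``chat`` route is ever upgraded."""
    if route == "chat" and auto_web and looks_time_sensitive(msg):
        return "web_search"
    return route
```

Routes other than `chat` pass through untouched, which matches the README note that the router and `/web` are unchanged when auto-web is disabled.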
scripts/nl_controls.py CHANGED
@@ -650,3 +650,1284 @@ def parse_control_action(message: str) -> ControlAction | None:
650
 
651
  return None
652
 
650
 
651
  return None
652
 
653
+
654
+ # Tokens for "write the reply in …" detection (allowlist avoids "in Python" / "in 24 hours").
655
+ _REPLY_LANG_TOKENS: dict[str, str] = {
656
+ "arabic": "Arabic",
657
+ "chinese": "Chinese (Simplified)",
658
+ "czech": "Czech",
659
+ "danish": "Danish",
660
+ "dutch": "Dutch",
661
+ "english": "English",
662
+ "finnish": "Finnish",
663
+ "french": "French",
664
+ "german": "German",
665
+ "greek": "Greek",
666
+ "hebrew": "Hebrew",
667
+ "hindi": "Hindi",
668
+ "hungarian": "Hungarian",
669
+ "indonesian": "Indonesian",
670
+ "italian": "Italian",
671
+ "japanese": "Japanese",
672
+ "korean": "Korean",
673
+ "norwegian": "Norwegian",
674
+ "polish": "Polish",
675
+ "portuguese": "Portuguese",
676
+ "romanian": "Romanian",
677
+ "russian": "Russian",
678
+ "spanish": "Spanish",
679
+ "swedish": "Swedish",
680
+ "thai": "Thai",
681
+ "turkish": "Turkish",
682
+ "ukrainian": "Ukrainian",
683
+ "vietnamese": "Vietnamese",
684
+ }
685
+
686
+
687
+ _LEN_CAP_UNIT_MAX: dict[str, int] = {
688
+ "word": 2500,
689
+ "words": 2500,
690
+ "sentence": 80,
691
+ "sentences": 80,
692
+ "paragraph": 30,
693
+ "paragraphs": 30,
694
+ "line": 120,
695
+ "lines": 120,
696
+ }
697
+
698
+
699
+ def _length_cap_instruction(m: str) -> tuple[str, str] | None:
700
+ """If the user asked for a max length, return (system instruction, compact trace token)."""
701
+ if len(m) < 24:
702
+ return None
703
+ patterns = [
704
+ re.compile(
705
+ r"\b(?:in under|at most|no more than|under|within|no longer than)\s+(\d{1,4})\s+"
706
+ r"(words?|sentences?|paragraphs?|lines?)\b"
707
+ ),
708
+ re.compile(r"\b(?:max|maximum)\s+(\d{1,4})\s+(words?|sentences?|paragraphs?|lines?)\b"),
709
+ re.compile(r"\b(\d{1,4})\s+(words?)\s+(?:max|maximum|only|at most|or less)\b"),
710
+ ]
711
+ for rx in patterns:
712
+ mo = rx.search(m)
713
+ if not mo:
714
+ continue
715
+ n_raw, unit = mo.group(1), mo.group(2).lower()
716
+ try:
717
+ n = int(n_raw)
718
+ except ValueError:
719
+ continue
720
+ cap = _LEN_CAP_UNIT_MAX.get(unit)
721
+ if cap is None or n < 1 or n > cap:
722
+ continue
723
+ if unit.startswith("word"):
724
+ plural, short = "words", "w"
725
+ elif unit.startswith("sentence"):
726
+ plural, short = "sentences", "s"
727
+ elif unit.startswith("paragraph"):
728
+ plural, short = "paragraphs", "p"
729
+ else:
730
+ plural, short = "lines", "ln"
731
+ trace_tok = f"len_cap={n}{short}"
732
+ instr = (
733
+ f"The user requested a **tight length cap** of about **{n} {plural}** for the full assistant answer "
734
+ f"(including lists or headings). Stay at or under this cap; if it is impossible, say so in one short sentence "
735
+ "then give the closest fit."
736
+ )
737
+ return instr, trace_tok
738
+ return None
739
+
740
+
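The length-cap detector above extracts a number and a unit, then rejects values outside a per-unit ceiling. A minimal sketch of that idea in isolation follows. The name `parse_length_cap`, the single combined pattern, and the singular-unit table are illustrative assumptions, not code from this commit.

```python
import re

# Ceilings per unit, keyed by singular form (values taken from the diff above).
_MAX = {"word": 2500, "sentence": 80, "paragraph": 30, "line": 120}

# Both the number and the unit are captured, so group(1) and group(2) always exist.
_CAP_RX = re.compile(
    r"\b(?:at most|no more than|under|within|max|maximum)\s+(\d{1,4})\s+"
    r"(word|sentence|paragraph|line)s?\b"
)


def parse_length_cap(text):
    """Return (n, singular_unit) for a plausible cap, else None."""
    mo = _CAP_RX.search(text.lower())
    if not mo:
        return None
    n, unit = int(mo.group(1)), mo.group(2)
    if not 1 <= n <= _MAX[unit]:
        return None  # zero or implausibly large caps are ignored
    return n, unit
```

For example, `parse_length_cap("Explain DNS in at most 50 words")` yields `(50, "word")`, while a request for 9999 words is ignored because it exceeds the ceiling.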
741
+ def _code_only_instruction(m: str) -> str | None:
742
+ """Detect requests for code-heavy output with almost no prose."""
743
+ if len(m) < 18:
744
+ return None
745
+ if re.search(
746
+ r"\b(just the code|code only|only code|no prose,?\s*just code|no explanation,?\s*just (?:the )?code|"
747
+ r"skip (?:the )?explanation|omit (?:the )?explanation|(?:give|send|return)\s+me\s+only\s+the\s+code|"
748
+ r"output\s+only\s+(?:the\s+)?code)\b",
749
+ m,
750
+ ):
751
+ return (
752
+ "The user asked for **code-first output**: put the working solution in **one fenced markdown code block** "
753
+ "when the answer is code; keep any non-code text to **at most one short sentence** or omit it if the code "
754
+ "is self-explanatory."
755
+ )
756
+ return None
757
+
758
+
759
+ def _guided_discovery_instruction(m: str) -> tuple[str, str] | None:
760
+ """User wants hints, nudges, or questions instead of a fully worked answer on the first reply."""
761
+ if len(m) < 36:
762
+ return None
763
+ if not re.search(
764
+ r"\b(don'?t (give|spell|hand) (me )?(the )?full (answer|solution)|don'?t spoil the solution|"
765
+ r"hints? only|only hints|guide me with (hints|questions)|nudge me (in the right direction|toward)|"
766
+ r"i want to (figure|work) it out myself|socratic(\s+method)?|"
767
+ r"lead me to (the )?answer|questions first instead of answering|"
768
+ r"without (giving|spelling) (out )?(the )?(whole )?solution)\b",
769
+ m,
770
+ ):
771
+ return None
772
+ # Require a problem-seeking cue so casual chat ("no spoilers for the movie") does not flip modes.
773
+ if not re.search(
774
+ r"\b(why|how|explain|prove|derive|solve|puzzle|homework|problem|exercise|bug|code|implement|"
775
+ r"design|compare|understand|learn|teach|practice|algorithm|proof|debug|refactor)\b",
776
+ m,
777
+ ):
778
+ return None
779
+ instr = (
780
+ "The user asked for **guided discovery** (Socratic / hint-first): prefer short **questions**, "
781
+ "**nudges**, and **partial hints** over a complete solution in this turn. "
782
+ "If one concrete step is essential, show **at most one** move, then check whether they want to continue. "
783
+ "Offer the full worked answer if they say they are stuck or ask you to finish."
784
+ )
785
+ return instr, "guided"
786
+
787
+
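`_guided_discovery_instruction` requires two independent cues: a hint-seeking phrase and a problem-seeking word, so that casual chat does not flip modes. A minimal sketch of that two-stage gate follows. The helper `wants_guided_discovery` and its shortened patterns are hypothetical stand-ins for the longer expressions in the diff.

```python
import re

# Shortened cue lists for illustration; the diff above uses longer alternations.
_HINT_RX = re.compile(r"\b(hints? only|only hints|socratic|don'?t spoil the solution)\b")
_TASK_RX = re.compile(r"\b(why|how|prove|solve|homework|problem|bug|code|debug)\b")


def wants_guided_discovery(text, min_len=36):
    """Hypothetical helper: both an intent cue and a task cue must be present."""
    m = text.strip().lower()
    if len(m) < min_len:
        return False  # short messages are left to other controls
    return bool(_HINT_RX.search(m)) and bool(_TASK_RX.search(m))
```

Requiring both cues trades some recall for precision: a message that only says "hints only" about a film is left alone.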
788
+ def _ephemeral_privacy_instruction(m: str) -> tuple[str, str] | None:
789
+ """User asked not to treat this turn as content to persist (memory / logging)."""
790
+ if len(m) < 22:
791
+ return None
792
+ if re.search(
793
+ r"\b(off the record|no memory for this|nothing persisted|ephemeral question|ephemeral chat|"
794
+ r"don'?t log this|don'?t persist this|"
795
+ r"don'?t (?:remember|store) (?:this|that|it|anything)|"
796
+ r"do not (?:remember|store) (?:this|that|it)|"
797
+ r"please don'?t (?:remember|store) (?:this|that|it)|"
798
+ r"forget this after|don'?t save (?:this|that)\s+to\s+memory)\b",
799
+ m,
800
+ ):
801
+ instr = (
802
+ "The user signaled **ephemeral intent** for this reply: do **not** invite `/remember`, `/session`, or "
803
+ "long-term note-taking for this content; avoid urging them to store secrets, API keys, or passwords. "
804
+ "Still answer helpfully within normal safety and deployment limits."
805
+ )
806
+ return instr, "ephemeral"
807
+
808
+ return None
809
+
810
+
811
+ def _accessibility_sr_instruction(m: str) -> tuple[str, str] | None:
812
+ """User wants screen-reader / WCAG-minded answer structure (linear, semantic headings)."""
813
+ if len(m) < 44:
814
+ return None
815
+ if not re.search(
816
+ r"\b(screen[- ]?reader|screenreader|nvda|jaws|voiceover|talkback|orca|"
817
+ r"wcag(?:\s+[0-9]{1,2}(?:\.[0-9])?)?|\ba11y\b|accessibility|accessible to|"
818
+ r"blind users?|low vision|visually impaired)\b",
819
+ m,
820
+ ):
821
+ return None
822
+ audience = re.search(
823
+ r"\bfor\s+(?:blind|low-vision|screen[- ]?reader|a11y)\s+(?:users?|readers?|audiences?|visitors?)?\b",
824
+ m,
825
+ )
826
+ format_rq = re.search(
827
+ r"\b(friendly|friendlier|structure|structured|layout|linear|heading|headings|semantic|"
828
+ r"readable|reformat|format this|annotate|describe (?:the\s+)?(?:chart|diagram|figure|image)|"
829
+ r"please (?:reply|answer|write|help|summarize|reformat|structure)|"
830
+ r"how (?:should|can) i (?:write|format|publish))\b",
831
+ m,
832
+ )
833
+ if not audience and not format_rq:
834
+ return None
835
+ instr = (
836
+ "The user asked for **screen-reader–friendly / accessibility-aware** formatting: prefer a **clear linear reading order**; "
837
+ "use real markdown heading lines for sections when the answer is long; do **not** rely on a table as the **only** "
838
+ "place critical facts appear—repeat key facts in prose if you use a table; briefly describe any chart or diagram "
839
+ "in words; keep emoji sparse and never the sole carrier of meaning."
840
+ )
841
+ return instr, "a11y"
842
+
843
+
844
+ def _embedded_simple_audience(m: str) -> bool:
845
+ """True if a longer prompt asks for child-level / lay explanations (ELI5-style) in prose."""
846
+ if len(m) < 40:
847
+ return False
848
+ if re.search(
849
+ r"\b(expert mode|technical audience|assume i'?m technical|phd level|for experts|deep technical)\b",
850
+ m,
851
+ ):
852
+ return False
853
+ if not re.search(
854
+ r"\b(eli5|explain like i'?m(?:\s+a)? five|like i'?m(?:\s+a)? five\b|"
855
+ r"for (?:my )?kids to understand|total beginner|i'?m\s+a\s+beginner\b|beginner\s+here\b|"
856
+ r"non-technical (?:parent|reader|manager|audience)|"
857
+ r"lay audience|no technical background|zero prior knowledge)\b",
858
+ m,
859
+ ):
860
+ return False
861
+ return bool(
862
+ re.search(
863
+ r"\b(why|how|what|when|where|explain|describe|tell me|help me (?:to )?understand|walk me through|"
864
+ r"learn about|new to)\b",
865
+ m,
866
+ )
867
+ )
868
+
869
+
870
+ def _embedded_technical_audience(m: str) -> bool:
871
+ """True if a longer prompt asks for expert-depth explanations (not short *Expert mode* controls)."""
872
+ if len(m) < 40:
873
+ return False
874
+ if re.search(
875
+ r"\b(eli5|explain like i'?m(?:\s+a)? five|total beginner|i'?m\s+a\s+beginner\b|beginner\s+here\b|"
876
+ r"lay audience|no technical background|zero prior knowledge|explain simply)\b",
877
+ m,
878
+ ):
879
+ return False
880
+ if not re.search(
881
+ r"\b(expert mode|technical audience|assume i'?m technical|phd level|for experts|deep technical|"
882
+ r"staff engineer audience|senior (?:sre|eng|engineer) audience|"
883
+ r"use jargon freely|skip the basics|don'?t dumb (?:it )?down|"
884
+ r"peer[- ]level technical|internals[- ]focused|implementation[- ]heavy)\b",
885
+ m,
886
+ ):
887
+ return False
888
+ return bool(
889
+ re.search(
890
+ r"\b(why|how|what|when|where|explain|describe|tell me|walk me through|"
891
+ r"compare|design|architect|debug|troubleshoot|analyze|review|implement)\b",
892
+ m,
893
+ )
894
+ )
895
+
896
+
897
+ def _embedded_register_tone(m: str) -> str | None:
898
+ """One-shot formal vs casual register when prose names an audience (not the short *Formal tone* control)."""
899
+ if len(m) < 48:
900
+ return None
901
+ formal = re.search(
902
+ r"\b(board-ready|for regulators|regulatory filing|formal memo|audit[- ]friendly|"
903
+ r"client-facing|for (?:the\s+)?board(?:\s+of\s+directors)?\b|for leadership review|"
904
+ r"executive summary for|c[- ]suite|for executives|board presentation|investor[- ]ready|"
905
+ r"sec filing tone)\b",
906
+ m,
907
+ )
908
+ casual = re.search(
909
+ r"\b(slack message|teams message to the team|keep it casual|casual tone|friendly teammate|"
910
+ r"like you(?:'re|\s+are)\s+my coworker|water cooler|informal note|keep it light|"
911
+ r"pub chat|chatty tone)\b",
912
+ m,
913
+ )
914
+ if formal and not casual:
915
+ return "formal"
916
+ if casual and not formal:
917
+ return "casual"
918
+ return None
919
+
920
+
921
+ def _embedded_json_output(m: str) -> bool:
922
+ """True if a longer prompt asks for JSON-shaped output (not the short *Answer in JSON* control line)."""
923
+ if len(m) < 40:
924
+ return False
925
+ if re.search(
926
+ r"\b(no json|not json|avoid json|skip json|plain text only|no structured output|"
927
+ r"don'?t use json|without json)\b",
928
+ m,
929
+ ):
930
+ return False
931
+ return bool(
932
+ re.search(
933
+ r"\b(valid json|return json|reply in json|answer in json|json output|structured json|"
934
+ r"json object|json array|as json\b|as a json|machine[- ]readable json|emit json|"
935
+ r"serialize (?:to|as) json|output as json|respond with json)\b",
936
+ m,
937
+ )
938
+ )
939
+
940
+
941
+ def _embedded_speculation(m: str) -> str | None:
942
+ """``strict`` vs ``creative`` speculation level (not short *No speculation* / *Brainstorm freely* controls)."""
943
+ if len(m) < 44:
944
+ return None
945
+ creative = bool(
946
+ re.search(
947
+ r"\b(brainstorm freely|speculate freely|wild ideas(?:\s+ok)?|creative speculation|"
948
+ r"go ahead and guess|reasonable guesses welcome|speculate a bit|"
949
+ r"blue[- ]sky (?:thinking|ideas)|throw out (?:some )?possibilities|"
950
+ r"explore hypotheticals|what[- ]if scenarios (?:are )?welcome|"
951
+ r"ideation (?:mode|session)|open[- ]ended brainstorming)\b",
952
+ m,
953
+ )
954
+ )
955
+ strict = bool(
956
+ re.search(
957
+ r"\b(don'?t guess|no guessing|avoid guessing|only high confidence|stick to (?:the\s+)?facts|"
958
+ r"avoid halluc|no hallucinations|don'?t hallucinate|if you don'?t know say|"
959
+ r"if unsure say|say when you(?:'re|\s+are)\s+unsure|no speculation|avoid speculation|"
960
+ r"don'?t speculate|fact[- ]checked|grounded only|evidence[- ]based only|"
961
+ r"only if (?:you(?:'re|\s+are)\s+)?(?:certain|sure)|do not invent (?:facts|numbers))\b",
962
+ m,
963
+ )
964
+ )
965
+ if creative and strict:
966
+ return None
967
+ if strict:
968
+ return "strict"
969
+ if creative:
970
+ return "creative"
971
+ return None
972
+
973
+
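`_embedded_speculation` and several functions after it share one shape: test two opposing cue patterns and return `None` when neither or both match. A minimal sketch of that shape as a reusable helper follows. `resolve_opposed` and the shortened `STRICT` / `CREATIVE` patterns are hypothetical and do not appear in this commit.

```python
import re
from typing import Optional


def resolve_opposed(text, a, b, a_label, b_label, min_len=44) -> Optional[str]:
    """Hypothetical helper: return one label, or None if neither or both cues match."""
    m = text.strip().lower()
    if len(m) < min_len:
        return None
    hit_a, hit_b = bool(a.search(m)), bool(b.search(m))
    if hit_a == hit_b:
        return None  # no signal, or contradictory signals cancel out
    return a_label if hit_a else b_label


# Shortened cue lists for illustration only.
STRICT = re.compile(r"\b(don'?t guess|no speculation|stick to (?:the\s+)?facts)\b")
CREATIVE = re.compile(r"\b(brainstorm freely|speculate a bit|go ahead and guess)\b")
```

Treating contradictory cues as "no signal" keeps the default behavior in place instead of guessing which instruction the user meant.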
974
+ def _embedded_answer_lead(m: str) -> str | None:
975
+ """``tldr_first`` vs ``direct`` answer opening (not short *TLDR first* / *Answer directly* controls)."""
976
+ if len(m) < 44:
977
+ return None
978
+ direct = bool(
979
+ re.search(
980
+ r"\b(no tldr|skip (?:the )?summary|answer directly|without a (?:summary|tldr)|"
981
+ r"no executive summary|don'?t (?:add|give) a tldr|direct answer only|"
982
+ r"jump straight to the answer|no summary (?:upfront|at the top)|"
983
+ r"get straight to the (?:answer|point)|omit (?:the )?(?:opening )?summary)\b",
984
+ m,
985
+ )
986
+ )
987
+ tldr = bool(
988
+ re.search(
989
+ r"\b(tl;?dr first|tldr first|lead with (?:a\s+)?(?:one[- ]line\s+)?summary|summary first|"
990
+ r"executive summary first|bottom line up front|bluf|"
991
+ r"start with (?:a\s+)?(?:short\s+)?summary|headline first|"
992
+ r"give me the (?:key\s+)?takeaway first)\b",
993
+ m,
994
+ )
995
+ )
996
+ if direct and tldr:
997
+ return None
998
+ if direct:
999
+ return "direct"
1000
+ if tldr:
1001
+ return "tldr_first"
1002
+ return None
1003
+
1004
+
1005
+ def _embedded_actionability(m: str) -> str | None:
1006
+ """``commands`` vs ``conceptual`` actionability (not short *Make it actionable* / *Conceptual only* controls)."""
1007
+ if len(m) < 44:
1008
+ return None
1009
+ conceptual = bool(
1010
+ re.search(
1011
+ r"\b(conceptual only|high level only|no commands|without commands|no shell commands|"
1012
+ r"avoid command dumps|don'?t include (?:bash|shell|terminal) commands|"
1013
+ r"focus on concepts(?:\s+and\s+rationale)?|strategic discussion only|"
1014
+ r"architecture overview only|no runnable snippets|theory and tradeoffs only)\b",
1015
+ m,
1016
+ )
1017
+ )
1018
+ commands = bool(
1019
+ re.search(
1020
+ r"\b(include (?:a\s+)?(?:bash|sh|zsh|powershell) snippet|run(?:nable)? commands?|"
1021
+ r"copy[- ]paste (?:into )?(?:the\s+)?(?:terminal|shell)|"
1022
+ r"curl (?:one[- ]?liner|example)|one[- ]liner (?:for|to)|"
1023
+ r"bash one[- ]liner|powershell command|terminal commands?|"
1024
+ r"give me (?:the\s+)?(?:exact\s+)?commands?|include kubectl|"
1025
+ r"docker (?:run|compose) (?:example|snippet)|(?:pip|npm|pnpm|yarn) install (?:line|command)|"
1026
+ r"ready[- ]to[- ]run (?:script|snippet)|paste(?:able)? commands?)\b",
1027
+ m,
1028
+ )
1029
+ )
1030
+ if conceptual and commands:
1031
+ return None
1032
+ if conceptual:
1033
+ return "conceptual"
1034
+ if commands:
1035
+ return "commands"
1036
+ return None
1037
+
1038
+
1039
+ def _embedded_confidence_tone(m: str) -> str | None:
1040
+ """``transparent`` vs ``assertive`` confidence (not short *Flag assumptions* / *Be decisive* controls)."""
1041
+ if len(m) < 44:
1042
+ return None
1043
+ skip_transparent = bool(
1044
+ re.search(
1045
+ r"\b(no assumptions? section|skip (?:the\s+)?assumptions?|don'?t list assumptions|"
1046
+ r"without caveats|no caveats|omit limitations)\b",
1047
+ m,
1048
+ )
1049
+ )
1050
+ transparent = (
1051
+ not skip_transparent
1052
+ and (
1053
+ bool(
1054
+ re.search(
1055
+ r"\b(state|list|spell out|call out|identify|enumerate|label)\s+"
1056
+ r"(?:your\s+|the\s+|our\s+|key\s+|main\s+)?(?:key\s+|main\s+)?assumptions?\b",
1057
+ m,
1058
+ )
1059
+ )
1060
+ or bool(
1061
+ re.search(
1062
+ r"\b(assumptions?\s+and\s+limitations?|limitations?\s+and\s+caveats?|"
1063
+ r"limitations?\s+section|caveats?\s+(?:first|upfront|at\s+the\s+top)|"
1064
+ r"upfront\s+caveats?|scope\s+and\s+assumptions?|boundary\s+conditions?|"
1065
+ r"what\s+(?:we\s+)?(?:are\s+)?assuming\b|"
1066
+ r"explicit(?:ly)?\s+about\s+(?:limitations?|uncertainty|what\s+we\s+don'?t\s+know)|"
1067
+ r"where\s+this\s+(?:breaks?\s+down|stops?\s+working|doesn'?t\s+apply))\b",
1068
+ m,
1069
+ )
1070
+ )
1071
+ or bool(
1072
+ re.search(
1073
+ r"\b(flag|surface|highlight)\s+(?:key\s+)?(?:uncertainties|unknowns|gaps|risk\s+factors)\b",
1074
+ m,
1075
+ )
1076
+ )
1077
+ )
1078
+ )
1079
+ assertive = bool(
1080
+ re.search(
1081
+ r"\b(be decisive|don'?t hedge|give firm answers?|minimal hedging|"
1082
+ r"sound\s+confident|avoid disclaimers|confident (?:recommendation|tone)|"
1083
+ r"take a clear stance|no throat[- ]clearing|decisive recommendation)\b",
1084
+ m,
1085
+ )
1086
+ )
1087
+ if transparent and assertive:
1088
+ return None
1089
+ if transparent:
1090
+ return "transparent"
1091
+ if assertive:
1092
+ return "assertive"
1093
+ return None
1094
+
1095
+
1096
+ def _embedded_example_density(m: str) -> str | None:
1097
+ """``rich`` or ``sparse`` from prose (not the short *Include examples* / *Skip examples* control lines)."""
1098
+ if len(m) < 44:
1099
+ return None
1100
+ sparse = bool(
1101
+ re.search(
1102
+ r"\b(skip examples?|don'?t add examples?|don'?t include examples?|"
1103
+ r"without examples?|keep (?:it\s+)?abstract|theory[- ]only|abstract only|"
1104
+ r"example[- ]free|no examples? (?:please|in your (?:answer|reply))|"
1105
+ r"avoid illustrative examples?)\b",
1106
+ m,
1107
+ )
1108
+ )
1109
+ rich = bool(
1110
+ re.search(
1111
+ r"\b(include (?:at\s+least\s+)?(?:one|two|a few)\s+concrete examples?|"
1112
+ r"at least one (?:short\s+)?concrete example|"
1113
+ r"illustrate (?:this|that|it)\s+with (?:a\s+)?(?:concrete\s+|real[- ]world\s+)?example|"
1114
+ r"worked example|walk(?:\s+me)? through (?:a\s+)?(?:small|tiny|toy|minimal)\s+example|"
1115
+ r"\b(?:a\s+)?toy example\b|miniature scenario|"
1116
+ r"ground (?:this|it|your answer) in (?:a\s+)?(?:concrete\s+)?example|"
1117
+ r"give (?:me\s+)?a (?:concrete\s+)?example|"
1118
+ r"show (?:me\s+)?(?:this\s+)?with (?:a\s+)?(?:concrete\s+)?example)\b",
1119
+ m,
1120
+ )
1121
+ )
1122
+ if sparse and rich:
1123
+ return None
1124
+ if sparse:
1125
+ return "sparse"
1126
+ if rich:
1127
+ return "rich"
1128
+ return None
1129
+
1130
+
1131
+ def _embedded_exposition_order(m: str) -> str | None:
1132
+ """``definitions_first`` vs ``intuition_first`` from prose (not short *Definitions first* controls)."""
1133
+ if len(m) < 48:
1134
+ return None
1135
+ if re.search(
1136
+ r"\b(skip definitions first|don'?t start with definitions|"
1137
+ r"no formal definitions upfront)\b",
1138
+ m,
1139
+ ):
1140
+ return None
1141
+ if re.search(
1142
+ r"\b(skip the intuition|cut the warm[- ]?up|no hand[- ]?wavy intro)\b",
1143
+ m,
1144
+ ):
1145
+ return None
1146
+ defn = bool(
1147
+ re.search(
1148
+ r"\b(define (?:the\s+)?(?:key\s+)?terms? (?:first|before|upfront)|"
1149
+ r"definitions?\s+(?:first|before|upfront)|"
1150
+ r"start with (?:a\s+)?(?:brief\s+)?definition|"
1151
+ r"formal definitions? (?:first|before)|"
1152
+ r"precise definitions? before|"
1153
+ r"terminology (?:first|upfront)|"
1154
+ r"establish definitions before|"
1155
+ r"glossary[- ]style (?:intro|opening)|"
1156
+ r"define jargon before)\b",
1157
+ m,
1158
+ )
1159
+ )
1160
+ intu = bool(
1161
+ re.search(
1162
+ r"\b(intuition (?:first|before (?:the\s+)?(?:math|formal|proof|details?))|"
1163
+ r"big[- ]picture (?:first|before|then)|"
1164
+ r"high[- ]level intuition (?:first|before)|"
1165
+ r"motivation before (?:the\s+)?(?:formal|proof|math)|"
1166
+ r"informal (?:picture|overview) before|"
1167
+ r"start with (?:the\s+)?(?:big\s+picture|intuition|high[- ]level sketch)|"
1168
+ r"warm(?:\s+up)? with (?:an?\s+)?intuitive)\b",
1169
+ m,
1170
+ )
1171
+ )
1172
+ if defn and intu:
1173
+ return None
1174
+ if defn:
1175
+ return "definitions_first"
1176
+ if intu:
1177
+ return "intuition_first"
1178
+ return None
1179
+
1180
+
1181
+ def _embedded_followup_close(m: str) -> str | None:
1182
+ """``minimal`` vs ``suggest`` from prose (not short *No follow-up questions* controls)."""
1183
+ if len(m) < 48:
1184
+ return None
1185
+ minimal = bool(
1186
+ re.search(
1187
+ r"\b(no questions? at the end|don'?t (?:ask|end) with (?:a\s+)?questions?|"
1188
+ r"don'?t ask if i need (?:anything|more) else|don'?t ask whether i need more|"
1189
+ r"skip (?:the\s+)?(?:stock\s+)?closer|no follow[- ]up questions (?:at\s+the\s+)?(?:end|please)?|"
1190
+ r"don'?t prompt for follow[- ]ups?|finish crisply|stop after the core answer|"
1191
+ r"avoid rhetorical closers?|no offers? to help further|"
1192
+ r"don'?t (?:close|end) with (?:an?\s+)?(?:offer|invitation) to continue)\b",
1193
+ m,
1194
+ )
1195
+ )
1196
+ suggest = bool(
1197
+ re.search(
1198
+ r"\b(suggest next steps|optional next steps at the end|"
1199
+ r"end with (?:brief\s+)?(?:actionable\s+)?next steps|"
1200
+ r"close with suggested next actions|what should we do next|"
1201
+ r"offer ways to go deeper|give me follow[- ]ups? i can take|"
1202
+ r"recommend what to do next|include (?:optional\s+)?next steps)\b",
1203
+ m,
1204
+ )
1205
+ )
1206
+ if minimal and suggest:
1207
+ return None
1208
+ if minimal:
1209
+ return "minimal"
1210
+ if suggest:
1211
+ return "suggest"
1212
+ return None
1213
+
1214
+
1215
+ def _embedded_clarify_first(m: str) -> str | None:
1216
+ """``on`` vs ``off`` from prose (not short *Clarify first* / *No clarifying questions* controls)."""
1217
+ if len(m) < 48:
1218
+ return None
1219
+ off = bool(
1220
+ re.search(
1221
+ r"\b(no clarifying questions (?:first|please)?|don'?t ask clarifying questions|"
1222
+ r"skip clarifying questions|answer without asking questions first|"
1223
+ r"don'?t (?:pause to\s+)?ask questions first|"
1224
+ r"give (?:your\s+)?best answer without asking|"
1225
+ r"don'?t interrogate me first|skip the q&a preamble|"
1226
+ r"answer immediately (?:even\s+)?if (?:the\s+)?(?:spec|specs) (?:is|are) incomplete)\b",
1227
+ m,
1228
+ )
1229
+ )
1230
+ on = bool(
1231
+ re.search(
1232
+ r"\b(ask clarifying questions before (?:you\s+)?answer|"
1233
+ r"clarify (?:any\s+)?ambiguities before|"
1234
+ r"if anything is unclear ask me first|"
1235
+ r"before you (?:answer|dive in) ask (?:me\s+)?(?:what\s+you\s+need|any questions)|"
1236
+ r"pause and ask (?:me\s+)?(?:short\s+)?questions if|"
1237
+ r"confirm my (?:constraints|requirements) before|"
1238
+ r"ask what you need (?:to know )?first|"
1239
+ r"i may have left details out[-—]\s*ask|"
1240
+ r"feel free to ask (?:me\s+)?(?:1[-–]3\s+)?clarifying questions first)\b",
1241
+ m,
1242
+ )
1243
+ )
1244
+ if on and off:
1245
+ return None
1246
+ if off:
1247
+ return "off"
1248
+ if on:
1249
+ return "on"
1250
+ return None
1251
+
1252
+
1253
+ def _embedded_section_headings(m: str) -> str | None:
1254
+ """``prefer`` vs ``avoid`` for markdown ##/### structure (not short *Use section headings* controls)."""
1255
+ if len(m) < 48:
1256
+ return None
1257
+ avoid = bool(
1258
+ re.search(
1259
+ r"\b(flat answer|no section headings|avoid markdown headings|"
1260
+ r"no (?:##|hash)\s*style headings|without (?:##|markdown) title lines|"
1261
+ r"continuous prose (?:only|without headings)|"
1262
+ r"don'?t use (?:leading\s+)?#+\s*headings?|"
1263
+ r"skip (?:the\s+)?##\s*headers?)\b",
1264
+ m,
1265
+ )
1266
+ )
1267
+ prefer = bool(
1268
+ re.search(
1269
+ r"\b(use (?:markdown\s+)?(?:section\s+)?headings|organize with (?:markdown\s+)?headings|"
1270
+ r"structure (?:the\s+)?answer with (?:clear\s+)?headings|"
1271
+ r"break (?:it|this|the answer) into (?:titled\s+)?sections|"
1272
+ r"(?:clear\s+)?markdown headings for each|"
1273
+ r"(?:use|with)\s+##\s*(?:or|/)\s*###\s*headings|"
1274
+ r"top[- ]level headings for each (?:major\s+)?(?:topic|section))\b",
1275
+ m,
1276
+ )
1277
+ )
1278
+ if avoid and prefer:
1279
+ return None
1280
+ if avoid:
1281
+ return "avoid"
1282
+ if prefer:
1283
+ return "prefer"
1284
+ return None
1285
+
1286
+
1287
+ def _embedded_analogy_use(m: str) -> str | None:
1288
+ """``prefer`` vs ``avoid`` for analogies/metaphors (not short *Use analogies* / *No analogies* controls)."""
1289
+ if len(m) < 48:
1290
+ return None
1291
+ avoid = bool(
1292
+ re.search(
1293
+ r"\b(no analogies|skip metaphors|avoid metaphors|skip the analogies|"
1294
+ r"without analogies or metaphors|literal (?:explanations?|wording) only|"
1295
+ r"don'?t use analogies|don'?t use metaphors|no cute comparisons|"
1296
+ r"stick to literal (?:technical\s+)?(?:language|description|wording)|"
1297
+ r"keep (?:it\s+)?strictly literal)\b",
1298
+ m,
1299
+ )
1300
+ )
1301
+ prefer = bool(
1302
+ re.search(
1303
+ r"\b(use (?:a\s+)?(?:helpful\s+|tight\s+)?analogy|"
1304
+ r"explain (?:it\s+)?with (?:a\s+)?(?:simple\s+)?(?:real[- ]world\s+)?analogy|"
1305
+ r"include (?:a\s+)?(?:brief\s+)?(?:metaphor|analogy)|"
1306
+ r"liken (?:this|it) to (?:something|a\s+familiar)|"
1307
+ r"compare (?:this|it)\s+to (?:a\s+)?(?:real[- ]world|everyday)|"
1308
+ r"map (?:this|it) to an everyday example|"
1309
+ r"metaphor that helps|ground (?:the\s+)?idea in (?:an?\s+)?analogy)\b",
1310
+ m,
1311
+ )
1312
+ )
1313
+ if avoid and prefer:
1314
+ return None
1315
+ if avoid:
1316
+ return "avoid"
1317
+ if prefer:
1318
+ return "prefer"
1319
+ return None
1320
+
1321
+
1322
+ def _embedded_term_emphasis(m: str) -> str | None:
1323
+ """``highlight`` vs ``minimal`` inline bold (not short *Bold key terms* controls)."""
1324
+ if len(m) < 48:
1325
+ return None
1326
+ minimal = bool(
1327
+ re.search(
1328
+ r"\b(minimal bold|don'?t overuse bold|avoid excessive bold|"
1329
+ r"sparse bold|keep bold (?:to a )?minimum|"
1330
+ r"no bold except (?:for )?code|plain text without bold|"
1331
+ r"don'?t bold every|avoid bolding (?:whole|entire) sentences)\b",
1332
+ m,
1333
+ )
1334
+ )
1335
+ highlight = bool(
1336
+ re.search(
1337
+ r"\b(bold (?:the\s+)?(?:key\s+)?terms|highlight (?:the\s+)?(?:key\s+)?(?:terms|phrases)|"
1338
+ r"emphasize (?:the\s+)?(?:key\s+)?(?:terms|keywords)|"
1339
+ r"make (?:the\s+)?key terms stand out|"
1340
+ r"use bold (?:on|for) (?:a\s+)?(?:few\s+)?(?:key\s+)?(?:terms|phrases|keywords)|"
1341
+ r"so (?:execs|leadership|managers) can scan.{0,50}bold)\b",
1342
+ m,
1343
+ )
1344
+ )
1345
+ if minimal and highlight:
1346
+ return None
1347
+ if minimal:
1348
+ return "minimal"
1349
+ if highlight:
1350
+ return "highlight"
1351
+ return None
1352
+
1353
+
1354
+ def _embedded_acronym_style(m: str) -> str | None:
1355
+ """``spell_out`` vs ``terse`` acronym handling (not short *Spell out acronyms* controls)."""
1356
+ if len(m) < 48:
1357
+ return None
1358
+ terse = bool(
1359
+ re.search(
1360
+ r"\b(assume (?:i|we) know acronyms|don'?t expand acronyms|"
1361
+ r"keep acronyms as[- ]is|skip acronym expansion|"
1362
+ r"no need to spell out acronyms|acronym[- ]literate (?:audience|readers?)|"
1363
+ r"terse acronyms only)\b",
1364
+ m,
1365
+ )
1366
+ )
1367
+ spell = bool(
1368
+ re.search(
1369
+ r"\b(spell out acronyms|expand acronyms (?:on|at) first use|"
1370
+ r"define acronyms when you (?:use|introduce)|"
1371
+ r"write out acronyms (?:on|at) first mention|"
1372
+ r"full form (?:once|on first mention).{0,40}(?:acronym|initialism)|"
1373
+ r"expand (?:each\s+)?(?:api|sla|sso|gdpr|hipaa|pci)[- ]style (?:term|acronym)|"
1374
+ r"for (?:auditors|compliance|non-technical).{0,50}spell out)\b",
1375
+ m,
1376
+ )
1377
+ )
1378
+ if spell and terse:
1379
+ return None
1380
+ if terse:
1381
+ return "terse"
1382
+ if spell:
1383
+ return "spell_out"
1384
+ return None
1385
+
1386
+
1387
+ def _embedded_risk_posture(m: str) -> str | None:
1388
+ """``conservative`` vs ``pragmatic`` recommendation tone (not short *Be risk averse* controls)."""
1389
+ if len(m) < 48:
1390
+ return None
1391
+ pragmatic = bool(
1392
+ re.search(
1393
+ r"\b(optimize for speed|good enough is fine|be pragmatic about|"
1394
+ r"avoid over[- ]engineering|ship (?:it )?fast|move fast (?:and|&)|"
1395
+ r"time[- ]efficient (?:fix|approach|recommendation)|"
1396
+ r"practical trade[- ]offs over perfection|"
1397
+ r"don'?t gold[- ]plate|bias toward shipping)\b",
1398
+ m,
1399
+ )
1400
+ )
1401
+ conservative = bool(
1402
+ re.search(
1403
+ r"\b(err on the side of safety|be risk[- ]averse|"
1404
+ r"risk[- ]averse (?:recommendation|approach)|"
1405
+ r"choose the (?:safest|lower[- ]risk) option|"
1406
+ r"minimize (?:downside|blast radius)|"
1407
+ r"prefer (?:safer|low[- ]risk) (?:options?|paths?)|"
1408
+ r"conservative (?:recommendation|rollout|approach)|"
1409
+ r"safety[- ]first (?:for|on) (?:this|the) (?:rollout|migration|change))\b",
1410
+ m,
1411
+ )
1412
+ )
1413
+ if conservative and pragmatic:
1414
+ return None
1415
+ if conservative:
1416
+ return "conservative"
1417
+ if pragmatic:
1418
+ return "pragmatic"
1419
+ return None
1420
+
1421
+
1422
+ def _embedded_quote_style(m: str) -> str | None:
1423
+ """``quote`` vs ``paraphrase`` when relying on supplied FAQ excerpts (not short *Quote the FAQ* controls)."""
1424
+ if len(m) < 48:
1425
+ return None
1426
+ src = r"(?:faq|excerpt|policy|knowledge base|kb article|documentation|provided (?:text|docs))"
1427
+ paraphrase = bool(
1428
+ re.search(
1429
+ rf"\b(paraphrase (?:the )?{src}|paraphrase only|"
1430
+ rf"(?:don'?t|do not) quote (?:the )?{src}|no direct quotes? from (?:the )?{src}|"
1431
+ rf"summarize (?:the )?{src} in your own words|"
1432
+ rf"avoid quoting (?:the )?{src}|in your own words.{{0,40}}(?:faq|excerpt))\b",
1433
+ m,
1434
+ )
1435
+ )
1436
+ quote = bool(
1437
+ re.search(
1438
+ rf"\b((?<!not )(?<!don't )quote (?:the )?{src}|direct quotes? from (?:the )?{src}|"
1439
+ rf"cite with (?:direct )?quotes? when (?:you )?(?:use|reference) (?:the )?{src}|"
1440
+ rf"verbatim (?:quotes?|passages?) from (?:the )?{src}|"
1441
+ rf"include (?:a )?(?:short )?verbatim quote.{{0,50}}(?:faq|excerpt)|"
1442
+ rf"when you rely on (?:the )?{src}.{{0,50}}quote)\b",
1443
+ m,
1444
+ )
1445
+ )
1446
+ if quote and paraphrase:
1447
+ return None
1448
+ if quote:
1449
+ return "quote"
1450
+ if paraphrase:
1451
+ return "paraphrase"
1452
+ return None
1453
+
1454
+
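Patterns built with `rf"..."` so that `{src}` can be interpolated need extra care with bounded quantifiers. Inside an f-string, `{0,40}` is evaluated as the Python tuple `(0, 40)` and formatted into the pattern text, so the regex engine never sees a quantifier unless the braces are doubled. A small self-contained check, using an abbreviated `src` as a stand-in:

```python
import re

src = r"(?:faq|excerpt)"  # abbreviated stand-in for the src alternation

# Single braces: the f-string formats the tuple (0, 40) into the pattern text.
single = rf"in your own words.{0,40}{src}"
# Doubled braces: the regex engine receives the quantifier {0,40}.
doubled = rf"in your own words.{{0,40}}{src}"

text = "explain it in your own words, not by copying the faq"
print(single)
print(doubled)
print(bool(re.search(single, text)), bool(re.search(doubled, text)))
```

The single-brace form still compiles, because `(0, 40)` is a valid capture group, so the mistake shows up only as a pattern that never matches.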
1455
+ def _embedded_emoji_style(m: str) -> str | None:
1456
+ """``include`` vs ``avoid`` emoji in replies (not short *Use emoji* / *No emoji* controls)."""
1457
+ if len(m) < 48:
1458
+ return None
1459
+ avoid = bool(
1460
+ re.search(
1461
+ r"\b(no emojis? in (?:your|the) reply|avoid emoji|emoji[- ]free (?:reply|tone)|"
1462
+ r"don'?t use emoji|do not use emoji|keep (?:it\s+)?(?:strictly\s+)?professional.{0,40}no emoji|"
1463
+ r"without emoji|skip (?:the\s+)?emoji|no cute emoji|"
1464
+ r"plain text only.{0,30}no emoji)\b",
1465
+ m,
1466
+ )
1467
+ )
1468
+ include = bool(
1469
+ re.search(
1470
+ r"\b(use (?:a few\s+)?(?:tasteful\s+)?emoji|include emoji|emoji (?:are|is) ok|"
1471
+ r"emoji welcome|feel free to use emoji|sprinkle (?:in\s+)?emoji|"
1472
+ r"a few emoji (?:are|is) fine|light emoji (?:are|is) ok|"
1473
+ r"you may use emoji|add (?:a few\s+)?emoji (?:if|when) (?:helpful|appropriate))\b",
1474
+ m,
1475
+ )
1476
+ )
1477
+ if avoid and include:
1478
+ return None
1479
+ if avoid:
1480
+ return "avoid"
1481
+ if include:
1482
+ return "include"
1483
+ return None
1484
+
1485
+
1486
+ def _embedded_counterpoint_tone(m: str) -> str | None:
1487
+ """``challenge`` vs ``supportive`` pushback on plans (not short *Challenge my assumptions* controls)."""
1488
+ if len(m) < 52:
1489
+ return None
1490
+ ctx = (
1491
+ r"\b(plan|plans|design|approach|idea|ideas|architecture|proposal|strategy|"
1492
+ r"implementation|rollout|pitch|deck|draft|thesis|launch|release|migration|schema|"
1493
+ r"deployment|code|system)\b"
1494
+ )
1495
+ gentle = bool(
1496
+ re.search(r"\b(don'?t challenge|be gentle|go easy on me|no criticism|don'?t be harsh)\b", m)
1497
+ )
1498
+ challenge = bool(
1499
+ not gentle
1500
+ and re.search(
1501
+ r"\b(red team|red-team|stress[- ]?test|pick apart|tear down|what am i missing|sanity check|"
1502
+ r"challenge my|poke holes|find (?:weaknesses|gaps|flaws)|critique (?:my|this|our)|"
1503
+ r"devil'?s advocate)\b",
1504
+ m,
1505
+ )
1506
+ and re.search(ctx, m)
1507
+ )
1508
+ supportive = bool(
1509
+ re.search(
1510
+ r"\b(be supportive (?:of|about|with)|assume good intent|encourage my (?:idea|plan|proposal)|"
1511
+ r"constructive and supportive|help me build on (?:this|my) (?:idea|plan)|"
1512
+ r"coach me through (?:this|my) (?:idea|plan|pitch)|"
1513
+ r"gentle (?:feedback|pushback) on (?:my|this|our)|"
1514
+ r"avoid harsh criticism|frame improvements as next steps|"
1515
+ r"lean supportive and (?:specific|actionable))\b",
1516
+ m,
1517
+ )
1518
+ and re.search(ctx, m)
1519
+ )
1520
+ if challenge and supportive:
1521
+ return None
1522
+ if challenge:
1523
+ return "challenge"
1524
+ if supportive:
1525
+ return "supportive"
1526
+ return None
1527
+
1528
+
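The counterpoint detector combines three checks: a trigger phrase, a context noun such as *plan* or *design*, and a veto phrase that cancels the challenge reading. A reduced sketch of that gating follows. The pattern lists are shortened, and `wants_challenge` is a hypothetical name used only here.

```python
import re

# Shortened stand-ins for the context, veto, and trigger alternations.
CTX = r"\b(plan|design|approach|proposal|rollout|migration)\b"
VETO = r"\b(don'?t challenge|be gentle|go easy on me)\b"
CHALLENGE = r"\b(red team|stress[- ]?test|poke holes|devil'?s advocate)\b"


def wants_challenge(m: str) -> bool:
    """True only when a trigger and a context noun are present and no veto phrase is."""
    if re.search(VETO, m):
        return False  # the veto wins over any trigger phrase
    return bool(re.search(CHALLENGE, m) and re.search(CTX, m))


print(wants_challenge("please red team the rollout plan for our new payments service"))
print(wants_challenge("red team is a fun term in security"))
print(wants_challenge("poke holes in my design but be gentle with the wording"))
```

Requiring a context noun keeps phrases like "red team" from triggering pushback when they are merely the topic of the question.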
1529
+ def _embedded_math_detail(m: str) -> str | None:
1530
+ """``show_work`` vs ``final_only`` for math-like answers (not short *Show your work* controls)."""
1531
+ if len(m) < 44:
1532
+ return None
1533
+ show = bool(
1534
+ re.search(
1535
+ r"\b(show your work|show (?:all )?(?:the )?steps|with (?:a )?derivation|prove (that|it)|rigorously|"
1536
+ r"walk through (?:the )?derivation|show intermediate steps|step[- ]by[- ]step derivation|"
1537
+ r"derive (?:it|the result) (?:step by step|explicitly))\b",
1538
+ m,
1539
+ )
1540
+ )
1541
+ final = bool(
1542
+ re.search(
1543
+ r"\b(final answer only|no derivation|skip (?:the )?steps|just (?:give )?(?:me )?the (?:final )?result|"
1544
+ r"don'?t show your work|do not show your work|answer without (?:showing )?steps|"
1545
+ r"no intermediate steps|closed[- ]form (?:answer|result) only|"
1546
+ r"(?:numerical |numeric )?answer only.{0,30}no steps)\b",
1547
+ m,
1548
+ )
1549
+ )
1550
+ if show and final:
1551
+ return None
1552
+ if show:
1553
+ return "show_work"
1554
+ if final and re.search(
1555
+ r"\b(math|equation|integral|derivative|probability|calculate|calculus|algebra|"
1556
+ r"proof|formula|theorem|matrix|solve|statistics|bayes|variance|expected value)\b",
1557
+ m,
1558
+ ):
1559
+ return "final_only"
1560
+ return None
1561
+
1562
+
1563
+ def _embedded_faq_grounding(m: str) -> str | None:
1564
+ """``strict`` vs ``relaxed`` FAQ/RAG grounding (not short *Strict FAQ* / *Relaxed FAQ* controls)."""
1565
+ if len(m) < 48:
1566
+ return None
1567
+ src = r"(?:faq|excerpt|policy|knowledge base|kb article|documentation|provided excerpts|retrieved passages)"
1568
+ strict = bool(
1569
+ re.search(
1570
+ rf"\b(stick to (?:the )?{src}|only use (?:the )?{src}|"
1571
+ rf"only trust (?:the )?{src}|faq[- ]only (?:for|on) (?:this|the)|"
1572
+ rf"strict faq (?:grounding|only)|if (?:it(?:'s| is) )?not in (?:the )?{src}.{{0,40}}(?:say|admit)|"
1573
+ rf"don'?t go beyond (?:the )?{src}|must be supported by (?:the )?{src}|"
1574
+ rf"policy claims must come from (?:the )?{src}|"
1575
+ rf"grounded strictly in (?:the )?{src})\b",
1576
+ m,
1577
+ )
1578
+ )
1579
+ relaxed = bool(
1580
+ re.search(
1581
+ rf"\b(faq plus general knowledge|mix (?:the )?{src} with general knowledge|"
1582
+ rf"relaxed faq (?:grounding|mode)|"
1583
+ rf"general knowledge (?:is )?ok.{{0,50}}(?:faq|excerpt|policy|documentation)|"
1584
+ rf"(?:faq|excerpt|policy|documentation).{{0,50}}general knowledge (?:is )?ok|"
1585
+ rf"supplement (?:the )?{src} with (?:brief )?general[- ]knowledge|"
1586
+ rf"beyond (?:the )?{src} you may add (?:brief )?general context)\b",
1587
+ m,
1588
+ )
1589
+ )
1590
+ if strict and relaxed:
1591
+ return None
1592
+ if strict:
1593
+ return "strict"
1594
+ if relaxed:
1595
+ return "relaxed"
1596
+ return None
1597
+
1598
+
1599
+ def _embedded_code_block_style(m: str) -> str | None:
1600
+ """``fenced`` vs ``inline`` code layout (not short *Use code fences* / *Inline code only* controls)."""
1601
+ if len(m) < 48:
1602
+ return None
1603
+ code_ctx = (
1604
+ r"\b(code|snippet|command|script|bash|shell|python|curl|kubectl|docker|sql|"
1605
+ r"regex|yaml|terraform|powershell|config|api call|terminal)\b"
1606
+ )
1607
+ fenced = bool(
1608
+ re.search(
1609
+ r"\b(use code fences|fenced code blocks?|markdown code fences?|"
1610
+ r"triple[- ]backtick fences?|put (?:the )?(?:code|commands?|script) in (?:a )?fenced block|"
1611
+ r"use markdown fenced code blocks?|wrap (?:the )?(?:code|snippet) in (?:triple )?backticks)\b",
1612
+ m,
1613
+ )
1614
+ )
1615
+ inline = bool(
1616
+ re.search(
1617
+ r"\b(inline code only|no triple backticks?|no fenced code blocks?|"
1618
+ r"avoid code fences|single backticks? only|don'?t use fenced blocks?|"
1619
+ r"keep (?:code|snippets?) inline|inline backticks? only)\b",
1620
+ m,
1621
+ )
1622
+ )
1623
+ if not fenced and not inline:
1624
+ return None
1625
+ if not re.search(code_ctx, m):
1626
+ return None
1627
+ if fenced and inline:
1628
+ return None
1629
+ if fenced:
1630
+ return "fenced"
1631
+ return "inline"
1632
+
1633
+
1634
+ def _embedded_reply_format(m: str) -> str | None:
1635
+ """``bullets`` vs ``prose`` list layout (not short *Use bullet points* / *No bullets* controls)."""
1636
+ if len(m) < 48:
1637
+ return None
1638
+ prose = bool(
1639
+ re.search(
1640
+ r"\b(no bullets?|plain paragraphs?|prose only|stop using lists|"
1641
+ r"continuous prose only|avoid bullet lists?|write in paragraphs|"
1642
+ r"paragraph form only|don'?t use bullet points?|"
1643
+ r"keep (?:it\s+)?in (?:flowing )?prose|not as a bulleted list)\b",
1644
+ m,
1645
+ )
1646
+ )
1647
+ bullets = bool(
1648
+ re.search(
1649
+ r"\b(bullet points?|bulleted list|use bullets|format as bullets|"
1650
+ r"list (?:the\s+)?key points in bullets|markdown bullets?|"
1651
+ r"give me a bulleted list|bullet(?:ed)? format)\b",
1652
+ m,
1653
+ )
1654
+ )
1655
+ if prose and bullets:
1656
+ return None
1657
+ if prose:
1658
+ return "prose"
1659
+ if bullets:
1660
+ return "bullets"
1661
+ return None
1662
+
1663
+
1664
+ def _embedded_comparison_frame(m: str) -> str | None:
1665
+ """``pros_cons`` vs ``narrative`` comparison layout (not short *Use pros and cons* controls)."""
1666
+ if len(m) < 48:
1667
+ return None
1668
+ if not re.search(r"\b(compare|comparing|comparison|contrasted?|contrast|trade-?offs?)\b", m):
1669
+ return None
1670
+ narrative = bool(
1671
+ re.search(
1672
+ r"\b(no pros|without pros|avoid pros|no pros\/cons|no pros and cons sections?)\b",
1673
+ m,
1674
+ )
1675
+ or re.search(
1676
+ r"\b(flowing prose|continuous prose|narrative comparison|prose comparison only|"
1677
+ r"compare in flowing prose)\b",
1678
+ m,
1679
+ )
1680
+ )
1681
+ pros = bool(
1682
+ re.search(
1683
+ r"\b(trade-?offs?|(?<!no )pros and cons|advantages and disadvantages)\b",
1684
+ m,
1685
+ )
1686
+ or re.search(r"\bdifference between\b.+\band\b", m)
1687
+ or re.search(r"\b(compare|comparing|comparison|contrasted?|contrast)\b.+\b(vs\.?|versus)\b", m)
1688
+ or (
1689
+ re.search(r"\b(compare|comparing|comparison)\b", m)
1690
+ and re.search(r"\b(and|with)\b", m)
1691
+ and len(m) >= 72
1692
+ and re.search(
1693
+ r"\b(versus|vs\.?|option|approach|tool|stack|framework|language|model|database|db|cloud)\b",
1694
+ m,
1695
+ )
1696
+ )
1697
+ )
1698
+ if narrative and pros:
1699
+ return None
1700
+ if narrative:
1701
+ return "narrative"
1702
+ if pros:
1703
+ return "pros_cons"
1704
+ return None
1705
+
1706
+
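Several patterns use a fixed-width negative lookbehind such as `(?<!no )` so that a negated phrase does not count as a request. A short sketch of what that guard does and does not cover (the sample sentences are made up for illustration):

```python
import re

# Same guard as in the comparison detector: reject the phrase when "no " precedes it.
PROS = re.compile(r"\b(?<!no )pros and cons\b")

print(bool(PROS.search("compare redis and memcached, pros and cons")))
print(bool(PROS.search("compare them but no pros and cons please")))
# Other negations are not covered by the fixed-width guard.
print(bool(PROS.search("compare them, not any pros and cons")))
```

Because the lookbehind only inspects the three characters before the phrase, wording such as "not any pros and cons" still matches. In the detector this is presumably why a separate negative pattern family exists alongside the lookbehind.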
1707
+ def _embedded_step_style(m: str) -> str | None:
1708
+ """``numbered`` vs ``continuous`` procedure layout (not short *Step by step* controls)."""
1709
+ if len(m) < 48:
1710
+ return None
1711
+ continuous = bool(
1712
+ re.search(
1713
+ r"\b(no numbered steps|don'?t number steps|skip step numbers|"
1714
+ r"prose without steps|avoid numbered step lists?|"
1715
+ r"continuous prose (?:only|instead)|not as numbered steps|"
1716
+ r"connected paragraphs?(?:\s+only)?|explain as (?:flowing )?prose)\b",
1717
+ m,
1718
+ )
1719
+ )
1720
+ numbered = bool(
1721
+ re.search(r"\b(step by step|step-by-step)\b", m)
1722
+ or re.search(r"\b(walk me through|show me how)\b", m)
1723
+ or re.search(r"\b(?<!no )(?:use )?numbered steps\b|break it into steps\b", m)
1724
+ or (
1725
+ re.search(r"\b(how do i|how can i|how should i|how to)\b", m)
1726
+ and re.search(
1727
+ r"\b(install|set up|setup|configure|enable|deploy|migrate|upgrade|fix|debug|troubleshoot)\b",
1728
+ m,
1729
+ )
1730
+ )
1731
+ )
1732
+ if continuous and numbered:
1733
+ return None
1734
+ if continuous:
1735
+ return "continuous"
1736
+ if numbered:
1737
+ return "numbered"
1738
+ return None
1739
+
1740
+
1741
+ def _reply_lang_phrase(m: str) -> str | None:
1742
+ """Return display name (e.g. 'French') if the user asked for a reply in a known language."""
1743
+ for mo in re.finditer(
1744
+ r"\b(respond|answer|reply|write|explain)\s+(?:in|using)\s+([a-z]{3,20})\b(?:\s*[.?!]|$|,|\s+please|\s+thanks)?",
1745
+ m,
1746
+ ):
1747
+ tok = mo.group(2)
1748
+ if tok in _REPLY_LANG_TOKENS:
1749
+ return _REPLY_LANG_TOKENS[tok]
1750
+ mo = re.search(
1751
+ r"\b(translate|translating)\s+(?:this|that|it|your answer|the above|my text)\s+(?:to|into)\s+([a-z]{3,20})\b",
1752
+ m,
1753
+ )
1754
+ if mo and mo.group(2) in _REPLY_LANG_TOKENS:
1755
+ return _REPLY_LANG_TOKENS[mo.group(2)]
1756
+ mo = re.search(r"\b(entire reply|whole answer|full answer)\s+(?:in|using)\s+([a-z]{3,20})\b", m)
1757
+ if mo and mo.group(2) in _REPLY_LANG_TOKENS:
1758
+ return _REPLY_LANG_TOKENS[mo.group(2)]
1759
+ # Trailing clause: "... in french" / "... in spanish, please"
1760
+ tail = m[-100:] if len(m) > 100 else m
1761
+ mo = re.search(r"\b(in|into)\s+([a-z]{3,20})\s*[,.]?\s*(?:please|thanks)?\s*[.!]?\s*$", tail)
1762
+ if mo and mo.group(2) in _REPLY_LANG_TOKENS:
1763
+ return _REPLY_LANG_TOKENS[mo.group(2)]
1764
+ return None
1765
+
1766
+
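The language lookup can be pictured as two passes: an explicit verb form ("answer in french") and a trailing clause at the end of the message. The sketch below is illustrative only. `LANG_TOKENS` is a hypothetical stand-in for `_REPLY_LANG_TOKENS`, whose contents are not shown in this diff, and the two patterns are simplified variants rather than copies of the ones above.

```python
from __future__ import annotations

import re

# Hypothetical stand-in; the real _REPLY_LANG_TOKENS mapping is defined elsewhere.
LANG_TOKENS = {"french": "French", "spanish": "Spanish", "german": "German"}

# Explicit verb form: "answer in french", "reply using spanish".
VERB_FORM = re.compile(
    r"\b(?:respond|answer|reply|write|explain)\s+(?:in|using)\s+([a-z]{3,20})\b"
)
# Trailing clause: "... in french", "... in spanish, please".
TAIL_FORM = re.compile(
    r"\b(?:in|into)\s+([a-z]{3,20})\s*[,.]?\s*(?:please|thanks)?\s*[.!]?\s*$"
)


def reply_language(message: str) -> str | None:
    """Return a display name such as 'French', or None when no known language is named."""
    m = message.lower().strip()
    for mo in VERB_FORM.finditer(m):
        if mo.group(1) in LANG_TOKENS:
            return LANG_TOKENS[mo.group(1)]
    mo = TAIL_FORM.search(m[-100:])  # only the end of the message counts as a trailing clause
    if mo and mo.group(1) in LANG_TOKENS:
        return LANG_TOKENS[mo.group(1)]
    return None


print(reply_language("Please answer in French and keep it short"))
print(reply_language("Summarize the release notes in Spanish, please"))
print(reply_language("write in pseudocode first"))
```

Checking the captured token against a closed mapping is what keeps phrases like "write in pseudocode" or "explain in detail" from being read as language requests.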
1767
+ def analyze_embedded_prompt_signals(message: str) -> tuple[dict[str, str], list[str], list[str]]:
1768
+ """Infer reply-style preferences from wording inside longer questions (one-shot overlays).
1769
+
1770
+ Used only when ``parse_control_action`` does not treat the line as a dedicated control
1771
+ command. Conservative patterns avoid hijacking short chit-chat.
1772
+
1773
+ Returns:
1774
+ (field_overrides, extra_system_paragraphs, trace_tags) — overrides use the same keys/values as
1775
+ ``ub_session`` reply-style fields; extra paragraphs are appended as separate system sections;
1776
+ ``trace_tags`` are short tokens for the brain-trace ``prompt_signals:`` line (e.g. ``language``,
1777
+ ``code_only``, ``len_cap=80w``, ``guided``, ``ephemeral``, ``a11y``). Session-style overrides
1778
+ (e.g. ``confidence_tone=transparent``) appear as ``key=value`` tokens in the same line.
1779
+ """
1780
+ m = _norm(message)
1781
+ overrides: dict[str, str] = {}
1782
+ extras: list[str] = []
1783
+ trace_tags: list[str] = []
1784
+
1785
+ if len(m) >= 24:
1786
+ lang = _reply_lang_phrase(m)
1787
+ if lang:
1788
+ trace_tags.append("language")
1789
+ extras.append(
1790
+ f"The user asked (via natural wording) for the assistant reply in **{lang}**. "
1791
+ f"Write the **entire** answer in {lang}, including headings and lists, unless a quoted passage must stay "
1792
+ "verbatim in another language."
1793
+ )
1794
+
1795
+ co = _code_only_instruction(m)
1796
+ if co:
1797
+ trace_tags.append("code_only")
1798
+ extras.append(co)
1799
+
1800
+ lc = _length_cap_instruction(m)
1801
+ if lc:
1802
+ extras.append(lc[0])
1803
+ trace_tags.append(lc[1])
1804
+
1805
+ gd = _guided_discovery_instruction(m)
1806
+ if gd:
1807
+ extras.append(gd[0])
1808
+ trace_tags.append(gd[1])
1809
+
1810
+ ep = _ephemeral_privacy_instruction(m)
1811
+ if ep:
1812
+ extras.append(ep[0])
1813
+ trace_tags.append(ep[1])
1814
+
1815
+ ax = _accessibility_sr_instruction(m)
1816
+ if ax:
1817
+ extras.append(ax[0])
1818
+ trace_tags.append(ax[1])
1819
+
1820
+ if _embedded_simple_audience(m):
1821
+ overrides["audience"] = "simple"
1822
+ elif _embedded_technical_audience(m):
1823
+ overrides["audience"] = "technical"
1824
+
1825
+ ert = _embedded_register_tone(m)
1826
+ if ert:
1827
+ overrides["register_tone"] = ert
1828
+
1829
+ if _embedded_json_output(m):
1830
+ overrides["output_format"] = "json"
1831
+
1832
+ spc = _embedded_speculation(m)
1833
+ if spc:
1834
+ overrides["speculation"] = spc
1835
+
1836
+ ald = _embedded_answer_lead(m)
1837
+ if ald:
1838
+ overrides["answer_lead"] = ald
1839
+
1840
+ act = _embedded_actionability(m)
1841
+ if act:
1842
+ overrides["actionability"] = act
1843
+
1844
+ cft = _embedded_confidence_tone(m)
1845
+ if cft:
1846
+ overrides["confidence_tone"] = cft
1847
+
1848
+ exd = _embedded_example_density(m)
1849
+ if exd:
1850
+ overrides["example_density"] = exd
1851
+
1852
+ if len(m) < 48:
1853
+ return overrides, extras, trace_tags
1854
+
1855
+ cmf = _embedded_comparison_frame(m)
1856
+ if cmf:
1857
+ overrides["comparison_frame"] = cmf
1858
+
1859
+ stl = _embedded_step_style(m)
1860
+ if stl:
1861
+ overrides["step_style"] = stl
1862
+
1863
+ # Tables when the user names the shape they want.
1864
+ if re.search(r"\b(no tables?|without a table|avoid tables?)\b", m):
1865
+ overrides["table_style"] = "avoid"
1866
+ elif re.search(
1867
+ r"\b(in a table|as a table|markdown table|tabular format|two-?column|rows and columns)\b",
1868
+ m,
1869
+ ):
1870
+ overrides["table_style"] = "prefer"
1871
+
1872
+ rpf = _embedded_reply_format(m)
1873
+ if rpf:
1874
+ overrides["reply_format"] = rpf
1875
+
1876
+ mth = _embedded_math_detail(m)
1877
+ if mth:
1878
+ overrides["math_detail"] = mth
1879
+
1880
+ cpt = _embedded_counterpoint_tone(m)
1881
+ if cpt:
1882
+ overrides["counterpoint_tone"] = cpt
1883
+
1884
+ eord = _embedded_exposition_order(m)
1885
+ if eord:
1886
+ overrides["exposition_order"] = eord
1887
+
1888
+ fuc = _embedded_followup_close(m)
1889
+ if fuc:
1890
+ overrides["followup_close"] = fuc
1891
+
1892
+ clf = _embedded_clarify_first(m)
1893
+ if clf:
1894
+ overrides["clarify_first"] = clf
1895
+
1896
+ shd = _embedded_section_headings(m)
1897
+ if shd:
1898
+ overrides["section_headings"] = shd
1899
+
1900
+ anu = _embedded_analogy_use(m)
1901
+ if anu:
1902
+ overrides["analogy_use"] = anu
1903
+
1904
+ tem = _embedded_term_emphasis(m)
1905
+ if tem:
1906
+ overrides["term_emphasis"] = tem
1907
+
1908
+ acs = _embedded_acronym_style(m)
1909
+ if acs:
1910
+ overrides["acronym_style"] = acs
1911
+
1912
+ rsk = _embedded_risk_posture(m)
1913
+ if rsk:
1914
+ overrides["risk_posture"] = rsk
1915
+
1916
+ qst = _embedded_quote_style(m)
1917
+ if qst:
1918
+ overrides["quote_style"] = qst
1919
+
1920
+ emj = _embedded_emoji_style(m)
1921
+ if emj:
1922
+ overrides["emoji_style"] = emj
1923
+
1924
+ fgr = _embedded_faq_grounding(m)
1925
+ if fgr:
1926
+ overrides["faq_grounding"] = fgr
1927
+
1928
+ cbs = _embedded_code_block_style(m)
1929
+ if cbs:
1930
+ overrides["code_block_style"] = cbs
1931
+
1932
+ return overrides, extras, trace_tags
1933
+
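The docstring above says that trace tags and `key=value` override tokens end up on one `prompt_signals:` line in the brain trace. How the caller joins them is not visible in this part of the diff, so the following is only a sketch under assumptions: `format_prompt_signals` is a hypothetical name, and the comma separator and sorted key order are choices made for the example.

```python
from __future__ import annotations


def format_prompt_signals(overrides: dict[str, str], trace_tags: list[str]) -> str | None:
    """Join trace tags and key=value overrides into one trace line (assumed format)."""
    tokens = list(trace_tags) + [f"{key}={value}" for key, value in sorted(overrides.items())]
    if not tokens:
        return None  # nothing detected: emit no trace line
    return "prompt_signals: " + ", ".join(tokens)


overrides = {"step_style": "numbered", "confidence_tone": "transparent"}
tags = ["language", "len_cap=80w"]
print(format_prompt_signals(overrides, tags))
```

Sorting the override keys makes the trace line stable across runs, which helps when comparing traces in tests.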
scripts/universal_brain_chat.py CHANGED
@@ -104,14 +104,15 @@ from horizon3_store import ( # noqa: E402
104
  from google_cse_client import ( # noqa: E402
105
  format_cse_hits_markdown,
106
  google_cse_search,
 
107
  read_google_cse_settings,
108
  )
109
- from nl_controls import parse_control_action # noqa: E402
110
  from rag_faq_smoke import _pick_model, hybrid_retrieve, load_chunks # noqa: E402
111
  from tinymodel_runtime import TinyModelRuntime # noqa: E402
112
 
113
  HELP_TEXT = """**How to use**
114
- - **Normal language:** ask in plain English (or mixed); the app **infers** what you want (summarize, search FAQ, save a note, etc.).
115
  - **Session controls (say it in chat, no slash command):**
116
  - *What is my current scope?*, *Show my session settings* -> prints scope + toggles (FAQ context, routing, trace)
117
  - *Start a new private session*, *Begin a fresh scope* -> generates a **new memory scope key** so notes are isolated from the shared default demo scope
@@ -157,6 +158,7 @@ HELP_TEXT = """**How to use**
157
  - **Rewrite** professionally / rephrase
158
  - **Answer using only** these facts — include both facts and question
159
  - **Search** the FAQ / **find** in the knowledge base
 
160
  - **Classify** (topic model) this paragraph
161
  - **Similarity:** are these two snippets close in meaning? (encoder cosine)
162
  - **Embedding** stats for a passage (dimension, norm, preview)
@@ -172,7 +174,18 @@ If routing misfires, try rephrasing or use a slash command; **`--no-smart-route`
172
  # Shown under the chat + controls in the Gradio UI (Hugging Face Space and local).
173
  GRADIO_INSTRUCTIONS_MARKDOWN = """### About this Space
174
 
175
- **Universal Brain** pairs a small **generative** model with the **TinyModel1** encoder (AG News–style topics: World, Business, Sports, Sci/Tech), **FAQ retrieval** over a bundled corpus, **SQLite memory** scoped per session, and **natural-language routing** so many tasks work without slash commands. First CPU startup can take a few minutes while weights download.
176
 
177
  ---
178
 
@@ -184,6 +197,55 @@ GRADIO_INSTRUCTIONS_MARKDOWN = """### About this Space
184
 
185
  ---
186
 
187
  ### What to try (step-by-step)
188
 
189
  | Goal | What to type |
@@ -248,6 +310,7 @@ If you see an error about HTTP 403 or “API key not valid”, fix the key or en
248
  - Ensure **smart routing** is on (say *Turn on smart routing* if you turned it off).
249
  - Ask in plain language for **live web** / **Google** / **today’s** information, e.g. *Search the web for the latest SpaceX launch summary* or *What does the web say about …?*
250
  - The router uses intent **`web_search`**: the app fetches snippets, injects them into the model context, then the assistant replies **using those sources** (cite **[Web n]** when using a snippet).
 
251
  - If the model stays in FAQ-only mode, use **`/web …`** first to confirm the API works, then try clearer web phrasing.
252
 
253
  **6) Brain trace**
@@ -328,6 +391,7 @@ Rules:
328
  - Default to "chat" when unsure; copy the entire user message into "text".
329
  - Do not invent facts for "grounded": if no clear facts/context, use "chat" instead.
330
  - Use **retrieve** for bundled FAQ / help-base search; use **web_search** when the user clearly needs the **public web** (today, external site, breaking news, "google this", etc.).
 
331
  - Extract minimal "text" for tool intents (do not repeat system chatter)."""
332
 
333
  VALID_INTENTS = frozenset(
@@ -1224,6 +1288,10 @@ def _append_reply_style_hints(extras: list[str], session: dict[str, Any]) -> Non
1224
  lines.append("Prefer fuller, well-structured explanations when they help the user.")
1225
  if rformat == "bullets":
1226
  lines.append("When listing multiple points, use markdown bullet or numbered lists.")
 
 
 
 
1227
  audience = str(session.get("audience") or "normal").lower()
1228
  if audience not in ("simple", "normal", "technical"):
1229
  audience = "normal"
@@ -1773,6 +1841,11 @@ def parse_args() -> argparse.Namespace:
1773
  action="store_true",
1774
  help="Disable NL intent routing (plain chat only; slash commands still work).",
1775
  )
 
 
 
 
 
1776
  p.add_argument(
1777
  "--router-max-new-tokens",
1778
  type=int,
@@ -1786,6 +1859,8 @@ def parse_args() -> argparse.Namespace:
1786
  def main() -> None:
1787
  args = parse_args()
1788
  _load_dotenv_if_present(_REPO)
 
 
1789
  _gk, _gc, _, _ = read_google_cse_settings()
1790
  cse_on = bool(_gk and _gc)
1791
  _ensure_gradio_can_reach_localhost()
@@ -1968,10 +2043,28 @@ def main() -> None:
1968
  except Exception:
1969
  route = {"intent": "chat", "text": msg, "question": "", "context": ""}
1970
 
1971
  if route["intent"] == "web_search":
1972
  g_key, g_cx, g_num, g_safe = read_google_cse_settings()
1973
  q_web = (route["text"] or msg).strip()
1974
- web_trace = "web:CSE:cfg"
 
1975
  if g_key and g_cx and q_web:
1976
  try:
1977
  hits = google_cse_search(
@@ -1982,16 +2075,16 @@ def main() -> None:
1982
  safe=g_safe,
1983
  )
1984
  web_block = format_cse_hits_markdown(hits, for_chat=True)
1985
- web_trace = f"web:CSE:{len(hits)}"
1986
  except Exception as ex:
1987
  web_block = (
1988
  f"(Google web search failed: {_clip(str(ex), 500)})\n\n"
1989
  "Answer from general knowledge where appropriate; do not invent URLs or page titles."
1990
  )
1991
- web_trace = "web:CSE:err"
1992
  elif not q_web:
1993
  web_block = "(Empty web search query. Ask again with a concrete search topic.)"
1994
- web_trace = "web:CSE:empty"
1995
  else:
1996
  web_block = (
1997
  "(Web search is not configured: set **GOOGLE_CSE_API_KEY** and **GOOGLE_CSE_CX** "
@@ -2024,9 +2117,19 @@ def main() -> None:
2024
 
2025
  chat_line = route["text"] or msg
2026
 
 
 
 
2027
  trace: list[str] = []
 
 
 
 
 
2028
  extras: list[str] = []
2029
- _append_reply_style_hints(extras, ub_session)
 
 
2030
  if web_trace:
2031
  trace.append(web_trace)
2032
 
@@ -2096,6 +2199,7 @@ def main() -> None:
2096
  or mem_conn is not None
2097
  or effective_rag is not None
2098
  or bool(web_trace)
 
2099
  )
2100
  )
2101
  if show_trace_footer and trace:
 
104
  from google_cse_client import ( # noqa: E402
105
  format_cse_hits_markdown,
106
  google_cse_search,
107
+ heuristic_suggests_web_search,
108
  read_google_cse_settings,
109
  )
110
+ from nl_controls import analyze_embedded_prompt_signals, parse_control_action # noqa: E402
111
  from rag_faq_smoke import _pick_model, hybrid_retrieve, load_chunks # noqa: E402
112
  from tinymodel_runtime import TinyModelRuntime # noqa: E402
113
 
114
  HELP_TEXT = """**How to use**
115
+ - **Normal language:** ask in plain English (or mixed); the app **infers** what you want (summarize, search FAQ, save a note, etc.). Longer prompts may also **imply** reply shape for that turn only (for example trade-off questions → Pros/Cons layout or flowing prose comparison, “in a table” → markdown table preference, “answer in Spanish” → reply language, **code only** → code-first output, **in under N words** → length cap, **hints only / don’t give the full solution** → guided discovery, **red team / sanity check my plan** → challenge-style pushback, **be supportive / assume good intent on my plan** → supportive coaching, **don’t remember this / off the record** → ephemeral hint, **screen reader friendly / WCAG** → accessibility layout hint, **ELI5 / lay audience in a long question** → beginner audience, **assume I'm technical / expert depth in prose** → technical audience, **board-ready / Slack-casual wording** → formal or casual register, **valid JSON / return JSON in prose** → JSON output mode, **don’t guess / stick to facts in prose** → strict speculation, **brainstorm freely / wild ideas in prose** → creative speculation, **TLDR first / BLUF in prose** → summary-first open, **answer directly / skip the summary in prose** → direct opening, **FAQ direct quotes vs paraphrase-only in prose** → quote style for excerpts, **emoji ok vs no emoji in prose** → emoji style, **FAQ-only vs FAQ-plus-general-knowledge in prose** → FAQ grounding, **show work vs final-answer-only in prose** → math detailing, **state assumptions / limitations / caveats** in prose → transparent confidence tone, **be decisive / don’t hedge in prose** → assertive confidence tone, **curl/bash/kubectl in prose** → runnable commands, **conceptual only / no commands in prose** → conceptual actionability, **bullet points vs plain paragraphs in prose** → reply format, **step-by-step vs continuous procedure prose in long prompts** → step style, **concrete / worked / toy example in prose** → richer examples, **example-free / skip examples in prose** → sparser examples, **define terms first / intuition or big-picture first in prose** → explanation order, **no questions at the end / suggest next steps in prose** → closing style, **ask questions before answering / answer without clarifiers in prose** → clarify-first mode, **markdown section headings vs flat prose in long prompts** → section layout, **analogy vs literal-only in long prompts** → analogy style, **bold key terms vs minimal bold in long prompts** → term emphasis, **spell out acronyms vs terse acronyms in long prompts** → acronym style, **err on the side of safety vs ship-fast pragmatism in long prompts** → risk posture, **fenced code blocks vs inline-only snippets in long prompts** → code block style) — see *Brain trace* **`prompt_signals:`** when detected.
116
  - **Session controls (say it in chat, no slash command):**
117
  - *What is my current scope?*, *Show my session settings* -> prints scope + toggles (FAQ context, routing, trace)
118
  - *Start a new private session*, *Begin a fresh scope* -> generates a **new memory scope key** so notes are isolated from the shared default demo scope
 
158
  - **Rewrite** professionally / rephrase
159
  - **Answer using only** these facts — include both facts and question
160
  - **Search** the FAQ / **find** in the knowledge base
161
+ - **Live web** (news, prices, “latest …”, fact-checking) — router uses **web_search**; with Google CSE configured, the server may also **auto-run** web search when your wording implies it (see brain trace **`+auto`**). Disable with **`--no-auto-web`** or env **`NO_AUTO_WEB=1`** on your own deployment.
162
  - **Classify** (topic model) this paragraph
163
  - **Similarity:** are these two snippets close in meaning? (encoder cosine)
164
  - **Embedding** stats for a passage (dimension, norm, preview)
 
174
  # Shown under the chat + controls in the Gradio UI (Hugging Face Space and local).
175
  GRADIO_INSTRUCTIONS_MARKDOWN = """### About this Space
176
 
177
+ **Universal Brain** is a **text** assistant built from: (1) a **small generative instruct model** (default in code: **SmolLM2-360M-Instruct**, overridable via **`HORIZON2_MODEL`**), (2) the **TinyModel1** encoder for **4-way topics** (World, Business, Sports, Sci/Tech) plus **embeddings**, (3) **FAQ hybrid retrieval** over a bundled corpus, (4) **SQLite memory** per session scope, (5) a **JSON intent router** so many tasks work in plain language, (6) optional **Google web search** when `GOOGLE_CSE_API_KEY` + `GOOGLE_CSE_CX` are set (`/web`, smart-route **web search**, optional **auto-web**). First CPU startup can take a few minutes while weights download.
178
+
179
+ #### What it can do (summary)
180
+
181
+ | Area | Capacity |
182
+ | --- | --- |
183
+ | **Chat & tools** | Summarize, rewrite, grounded Q&A (`|||` facts), FAQ search, **live web** (if configured), classify, similarity, embeddings, nearest-option, **/status**, memory CRUD — via natural language or **`/…`** shortcuts. |
184
+ | **Encoder** | Soft **topic hint** + trace line **`classify:…`**; **`/classify`** for full label probabilities. |
185
+ | **RAG** | Injects top FAQ **chunks**; tune strictness with phrases like *Strict FAQ* (see `/help`). |
186
+ | **Memory** | Long-term + session notes; **scope** isolation phrases for demos; export / forget from chat. |
187
+ | **Style (long prompts)** | **Embedded prompt signals** adjust **one turn** from wording (tradeoffs, tables, steps, language, code-only, caps, hints-only, red-team, ephemeral, a11y, ELI5-style, register, JSON, strict facts, BLUF, shell commands, assumptions, examples, teach-order, closers, clarify-first, headings, analogies) — see **`prompt_signals:`** in the *Brain trace* and the table **Testing embedded prompt signals** below. |
188
+ | **Limits** | Small models can **hallucinate** or miss nuance; FAQ/web only **constrain** answers when relevant snippets exist. **Not multimodal** here. Shared default **memory scope** is not private auth. |
189
 
190
  ---
191
 
 
197
 
198
  ---
199
 
200
+ ### Testing embedded prompt signals (this Space)
201
+
202
+ These behaviors apply when your line is handled as **normal chat** (not a short dedicated control like *Be brief*). The app scans your wording and adds **one-turn** system hints. Say **Show the brain trace** first, send a message, then scroll to the bottom of the assistant reply.
203
+
204
+ | Goal | What to type (examples) | What to look for |
205
+ | --- | --- | --- |
206
+ | Comparison: pros/cons | In a **long** message, ask for **tradeoffs**, **pros and cons**, **compare X vs Y**, or **advantages and disadvantages** between concrete options (avoid mixing with **no pros and cons** / **flowing prose comparison** in the same line). | **`comparison_frame=pros_cons`** in **`prompt_signals:`**; reply should use **Pros** / **Cons** sections |
207
+ | Comparison: narrative prose | In a **long** comparison question, ask for **flowing prose**, **narrative comparison**, **prose comparison only**, or **no pros and cons sections** (avoid mixing with **pros and cons** / **tradeoffs** layout cues in the same line). | **`comparison_frame=narrative`** in **`prompt_signals:`**; reply should weave the comparison in continuous prose |
208
+ | Length cap | End your question with **in under 80 words** or **at most 3 sentences**. | **`len_cap=80w`** or **`len_cap=3s`** in **`prompt_signals:`**; the model should stay near that cap |
209
+ | Code-only | Ask for a tiny snippet and add **code only, no explanation** (or **just the code**). | **`code_only`** in **`prompt_signals:`**; reply should be mostly a fenced code block |
210
+ | Tables / language | Ask for a summary **in a markdown table**, or **answer in spanish** in the same line as your question. | **`table_style=prefer`** and/or **`language`** in **`prompt_signals:`** |
211
+ | Numbered steps vs continuous prose | In a **long** how-to message, ask **step by step**, **walk me through**, **numbered steps**, or a **how to install/configure** style question. **Or** say **no numbered steps**, **continuous prose only**, **prose without steps**, **explain as connected paragraphs** (avoid mixing both in one line). | **`step_style=numbered`** or **`step_style=continuous`** in **`prompt_signals:`**; reply should use numbered steps or flowing prose accordingly |
212
+ | Bullets vs prose | In a **long** message, ask for **bullet points**, **use bullets**, **bulleted list**, **format as bullets**, etc. **Or** say **no bullets**, **plain paragraphs**, **prose only**, **avoid bullet lists** (avoid mixing both in one line). | **`reply_format=bullets`** or **`reply_format=prose`** in **`prompt_signals:`**; reply should list points or stay in paragraphs accordingly |
213
+ | Guided discovery (hints / Socratic) | Ask a **how / why** question and say you want **hints only** or **don’t give me the full solution yet** (keep the message substantive, not a one-liner control). | **`guided`** in **`prompt_signals:`**; first reply should skew toward questions and nudges |
214
+ | Red-team / critique | In one paragraph, describe a **plan or design** and ask for a **red team**, **sanity check**, **what am I missing**, or **devil’s advocate** review (not a one-line control). | **`counterpoint_tone=challenge`** inside **`prompt_signals:`**; reply should stress-test assumptions |
215
+ | Supportive coaching | In one paragraph, describe a **plan, pitch, or idea** and ask to **be supportive**, **assume good intent**, **encourage my proposal**, **gentle feedback**, or **avoid harsh criticism** (not a one-line control; avoid mixing with red-team wording in the same line). | **`counterpoint_tone=supportive`** in **`prompt_signals:`**; reply should coach with constructive next steps, not harsh critique |
216
+ | Ephemeral / no memory | Say **off the record**, **don’t remember this**, **no memory for this**, or **don’t log this** in the same message as your question (demo: shared Space scopes are not true secrecy). | **`ephemeral`** in **`prompt_signals:`**; assistant should avoid pushing `/remember` for that content |
217
+ | Accessibility / screen readers | Ask for a **screen reader friendly** or **WCAG-aware** answer, or say the write-up is **for blind readers** / **for NVDA users** in a full sentence (not a one-word ping). | **`a11y`** in **`prompt_signals:`**; reply should favor linear structure, headings, and non-table-only facts |
218
+ | Beginner / ELI5 in context | In a **longer** question (not a one-line control), ask for **ELI5**, **explain like I'm five**, **total beginner**, **lay audience**, **no technical background**, etc., plus a normal **what/why/how** ask. | **`audience=simple`** in **`prompt_signals:`**; reply should use plain language and minimal jargon |
219
+ | Technical / expert audience | In a **longer** question (not a one-line control), say you're a **technical audience**, **assume I'm technical**, want a **deep technical** or **internals-focused** explanation, **skip the basics**, **staff-engineer level**, etc., plus a normal **what/why/how** ask (avoid mixing with ELI5/beginner wording in the same line). | **`audience=technical`** in **`prompt_signals:`**; reply may use domain jargon and skip hand-holding |
220
+ | Formal vs casual register | Ask for a **board-ready** / **client-facing** / **formal memo** / **for regulators** write-up, **or** say you want a **Slack message**, **keep it casual**, **water cooler** tone (one dominant style per message). | **`register_tone=formal`** or **`register_tone=casual`** in **`prompt_signals:`** |
221
+ | JSON / structured output | In a **long** message, ask for **valid JSON**, **return JSON**, **as a JSON object**, **machine-readable JSON**, etc. (say **no json** / **plain text only** if you want to opt out). | **`output_format=json`** in **`prompt_signals:`**; reply should be parseable JSON when practical |
222
+ | Strict facts / low speculation | In a **long** message, ask to **not guess**, **avoid hallucinations**, **only high confidence**, **stick to facts**, **if unsure say so**, etc. (avoid mixing with **brainstorm freely** in the same line). | **`speculation=strict`** in **`prompt_signals:`**; reply should label uncertainty clearly |
223
+ | Creative brainstorming | In a **long** message, ask to **brainstorm freely**, **speculate freely**, welcome **wild ideas**, do **blue-sky thinking**, or **explore hypotheticals** (avoid mixing with **don’t guess** / **stick to facts** in the same line). | **`speculation=creative`** in **`prompt_signals:`**; reply may propose speculative ideas with clear assumption labels |
224
+ | Summary / BLUF first | In a **long** message, ask to **TLDR first**, **lead with a one-line summary**, **bottom line up front**, **BLUF**, **executive summary first**, etc. (avoid mixing with **answer directly** / **skip the summary** in the same line). | **`answer_lead=tldr_first`** in **`prompt_signals:`**; reply should open with a short summary line |
225
+ | Direct answer (no TL;DR) | In a **long** message, ask to **answer directly**, **skip the summary**, **no TL;DR**, **jump straight to the answer**, or **omit the opening summary** (avoid mixing with **BLUF** / **summary first** in the same line). | **`answer_lead=direct`** in **`prompt_signals:`**; reply should start in-flow without a standalone TL;DR prelude |
226
+ | Runnable commands | In a **long** message, ask for **curl one-liner**, **bash snippet**, **kubectl**, **copy-paste into terminal**, **docker run example**, etc. (avoid mixing with **conceptual only** / **no commands** in the same line). | **`actionability=commands`** in **`prompt_signals:`**; reply should include concrete commands where sensible |
227
+ | Conceptual only (no commands) | In a **long** message, ask for **conceptual only**, **high level only**, **no shell commands**, **focus on concepts and rationale**, or an **architecture overview without command dumps** (avoid mixing with **kubectl** / **copy-paste into terminal** in the same line). | **`actionability=conceptual`** in **`prompt_signals:`**; reply should avoid runnable command dumps |
228
+ | Assumptions / limitations | In a **long** message, ask to **state your assumptions**, **assumptions and limitations**, **caveats upfront**, **scope and assumptions**, **what we are assuming**, or to **flag key uncertainties** (say **skip assumptions** to opt out; avoid mixing with **be decisive** in the same line). | **`confidence_tone=transparent`** in **`prompt_signals:`**; reply should surface assumptions, limits, and uncertainty clearly |
229
+ | Decisive / confident tone | In a **long** message, ask to **be decisive**, **don’t hedge**, **give firm answers**, **sound confident**, or **avoid disclaimers** (avoid mixing with **state your assumptions** / **caveats upfront** in the same line). | **`confidence_tone=assertive`** in **`prompt_signals:`**; reply should be direct with minimal hedging |
230
+ | Concrete examples vs example-free | In a **long** message, ask for a **worked example**, **walk me through a toy example**, **illustrate with a concrete example**, **ground your answer in an example**, etc. **Or** ask to **skip examples**, **theory only**, **keep it abstract**, **example-free** (avoid mixing both in one line). | **`example_density=rich`** or **`example_density=sparse`** in **`prompt_signals:`**; reply should include or omit short illustrative examples accordingly |
231
+ | Explanation order | In a **long** message, ask to **define terms first**, **definitions before details**, **formal definitions upfront**, **terminology first**, etc. **Or** ask for **intuition before math**, **big picture first**, **motivation before the formal proof**, **start with the high-level sketch** (avoid asking for both orders in one line). | **`exposition_order=definitions_first`** or **`exposition_order=intuition_first`** in **`prompt_signals:`**; reply should lead with definitions or with intuition accordingly |
232
+ | Closing / follow-ups | In a **long** message, ask for **no questions at the end**, **don’t ask if I need anything else**, **finish crisply**, **skip the stock closer**, etc. **Or** ask to **suggest next steps**, **end with actionable next steps**, **what should we do next**, **offer ways to go deeper** (avoid mixing both in one line). | **`followup_close=minimal`** or **`followup_close=suggest`** in **`prompt_signals:`**; reply should omit or include a light optional follow-up line accordingly |
233
+ | Clarify-first vs answer-first | In a **long** message, ask to **ask clarifying questions before you answer**, **if anything is unclear ask me first**, **confirm my constraints before**, etc. **Or** say **no clarifying questions**, **answer without asking questions first**, **don’t interrogate me first**, **give your best answer without asking** (avoid mixing both in one line). | **`clarify_first=on`** or **`clarify_first=off`** in **`prompt_signals:`**; first reply should ask brief questions first or answer directly |
234
+ | Section headings vs flat | In a **long** message, ask to **use markdown headings**, **organize with headings**, **structure the answer with clear headings**, **h2 or h3 headings for each topic**, etc. **Or** ask for a **flat answer**, **no section headings**, **avoid markdown headings**, **continuous prose only** (avoid mixing both in one line). | **`section_headings=prefer`** or **`section_headings=avoid`** in **`prompt_signals:`**; reply should use or avoid `##` / `###` title lines accordingly |
235
+ | Analogies vs literal | In a **long** message, ask to **use a helpful analogy**, **explain with a simple analogy**, **liken this to something familiar**, **map it to an everyday example**, etc. **Or** say **no analogies**, **skip metaphors**, **literal explanations only**, **stick to literal technical description** (avoid mixing both in one line). | **`analogy_use=prefer`** or **`analogy_use=avoid`** in **`prompt_signals:`**; reply may include one tight analogy or stay metaphor-free accordingly |
236
+ | Bold key terms vs minimal bold | In a **long** message, ask to **bold the key terms**, **highlight important phrases**, **make key terms stand out** for scanning, etc. **Or** say **minimal bold**, **don’t overuse bold**, **avoid excessive bold**, **sparse bold** (avoid mixing both in one line). | **`term_emphasis=highlight`** or **`term_emphasis=minimal`** in **`prompt_signals:`**; reply should use selective **bold** on keywords or keep bold sparse |
237
+ | Acronym expansion vs terse | In a **long** message, ask to **spell out acronyms**, **expand acronyms on first use**, **define acronyms when you introduce them** (e.g. for compliance readers). **Or** say **assume I know acronyms**, **don’t expand acronyms**, **keep acronyms as-is**, **acronym-literate audience** (avoid mixing both in one line). | **`acronym_style=spell_out`** or **`acronym_style=terse`** in **`prompt_signals:`**; reply should expand once as `Long Form (ACRONYM)` or reuse acronyms without expansion |
238
+ | Risk posture (safe vs pragmatic) | In a **long** message, ask to **err on the side of safety**, **minimize downside**, **prefer low-risk options**, **safety-first rollout**, etc. **Or** say **optimize for speed**, **be pragmatic**, **avoid over-engineering**, **good enough is fine**, **ship fast** (avoid mixing both in one line). | **`risk_posture=conservative`** or **`risk_posture=pragmatic`** in **`prompt_signals:`**; recommendations should favor safety or practical speed accordingly |
239
+ | FAQ quote vs paraphrase | In a **long** message about **FAQ / policy / excerpt** text, ask to **quote the FAQ excerpts**, **include direct quotes from the policy**, **verbatim passages from the excerpt**, etc. **Or** say **paraphrase the FAQ**, **paraphrase only**, **don’t quote the excerpts**, **summarize the policy in your own words** (avoid mixing both in one line). | **`quote_style=quote`** or **`quote_style=paraphrase`** in **`prompt_signals:`**; reply should quote or paraphrase injected excerpts accordingly |
240
+ | Emoji in replies | In a **long** message, ask to **use a few tasteful emoji**, **include emoji when helpful**, **emoji are ok**, **sprinkle emoji**, etc. **Or** say **no emoji in your reply**, **avoid emoji**, **emoji-free tone**, **don’t use emoji** (avoid mixing both in one line). | **`emoji_style=include`** or **`emoji_style=avoid`** in **`prompt_signals:`**; reply may use sparse emoji or stay emoji-free accordingly |
241
+ | FAQ grounding (strict vs relaxed) | In a **long** message about **FAQ / policy / excerpt** retrieval, ask to **stick to the FAQ**, **only use the FAQ excerpts**, **if it’s not in the FAQ say so**, **strict FAQ grounding**, etc. **Or** say **FAQ plus general knowledge**, **mix the FAQ with general knowledge**, **supplement the excerpts with brief general context** (avoid mixing both in one line). | **`faq_grounding=strict`** or **`faq_grounding=relaxed`** in **`prompt_signals:`**; reply should stay FAQ-only or allow separated general context accordingly |
242
+ | Math steps vs final only | In a **long** math-style question, ask to **show your work**, **walk through the derivation**, **prove it step by step**, **show intermediate steps**, etc. **Or** say **final answer only**, **no derivation**, **skip the steps**, **just the result** for the equation (avoid mixing both in one line). | **`math_detail=show_work`** or **`math_detail=final_only`** in **`prompt_signals:`**; reply should include or omit intermediate math steps accordingly |
243
+ | Code fences vs inline | In a **long** message that includes **code / commands / scripts**, ask for **fenced code blocks**, **markdown code fences**, **triple-backtick fences**, etc. **Or** say **inline code only**, **no triple backticks**, **no fenced code blocks**, **keep snippets inline** (avoid mixing both in one line). | **`code_block_style=fenced`** or **`code_block_style=inline`** in **`prompt_signals:`**; reply should use ``` fences or inline backticks accordingly |
244
+
245
+ If there is no footer, either brain trace is off for that session, or this deployment has **no** encoder / FAQ / memory / web layers and no prompt signals have fired yet. **Prompt signals alone** still turn the footer on once this feature triggers.
246
+
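As a rough illustration of the **Length cap** row in the table above, the sketch below shows one way such a one-turn signal could be detected. The function name `detect_len_cap` and its regular expressions are assumptions made for illustration; the Space's actual detector (`analyze_embedded_prompt_signals`) is not shown in this diff and may match wording differently. Only the tag shapes `80w` and `3s` come from the table.

```python
import re
from typing import Optional

# Hypothetical patterns; the real app may accept other phrasings.
_WORDS = re.compile(
    r"\b(?:in\s+)?(?:under|at most|no more than)\s+(\d+)\s+words?\b", re.I
)
_SENTS = re.compile(
    r"\b(?:in\s+)?(?:under|at most|no more than)\s+(\d+)\s+sentences?\b", re.I
)


def detect_len_cap(message: str) -> Optional[str]:
    """Return a tag such as '80w' or '3s', or None when no cap is requested."""
    m = _WORDS.search(message)
    if m:
        return f"{int(m.group(1))}w"
    m = _SENTS.search(message)
    if m:
        return f"{int(m.group(1))}s"
    return None
```

Under these assumed patterns, a message ending in *in under 80 words* maps to `80w`, which corresponds to the `len_cap=80w` entry expected in **`prompt_signals:`**.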
247
+ ---
248
+
249
  ### What to try (step-by-step)
250
 
251
  | Goal | What to type |
 
310
  - Ensure **smart routing** is on (say *Turn on smart routing* if you turned it off).
311
  - Ask in plain language for **live web** / **Google** / **today’s** information, e.g. *Search the web for the latest SpaceX launch summary* or *What does the web say about …?*
312
  - The router uses intent **`web_search`**: the app fetches snippets, injects them into the model context, then the assistant replies **using those sources** (cite **[Web n]** when using a snippet).
313
+ - **Automatic web:** if Google CSE is configured, the app may also run a web search when your message **implies** fresh public facts (e.g. *latest*, *today*, *who won*, *stock price*, a recent year + question) even if you do not say “search the web”. On a self-hosted Space you can disable that with **`--no-auto-web`** or env **`NO_AUTO_WEB=1`**. Brain trace may show **`+auto`** on the web line when the upgrade came from this layer rather than the router alone.
314
  - If the model stays in FAQ-only mode, use **`/web …`** first to confirm the API works, then try clearer web phrasing.
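The automatic-web behavior in the bullets above amounts to upgrading a `chat` route to `web_search` when a cheap wording check fires. Below is a minimal sketch of that idea. `looks_like_fresh_fact_query`, its cue list, and `maybe_upgrade_route` are hypothetical names; they are not the app's own `heuristic_suggests_web_search`, whose rules are not shown here.

```python
import re

# Illustrative cues only, borrowed from the examples in the bullet above.
_FRESH_CUES = re.compile(r"\b(latest|today|who won|stock price)\b", re.I)
_RECENT_YEAR = re.compile(r"\b20[2-9]\d\b")


def looks_like_fresh_fact_query(message: str) -> bool:
    """Hypothetical stand-in for the app's auto-web heuristic."""
    if _FRESH_CUES.search(message):
        return True
    # A recent year counts only when the message is phrased as a question.
    return bool(_RECENT_YEAR.search(message)) and "?" in message


def maybe_upgrade_route(
    route: dict, message: str, cse_configured: bool, no_auto_web: bool = False
) -> tuple[dict, bool]:
    """Return (route, upgraded); only plain chat routes are ever upgraded."""
    if (
        not no_auto_web
        and route.get("intent") == "chat"
        and cse_configured
        and looks_like_fresh_fact_query(message)
    ):
        upgraded = {"intent": "web_search", "text": message,
                    "question": "", "context": ""}
        return upgraded, True
    return route, False
```

The second return value plays the role of the `+auto` marker: it is true only when the heuristic, rather than the router, chose web search.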
315
 
316
  **6) Brain trace**
 
391
  - Default to "chat" when unsure; copy the entire user message into "text".
392
  - Do not invent facts for "grounded": if no clear facts/context, use "chat" instead.
393
  - Use **retrieve** for bundled FAQ / help-base search; use **web_search** when the user clearly needs the **public web** (today, external site, breaking news, "google this", etc.).
394
+ - **web_search vs chat (critical):** choose **web_search** when a good answer depends on **recent events**, **live or site-specific data** (prices, sports scores, releases after your knowledge cutoff, "what happened today", laws/regulations that change), **verifying a claim against the public web**, or **finding an official URL**. Choose **chat** for timeless explanations, coding how-to without needing today's docs, brainstorming, role-play, or personal opinion where web snippets would not change the answer.
395
  - Extract minimal "text" for tool intents (do not repeat system chatter)."""
396
 
397
  VALID_INTENTS = frozenset(
 
1288
  lines.append("Prefer fuller, well-structured explanations when they help the user.")
1289
  if rformat == "bullets":
1290
  lines.append("When listing multiple points, use markdown bullet or numbered lists.")
1291
+ elif rformat == "prose":
1292
+ lines.append(
1293
+ "Prefer continuous paragraphs over bullet lists unless a very short list is clearer."
1294
+ )
1295
  audience = str(session.get("audience") or "normal").lower()
1296
  if audience not in ("simple", "normal", "technical"):
1297
  audience = "normal"
 
1841
  action="store_true",
1842
  help="Disable NL intent routing (plain chat only; slash commands still work).",
1843
  )
1844
+ p.add_argument(
1845
+ "--no-auto-web",
1846
+ action="store_true",
1847
+ help="Disable chat→web_search heuristic (only explicit router web_search or /web uses Google CSE).",
1848
+ )
1849
  p.add_argument(
1850
  "--router-max-new-tokens",
1851
  type=int,
 
1859
  def main() -> None:
1860
  args = parse_args()
1861
  _load_dotenv_if_present(_REPO)
1862
+ if os.environ.get("NO_AUTO_WEB", "").strip().lower() in ("1", "true", "yes", "on"):
1863
+ args.no_auto_web = True
1864
  _gk, _gc, _, _ = read_google_cse_settings()
1865
  cse_on = bool(_gk and _gc)
1866
  _ensure_gradio_can_reach_localhost()
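The `NO_AUTO_WEB` check in this hunk treats a handful of strings as true. If other flags were to follow the same convention, the parsing could live in one small helper, sketched here. `env_flag` is a hypothetical name that does not appear in the source.

```python
import os

# Same accepted values as the NO_AUTO_WEB check in the hunk above.
_TRUTHY = ("1", "true", "yes", "on")


def env_flag(name: str, environ=os.environ) -> bool:
    """True when the variable is set to one of the accepted truthy strings."""
    return environ.get(name, "").strip().lower() in _TRUTHY
```

Passing a plain dict as `environ` makes the helper easy to exercise without touching the real process environment.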
 
2043
  except Exception:
2044
  route = {"intent": "chat", "text": msg, "question": "", "context": ""}
2045
 
2046
+ g_key, g_cx, _, _ = read_google_cse_settings()
2047
+ web_from_auto = False
2048
+ if (
2049
+ not args.no_auto_web
2050
+ and route["intent"] == "chat"
2051
+ and g_key
2052
+ and g_cx
2053
+ and heuristic_suggests_web_search(msg)
2054
+ ):
2055
+ route = {
2056
+ "intent": "web_search",
2057
+ "text": msg,
2058
+ "question": "",
2059
+ "context": "",
2060
+ }
2061
+ web_from_auto = True
2062
+
2063
  if route["intent"] == "web_search":
2064
  g_key, g_cx, g_num, g_safe = read_google_cse_settings()
2065
  q_web = (route["text"] or msg).strip()
2066
+ _as = "+auto" if web_from_auto else ""
2067
+ web_trace = f"web:CSE:cfg{_as}"
2068
  if g_key and g_cx and q_web:
2069
  try:
2070
  hits = google_cse_search(
 
2075
  safe=g_safe,
2076
  )
2077
  web_block = format_cse_hits_markdown(hits, for_chat=True)
2078
+ web_trace = f"web:CSE:{len(hits)}{_as}"
2079
  except Exception as ex:
2080
  web_block = (
2081
  f"(Google web search failed: {_clip(str(ex), 500)})\n\n"
2082
  "Answer from general knowledge where appropriate; do not invent URLs or page titles."
2083
  )
2084
+ web_trace = f"web:CSE:err{_as}"
2085
  elif not q_web:
2086
  web_block = "(Empty web search query. Ask again with a concrete search topic.)"
2087
+ web_trace = f"web:CSE:empty{_as}"
2088
  else:
2089
  web_block = (
2090
  "(Web search is not configured: set **GOOGLE_CSE_API_KEY** and **GOOGLE_CSE_CX** "
 
2117
 
2118
  chat_line = route["text"] or msg
2119
 
2120
+ sig_overrides, sig_extras, sig_trace_tags = analyze_embedded_prompt_signals(msg)
2121
+ eff_session = dict(ub_session)
2122
+ eff_session.update(sig_overrides)
2123
  trace: list[str] = []
2124
+ prompt_sig_active = bool(sig_overrides or sig_extras or sig_trace_tags)
2125
+ if prompt_sig_active:
2126
+ bits = [f"{k}={v}" for k, v in sorted(sig_overrides.items())]
2127
+ bits.extend(sig_trace_tags)
2128
+ trace.append("prompt_signals:" + "+".join(bits))
2129
  extras: list[str] = []
2130
+ _append_reply_style_hints(extras, eff_session)
2131
+ for para in sig_extras:
2132
+ extras.append(para)
2133
  if web_trace:
2134
  trace.append(web_trace)
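To make the trace format concrete, the snippet below reproduces the string assembly from this hunk in isolation. The function name `format_prompt_signal_trace` and the sample values in the note are illustrative assumptions; the sort-then-join logic mirrors the added lines.

```python
from typing import Optional


def format_prompt_signal_trace(overrides: dict, tags: list) -> Optional[str]:
    """Build the 'prompt_signals:' trace entry, or None when nothing fired.

    Simplification: the hunk above also treats non-empty extra hint
    paragraphs as "active"; this sketch only looks at overrides and tags.
    """
    if not (overrides or tags):
        return None
    # Overrides are rendered as key=value in sorted key order, then bare tags.
    bits = [f"{k}={v}" for k, v in sorted(overrides.items())]
    bits.extend(tags)
    return "prompt_signals:" + "+".join(bits)
```

With overrides `{"reply_format": "bullets", "audience": "simple"}` and tags `["code_only"]`, this yields `prompt_signals:audience=simple+reply_format=bullets+code_only`.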
2135
 
 
2199
  or mem_conn is not None
2200
  or effective_rag is not None
2201
  or bool(web_trace)
2202
+ or prompt_sig_active
2203
  )
2204
  )
2205
  if show_trace_footer and trace: