Spaces:
Running
A newer version of the Gradio SDK is available: 6.10.0
Stage 1 - Query Rewriting Contract
Stage 1 rewrites free-form prompt text into short, retrieval-friendly phrases. It is not canonical tag selection and not final validation.
Primary implementation: psq_rag/llm/rewrite.py::llm_rewrite_prompt.
Purpose
- Convert free text into concise comma-separated phrase queries.
- Favor retrieval recall and tag-like phrasing.
- Avoid irreversible decisions (final filtering happens later).
Inputs
prompt_in: str
The prompt can be raw natural language, comma-style prompts, or mixed text.
LLM Call Behavior
Stage 1 uses one deterministic OpenRouter call:
- system prompt:
REWRITE_SYSTEMinrewrite.py - user content: raw
prompt_in - temperature:
0.0 - max tokens:
256 - response format: none (plain text)
- no retries in
llm_rewrite_prompt
Model/auth endpoint behavior comes from openrouter_client.py:
- OpenRouter endpoint:
/chat/completions - API key from
OPENROUTER_API_KEY - model from
OPENROUTER_MODEL(defaultmistralai/mistral-small-24b-instruct-2501).
Output Rules
On successful completion:
- read raw text response
- trim
- collapse whitespace
- truncate to 800 chars
- return rewritten string (comma-separated phrase style expected by prompt)
No structured parsing or vocabulary grounding is done in Stage 1.
Failure Behavior
llm_rewrite_prompt returns empty string when:
- OpenRouter call errors
- refusal-like or filtered response is surfaced as error by client
- response is empty
It logs:
LLM rewrite: fallback (error: ...)on error- refusal text preview when available
LLM rewrite: fallback (empty response)on empty output
Non-LLM Heuristic Companion (Pipeline Context)
Outside Stage 1 itself, app.py also computes heuristic short phrases via:
extract_user_provided_tags_upto_3_words()- split on
.and, - keep segments with <= 3 tokens
- case-insensitive dedupe
These heuristic terms are later appended to retrieval input only if rewrite succeeds.
App-Level Contract (Important)
In current app orchestration:
- rewrite, structural, and probe run concurrently
- rewrite timeout is enforced (
PSQ_TIMEOUT_REWRITE_S, default 45s) - rewrite is strict:
- if rewrite fails or is empty, app raises
RuntimeError("Rewrite: empty output") - pipeline does not continue to retrieval/selection with an empty rewrite
- if rewrite fails or is empty, app raises
So while Stage 1 function returns "" on failure, app-level behavior treats empty rewrite as a hard error for this request.
Stage Boundary with Stage 2
Stage 1 guarantees only:
- string output intended as comma-separated retrieval phrases
- deterministic call settings (temperature 0, single pass)
Stage 2 is responsible for:
- normalization
- deduplication
- alias/canonical grounding
- scoring
- candidate ranking and truncation.