Spaces:
Running
Running
| # Stage 1 - Query Rewriting Contract | |
| Stage 1 rewrites free-form prompt text into short, retrieval-friendly phrases. | |
| It is not canonical tag selection and not final validation. | |
| Primary implementation: `psq_rag/llm/rewrite.py::llm_rewrite_prompt`. | |
| --- | |
| ## Purpose | |
| - Convert free text into concise comma-separated phrase queries. | |
| - Favor retrieval recall and tag-like phrasing. | |
| - Avoid irreversible decisions (final filtering happens later). | |
| --- | |
| ## Inputs | |
| - `prompt_in: str` | |
| The prompt can be raw natural language, comma-style prompts, or mixed text. | |
| --- | |
| ## LLM Call Behavior | |
| Stage 1 uses one deterministic OpenRouter call: | |
| - system prompt: `REWRITE_SYSTEM` in `rewrite.py` | |
| - user content: raw `prompt_in` | |
| - temperature: `0.0` | |
| - max tokens: `256` | |
| - response format: none (plain text) | |
| - no retries in `llm_rewrite_prompt` | |
| Model/auth endpoint behavior comes from `openrouter_client.py`: | |
| - OpenRouter endpoint: `/chat/completions` | |
| - API key from `OPENROUTER_API_KEY` | |
| - model from `OPENROUTER_MODEL` (default `mistralai/mistral-small-24b-instruct-2501`). | |
| --- | |
| ## Output Rules | |
| On successful completion: | |
| - read raw text response | |
| - trim | |
| - collapse whitespace | |
| - truncate to 800 chars | |
| - return rewritten string (comma-separated phrase style expected by prompt) | |
| No structured parsing or vocabulary grounding is done in Stage 1. | |
| --- | |
| ## Failure Behavior | |
| `llm_rewrite_prompt` returns empty string when: | |
| - OpenRouter call errors | |
| - refusal-like or filtered response is surfaced as error by client | |
| - response is empty | |
| It logs: | |
| - `LLM rewrite: fallback (error: ...)` on error | |
| - refusal text preview when available | |
| - `LLM rewrite: fallback (empty response)` on empty output | |
| --- | |
| ## Non-LLM Heuristic Companion (Pipeline Context) | |
| Outside Stage 1 itself, `app.py` also computes heuristic short phrases via: | |
| - `extract_user_provided_tags_upto_3_words()` | |
| - split on `.` and `,` | |
| - keep segments with <= 3 tokens | |
| - case-insensitive dedupe | |
| These heuristic terms are later appended to retrieval input only if rewrite succeeds. | |
| --- | |
| ## App-Level Contract (Important) | |
| In current app orchestration: | |
| - rewrite, structural, and probe run concurrently | |
| - rewrite timeout is enforced (`PSQ_TIMEOUT_REWRITE_S`, default 45s) | |
| - rewrite is **strict**: | |
| - if rewrite fails or is empty, app raises `RuntimeError("Rewrite: empty output")` | |
| - pipeline does not continue to retrieval/selection with an empty rewrite | |
| So while Stage 1 function returns `""` on failure, app-level behavior treats empty rewrite as a hard error for this request. | |
| --- | |
| ## Stage Boundary with Stage 2 | |
| Stage 1 guarantees only: | |
| - string output intended as comma-separated retrieval phrases | |
| - deterministic call settings (temperature 0, single pass) | |
| Stage 2 is responsible for: | |
| - normalization | |
| - deduplication | |
| - alias/canonical grounding | |
| - scoring | |
| - candidate ranking and truncation. | |