Spaces:
Running
Running
File size: 2,844 Bytes
57b7339 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 | # Stage 1 - Query Rewriting Contract
Stage 1 rewrites free-form prompt text into short, retrieval-friendly phrases.
It is not canonical tag selection and not final validation.
Primary implementation: `psq_rag/llm/rewrite.py::llm_rewrite_prompt`.
---
## Purpose
- Convert free text into concise comma-separated phrase queries.
- Favor retrieval recall and tag-like phrasing.
- Avoid irreversible decisions (final filtering happens later).
---
## Inputs
- `prompt_in: str`
The prompt can be raw natural language, comma-style prompts, or mixed text.
---
## LLM Call Behavior
Stage 1 uses one deterministic OpenRouter call:
- system prompt: `REWRITE_SYSTEM` in `rewrite.py`
- user content: raw `prompt_in`
- temperature: `0.0`
- max tokens: `256`
- response format: none (plain text)
- no retries in `llm_rewrite_prompt`
Model/auth endpoint behavior comes from `openrouter_client.py`:
- OpenRouter endpoint: `/chat/completions`
- API key from `OPENROUTER_API_KEY`
- model from `OPENROUTER_MODEL` (default `mistralai/mistral-small-24b-instruct-2501`).
---
## Output Rules
On successful completion:
- read raw text response
- trim
- collapse whitespace
- truncate to 800 chars
- return rewritten string (comma-separated phrase style expected by prompt)
No structured parsing or vocabulary grounding is done in Stage 1.
---
## Failure Behavior
`llm_rewrite_prompt` returns empty string when:
- OpenRouter call errors
- refusal-like or filtered response is surfaced as error by client
- response is empty
It logs:
- `LLM rewrite: fallback (error: ...)` on error
- refusal text preview when available
- `LLM rewrite: fallback (empty response)` on empty output
---
## Non-LLM Heuristic Companion (Pipeline Context)
Outside Stage 1 itself, `app.py` also computes heuristic short phrases via:
- `extract_user_provided_tags_upto_3_words()`
- split on `.` and `,`
- keep segments with <= 3 tokens
- case-insensitive dedupe
These heuristic terms are later appended to retrieval input only if rewrite succeeds.
---
## App-Level Contract (Important)
In current app orchestration:
- rewrite, structural, and probe run concurrently
- rewrite timeout is enforced (`PSQ_TIMEOUT_REWRITE_S`, default 45s)
- rewrite is **strict**:
- if rewrite fails or is empty, app raises `RuntimeError("Rewrite: empty output")`
- pipeline does not continue to retrieval/selection with an empty rewrite
So while Stage 1 function returns `""` on failure, app-level behavior treats empty rewrite as a hard error for this request.
---
## Stage Boundary with Stage 2
Stage 1 guarantees only:
- string output intended as comma-separated retrieval phrases
- deterministic call settings (temperature 0, single pass)
Stage 2 is responsible for:
- normalization
- deduplication
- alias/canonical grounding
- scoring
- candidate ranking and truncation.
|