Prompt_Squirrel_RAG / docs /rewrite_contract.md
Food Desert
Refresh pipeline contract docs to match current runtime behavior
57b7339

A newer version of the Gradio SDK is available: 6.10.0

Upgrade

Stage 1 - Query Rewriting Contract

Stage 1 rewrites free-form prompt text into short, retrieval-friendly phrases. It is not canonical tag selection and not final validation.

Primary implementation: psq_rag/llm/rewrite.py::llm_rewrite_prompt.


Purpose

  • Convert free text into concise comma-separated phrase queries.
  • Favor retrieval recall and tag-like phrasing.
  • Avoid irreversible decisions (final filtering happens later).

Inputs

  • prompt_in: str

The prompt can be raw natural language, comma-style prompts, or mixed text.


LLM Call Behavior

Stage 1 uses one deterministic OpenRouter call:

  • system prompt: REWRITE_SYSTEM in rewrite.py
  • user content: raw prompt_in
  • temperature: 0.0
  • max tokens: 256
  • response format: none (plain text)
  • no retries in llm_rewrite_prompt

Model/auth endpoint behavior comes from openrouter_client.py:

  • OpenRouter endpoint: /chat/completions
  • API key from OPENROUTER_API_KEY
  • model from OPENROUTER_MODEL (default mistralai/mistral-small-24b-instruct-2501).

Output Rules

On successful completion:

  • read raw text response
  • trim
  • collapse whitespace
  • truncate to 800 chars
  • return rewritten string (comma-separated phrase style expected by prompt)

No structured parsing or vocabulary grounding is done in Stage 1.


Failure Behavior

llm_rewrite_prompt returns empty string when:

  • OpenRouter call errors
  • refusal-like or filtered response is surfaced as error by client
  • response is empty

It logs:

  • LLM rewrite: fallback (error: ...) on error
  • refusal text preview when available
  • LLM rewrite: fallback (empty response) on empty output

Non-LLM Heuristic Companion (Pipeline Context)

Outside Stage 1 itself, app.py also computes heuristic short phrases via:

  • extract_user_provided_tags_upto_3_words()
  • split on . and ,
  • keep segments with <= 3 tokens
  • case-insensitive dedupe

These heuristic terms are later appended to retrieval input only if rewrite succeeds.


App-Level Contract (Important)

In current app orchestration:

  • rewrite, structural, and probe run concurrently
  • rewrite timeout is enforced (PSQ_TIMEOUT_REWRITE_S, default 45s)
  • rewrite is strict:
    • if rewrite fails or is empty, app raises RuntimeError("Rewrite: empty output")
    • pipeline does not continue to retrieval/selection with an empty rewrite

So while Stage 1 function returns "" on failure, app-level behavior treats empty rewrite as a hard error for this request.


Stage Boundary with Stage 2

Stage 1 guarantees only:

  • string output intended as comma-separated retrieval phrases
  • deterministic call settings (temperature 0, single pass)

Stage 2 is responsible for:

  • normalization
  • deduplication
  • alias/canonical grounding
  • scoring
  • candidate ranking and truncation.