Gradio Micro-Trend Detector — Build Plan

  • Use the provided prompt verbatim: The system prompt in docs/problem-statement.md must be used as-is for all providers (OpenAI + Gemini). Only attach a minimal user prompt per request.
  • Reuse the settings shape: Follow the sample_code/settings.json structure for all configurable keys (API keys, model names, reasoning effort, project/location flags).
  • Reference samples: Mirror the integration patterns shown in sample_code/llm_client.py (OpenAI Responses API) and any other helpers in sample_code/ for payloads, retries, and settings resolution.

Delivery Steps

  1. Requirements & schema

    • Extract the output JSON contract from docs/problem-statement.md and codify it (Pydantic/TypedDict) for validation and downstream parsing.
    • Decide on the response envelope: { "trends": <validated JSON>, "summary": <bullet list> }.
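The envelope above can be sketched with a TypedDict and a small constructor that checks the outer shape. This is only an illustration: the names ResponseEnvelope and make_envelope are hypothetical, and the inner fields of "trends" are deliberately left open because they come from docs/problem-statement.md.

```python
from typing import Any, List, TypedDict, Union


class ResponseEnvelope(TypedDict):
    # Inner structure of "trends" is defined by docs/problem-statement.md,
    # so it is left untyped in this sketch.
    trends: Union[dict, list]
    summary: List[str]


def make_envelope(trends: Any, summary: Any) -> ResponseEnvelope:
    """Check the outer envelope shape only; schema validation happens elsewhere."""
    if not isinstance(trends, (dict, list)):
        raise ValueError("trends must be a JSON object or array")
    if not isinstance(summary, list) or not all(isinstance(s, str) for s in summary):
        raise ValueError("summary must be a list of strings")
    return {"trends": trends, "summary": summary}
```

A Pydantic model could replace the manual checks once the contract has been extracted.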
  2. Configuration layer

    • Implement a settings loader that reads settings.json (and env overrides) using the same keys as sample_code/settings.json (OPENAI_API_KEY, GEMINI_API_KEY, OPENAI_MODEL, OPENAI_REASONING_EFFORT, GOOGLE_GENAI_USE_VERTEXAI, GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION).
    • Provide .env.example and document required vars in README.
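A minimal sketch of such a loader follows. It assumes settings.json is a flat JSON object keyed by the names listed above, which should be checked against sample_code/settings.json; the function name load_settings is hypothetical.

```python
import json
import os
from pathlib import Path
from typing import Any, Dict, Mapping, Optional

# Keys named in the plan; the flat layout is an assumption.
SETTINGS_KEYS = (
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "OPENAI_MODEL",
    "OPENAI_REASONING_EFFORT",
    "GOOGLE_GENAI_USE_VERTEXAI",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
)


def load_settings(path: Optional[str] = None,
                  env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Read known keys from settings.json, then let environment variables win."""
    env = os.environ if env is None else env
    settings: Dict[str, Any] = {}
    if path is not None and Path(path).is_file():
        with open(path, encoding="utf-8") as handle:
            raw = json.load(handle)
        if not isinstance(raw, dict):
            raise ValueError("settings file must contain a JSON object")
        settings = {key: raw[key] for key in SETTINGS_KEYS if key in raw}
    for key in SETTINGS_KEYS:
        if key in env:
            settings[key] = env[key]  # env override takes precedence
    return settings
```

Passing path=None gives the env-only mode.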
  3. Model abstraction

    • Create a unified llm_clients.py with analyze(images: list[bytes], model: str) -> dict.
    • Providers: OpenAI GPT-5 and GPT-5 mini via the Responses API; Gemini 3 vision endpoint with safety params aligned to the sample.
    • Shared concerns: timeouts, retries/backoff, logging, optional temperature/max_tokens, deterministic defaults.
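The retry/backoff concern can be kept provider-agnostic with a small wrapper. The sketch below is one possible shape, not the pattern from sample_code/; call_with_retries and its parameters are hypothetical names.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T],
    *,
    attempts: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying with exponential backoff (base_delay * 2**attempt)."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
    raise RuntimeError("unreachable")
```

Each provider call would be wrapped as call_with_retries(lambda: provider_call(...)), with retry_on narrowed to the transient errors that provider raises.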
  4. Prompting strategy

    • System prompt = the exact content from docs/problem-statement.md (no edits).
    • User prompt per call: short instruction to analyze the attached garment image(s) and emit only the specified JSON.
    • Enforce “JSON first” responses; consider a post-parse repair/reprompt path if JSON is invalid.
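One way to keep the system prompt unchanged is to read the file as bytes and decode it without stripping or newline translation. The sketch assumes the whole file is the prompt; if only a section of docs/problem-statement.md is the prompt, the loader would need to slice it out. The user prompt wording and the function names are illustrative.

```python
from pathlib import Path
from typing import Dict, Union

PROMPT_PATH = Path("docs/problem-statement.md")

# Illustrative wording for the minimal per-request user prompt.
USER_PROMPT = "Analyze the attached garment image(s) and return only the specified JSON."


def load_system_prompt(path: Union[str, Path] = PROMPT_PATH) -> str:
    """Return the file content byte-for-byte (no strip, no newline translation)."""
    return Path(path).read_bytes().decode("utf-8")


def build_prompts(system_prompt: str, user_prompt: str = USER_PROMPT) -> Dict[str, str]:
    """Provider-neutral pair; each client maps it onto its own request format."""
    return {"system": system_prompt, "user": user_prompt}
```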
  5. Inference pipeline

    • Image intake: validate file types, normalize to RGB, optional downscale/compress for cost and latency.
    • Call the model abstraction; parse and validate the JSON against the schema; if it is invalid, attempt a regex extraction or automatically re-prompt the model, including the validation error in the new prompt.
    • Derive the bullet-point summary from validated JSON (or accept model-provided summary if valid).
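The extract-then-reprompt fallback could look like the sketch below. It covers only JSON syntax recovery (schema validation would follow), and extract_json, parse_or_reprompt, and the ask callable are hypothetical names standing in for the real client call.

```python
import json
import re
from typing import Any, Callable

FENCE_RE = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL)


def extract_json(text: str) -> Any:
    """Try the raw reply, then fenced blocks, then the outermost {...} span."""
    candidates = [text.strip()]
    candidates += [match.strip() for match in FENCE_RE.findall(text)]
    start, end = text.find("{"), text.rfind("}")
    if 0 <= start < end:
        candidates.append(text[start:end + 1])
    last_error = None
    for candidate in candidates:
        try:
            return json.loads(candidate)
        except json.JSONDecodeError as exc:
            last_error = exc
    raise ValueError(f"no valid JSON found: {last_error}")


def parse_or_reprompt(ask: Callable[[str], str], user_prompt: str,
                      max_reprompts: int = 1) -> Any:
    """Call ask(prompt); on invalid JSON, re-ask with the error appended."""
    prompt = user_prompt
    for _ in range(max_reprompts + 1):
        reply = ask(prompt)
        try:
            return extract_json(reply)
        except ValueError as exc:
            prompt = (f"{user_prompt}\n\nYour previous reply was not valid JSON "
                      f"({exc}). Reply with the JSON only.")
    raise ValueError("model did not return valid JSON")
```

Capping max_reprompts keeps a misbehaving model from driving up cost and latency.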
  6. Gradio UI

    • Inputs: gr.Files (multiple images), model dropdown (GPT-5, GPT-5 mini, Gemini 3), creativity/temperature slider, optional checkbox for “downscale images”.
    • Outputs: gr.JSON for the structured trends, gr.Markdown for bullet summary; error banner for validation issues; loading indicator/queue enabled.
    • Add helper text describing acceptable formats and latency expectations; optional “Download JSON” button.
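A reduced sketch of the UI wiring follows, leaving out the slider, checkbox, and download button for brevity. It assumes a recent Gradio release in which a multi-file upload hands the callback a list of file paths; older releases pass temp-file objects instead. run_analysis and build_ui are hypothetical names, and analyze stands for the model abstraction described in step 3.

```python
from typing import Any, Callable, List, Optional, Tuple

MODEL_CHOICES = ["GPT-5", "GPT-5 mini", "Gemini 3"]  # labels from the plan
DEFAULT_MODEL = "GPT-5 mini"

Analyze = Callable[[List[bytes], str], dict]


def run_analysis(files: Optional[List[str]], model: str,
                 analyze: Analyze) -> Tuple[Any, str]:
    """Return (trends, summary markdown) for the JSON and Markdown outputs."""
    if not files:
        return None, "**Error:** upload at least one image."
    images = []
    for path in files:  # assumes file paths, see lead-in
        with open(path, "rb") as handle:
            images.append(handle.read())
    result = analyze(images, model)
    bullets = "\n".join(f"- {line}" for line in result.get("summary", []))
    return result.get("trends"), bullets


def build_ui(analyze: Analyze):
    import gradio as gr  # lazy import keeps run_analysis testable without Gradio

    with gr.Blocks(title="Micro-Trend Detector") as demo:
        files = gr.File(file_count="multiple", file_types=["image"],
                        label="Garment images")
        model = gr.Dropdown(choices=MODEL_CHOICES, value=DEFAULT_MODEL, label="Model")
        run = gr.Button("Analyze")
        trends = gr.JSON(label="Trends")
        summary = gr.Markdown()
        run.click(lambda f, m: run_analysis(f, m, analyze),
                  inputs=[files, model], outputs=[trends, summary])
    demo.queue()  # enables the request queue and loading state
    return demo
```

Keeping the callback free of Gradio types means the pipeline can be unit-tested with a fake analyze function.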
  7. Observability & performance

    • Log per-request latency, model used, image count/size, and validation outcomes.
    • Default to GPT-5 mini to control cost; allow overrides via settings or UI.
    • Optional image downscaling knob; consider concurrency limits via Gradio queue.
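Per-request logging can be wrapped in a context manager so that latency is recorded even when the call fails. This is a sketch using only the standard library; the logger name, the log_request helper, and the record fields are illustrative choices.

```python
import logging
import time
from contextlib import contextmanager
from typing import Dict, Iterator, List

logger = logging.getLogger("trend_app")  # hypothetical logger name


@contextmanager
def log_request(model: str, image_sizes: List[int]) -> Iterator[Dict]:
    """Yield a mutable record; latency is filled in and logged on exit."""
    record: Dict = {
        "model": model,
        "image_count": len(image_sizes),
        "total_bytes": sum(image_sizes),
        "valid": None,  # caller sets True/False after schema validation
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["latency_s"] = round(time.perf_counter() - start, 3)
        logger.info("request %s", record)
```

Typical use: `with log_request(model, [len(b) for b in images]) as rec:` around the model call, setting `rec["valid"]` once validation has run.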
  8. Packaging & run

    • Add requirements.txt/pyproject entries (gradio, openai>=1.x, google-genai/vertex client, pydantic, pillow).
    • Document python app.py --settings settings.json (or env-only) startup, including PORT/HOST env handling for deployment.
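The startup handling could be sketched as below. Only the --settings flag and the PORT/HOST variables come from the plan; the fallback values mirror Gradio's usual local defaults (127.0.0.1 and 7860) and are assumptions, as are the function names.

```python
import argparse
import os
from typing import Mapping, Optional, Sequence, Tuple


def parse_args(argv: Optional[Sequence[str]] = None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Gradio micro-trend detector")
    parser.add_argument("--settings", default=None,
                        help="path to settings.json (omit for env-only mode)")
    return parser.parse_args(argv)


def resolve_bind(env: Optional[Mapping[str, str]] = None) -> Tuple[str, int]:
    """Read HOST/PORT from the environment, with assumed local defaults."""
    env = os.environ if env is None else env
    host = env.get("HOST", "127.0.0.1")
    port = int(env.get("PORT", "7860"))
    return host, port
```

The values would then be passed to the app as demo.launch(server_name=host, server_port=port); a container deployment would typically set HOST=0.0.0.0.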
  9. Acceptance checklist

    • Gradio UI renders, accepts multiple images, selects among the three models, and returns validated JSON + bullet summary.
    • Prompt from docs/problem-statement.md is used unchanged.
    • Settings follow the sample_code/settings.json shape; README and .env.example supplied.