# Gradio Micro-Trend Detector: Build Plan
- **Use the provided prompt verbatim**: The system prompt in `docs/problem-statement.md` must be used as-is for all providers (OpenAI and Gemini). Attach only a minimal user prompt per request.
- **Reuse the settings shape**: Follow the `sample_code/settings.json` structure for all configurable keys (API keys, model names, reasoning effort, project/location flags).
- **Reference samples**: Mirror the integration patterns shown in `sample_code/llm_client.py` (OpenAI Responses API) and any other helpers in `sample_code/` for payloads, retries, and settings resolution.
## Delivery Steps
1) **Requirements & schema**
   - Extract the output JSON contract from `docs/problem-statement.md` and codify it (Pydantic/TypedDict) for validation and downstream parsing.
   - Decide on the response envelope: `{ "trends": <validated JSON>, "summary": <bullet list> }`.
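
As a minimal sketch of step 1, the envelope could be codified with `TypedDict` plus a small hand-written validator. The real contract lives in `docs/problem-statement.md`, so the `name` and `confidence` fields below are hypothetical placeholders, as are the helper names `validate_trends` and `build_envelope`.

```python
from typing import Any, TypedDict


class Trend(TypedDict):
    # Hypothetical fields; the real ones come from docs/problem-statement.md.
    name: str
    confidence: float


class Envelope(TypedDict):
    trends: list[Trend]
    summary: list[str]


def validate_trends(payload: Any) -> list[Trend]:
    """Check model output against the assumed contract; raise ValueError on mismatch."""
    if not isinstance(payload, list):
        raise ValueError("expected a JSON array of trends")
    validated: list[Trend] = []
    for i, item in enumerate(payload):
        if not isinstance(item, dict):
            raise ValueError(f"trend {i}: expected an object")
        name = item.get("name")
        confidence = item.get("confidence")
        if not isinstance(name, str) or not name:
            raise ValueError(f"trend {i}: 'name' must be a non-empty string")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
            raise ValueError(f"trend {i}: 'confidence' must be a number")
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"trend {i}: 'confidence' must be within [0, 1]")
        validated.append({"name": name, "confidence": float(confidence)})
    return validated


def build_envelope(payload: Any) -> Envelope:
    """Wrap validated trends and a derived bullet list in the response envelope."""
    trends = validate_trends(payload)
    summary = [f"{t['name']} (confidence {t['confidence']:.2f})" for t in trends]
    return {"trends": trends, "summary": summary}
```

A Pydantic model would give the same checks with less code; the stdlib version is shown only to keep the sketch dependency-free.
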
2) **Configuration layer**
   - Implement a `settings` loader that reads `settings.json` (and env overrides) using the same keys as `sample_code/settings.json` (`OPENAI_API_KEY`, `GEMINI_API_KEY`, `OPENAI_MODEL`, `OPENAI_REASONING_EFFORT`, `GOOGLE_GENAI_USE_VERTEXAI`, `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION`).
   - Provide `.env.example` and document the required variables in the `README`.
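
The loader in step 2 might look like the sketch below. It assumes `settings.json` is a flat object keyed by the names listed above; if `sample_code/settings.json` nests its keys, the lookup would need to follow that shape instead. The function name `load_settings` and the injectable `env` parameter are illustrative choices.

```python
import json
import os
from pathlib import Path

# Keys named in the plan; assumed to sit at the top level of settings.json.
SETTINGS_KEYS = (
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "OPENAI_MODEL",
    "OPENAI_REASONING_EFFORT",
    "GOOGLE_GENAI_USE_VERTEXAI",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
)


def load_settings(path="settings.json", env=None):
    """Read known keys from a JSON file, then let non-empty env vars override them."""
    env = os.environ if env is None else env
    settings = {}
    file = Path(path)
    if file.is_file():
        raw = json.loads(file.read_text(encoding="utf-8"))
        settings = {key: raw[key] for key in SETTINGS_KEYS if key in raw}
    for key in SETTINGS_KEYS:
        if env.get(key):
            settings[key] = env[key]
    return settings
```

A missing file is treated as an empty configuration, which is what makes env-only startup possible.
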
3) **Model abstraction**
   - Create a unified `llm_clients.py` with `analyze(images: list[bytes], model: str) -> dict`.
   - Providers: OpenAI GPT-5 and GPT-5 mini via the Responses API; Gemini 3 vision endpoint with safety params aligned to the sample.
   - Shared concerns: timeouts, retries/backoff, logging, optional temperature/max_tokens, deterministic defaults.
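
One way to keep the shared retry/backoff logic out of the individual providers is a small registry in front of `analyze`. This is a sketch under assumptions: the `register` helper, the keyword-only tuning parameters, and the choice of `TimeoutError`/`ConnectionError` as retryable errors are all hypothetical, and the real provider functions would wrap the OpenAI and Gemini SDK calls from `sample_code/`.

```python
import random
import time
from typing import Callable

# A provider takes raw image bytes plus a model name and returns parsed output.
Provider = Callable[[list[bytes], str], dict]
_PROVIDERS: dict[str, Provider] = {}


def register(model: str, provider: Provider) -> None:
    """Map a UI-facing model name to the function that calls it."""
    _PROVIDERS[model] = provider


def analyze(
    images: list[bytes],
    model: str,
    *,
    retries: int = 3,
    base_delay: float = 0.5,
    retry_on: tuple = (TimeoutError, ConnectionError),
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Dispatch to the registered provider, retrying transient errors with backoff."""
    try:
        provider = _PROVIDERS[model]
    except KeyError:
        raise ValueError(f"unknown model: {model!r}") from None
    attempt = 0
    while True:
        try:
            return provider(images, model)
        except retry_on:
            if attempt >= retries:
                raise
            # Exponential backoff with a little jitter to avoid synchronized retries.
            sleep(base_delay * 2**attempt + random.uniform(0, 0.1))
            attempt += 1
```

Injecting `sleep` keeps the backoff testable without real delays; a production client would likely map SDK-specific rate-limit errors into `retry_on` as well.
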
4) **Prompting strategy**
   - System prompt = the exact content from `docs/problem-statement.md` (no edits).
   - User prompt per call: short instruction to analyze the attached garment image(s) and emit only the specified JSON.
   - Enforce “JSON first” responses; consider a post-parse repair/reprompt path if JSON is invalid.
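
The "no edits" rule in step 4 is easiest to honor by reading the file and passing it through untouched. In this sketch the `build_prompts` helper and the wording of `USER_PROMPT` are assumptions; only the file path comes from the plan.

```python
from pathlib import Path

# Hypothetical wording for the minimal per-request user prompt.
USER_PROMPT = (
    "Analyze the attached garment image(s) and return only the JSON "
    "specified in the system prompt."
)


def build_prompts(prompt_path="docs/problem-statement.md"):
    """Return the system prompt exactly as stored, plus the fixed user prompt."""
    # No strip() or templating: the system prompt must stay identical to the file.
    system_prompt = Path(prompt_path).read_text(encoding="utf-8")
    return {"system": system_prompt, "user": USER_PROMPT}
```
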
5) **Inference pipeline**
   - Image intake: validate file types, normalize to RGB, optional downscale/compress for cost and latency.
   - Call the model abstraction; parse and validate the JSON against the schema; if it is invalid, attempt a regex extraction or re-prompt the model with the validation error included.
   - Derive the bullet-point summary from validated JSON (or accept model-provided summary if valid).
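
The parse, extract, and re-prompt path from step 5 could be sketched as follows. The helper names are hypothetical, `call_model` stands in for whatever function sends a user prompt and returns the raw text reply, and the regex fallback is deliberately naive (it grabs the outermost braces or brackets and will miss some edge cases).

```python
import json
import re
from typing import Any, Callable


def extract_json(text: str) -> Any:
    """Parse text as JSON, falling back to the outermost {...} or [...] span."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}|\[.*\]", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object or array found in model output")
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError as exc:
        raise ValueError(f"invalid JSON: {exc}") from exc


def parse_with_reprompt(
    call_model: Callable[[str], str],
    user_prompt: str,
    max_reprompts: int = 1,
) -> Any:
    """Call the model, and on invalid JSON ask again with the error attached."""
    prompt = user_prompt
    for attempt in range(max_reprompts + 1):
        raw = call_model(prompt)
        try:
            return extract_json(raw)
        except ValueError as exc:
            if attempt == max_reprompts:
                raise
            prompt = (
                f"{user_prompt}\n\nYour previous reply was not valid JSON "
                f"({exc}). Reply with the JSON only."
            )
```

Schema validation would run on the returned value; a schema failure can feed the same re-prompt loop by raising `ValueError` from the validator.
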
6) **Gradio UI**
   - Inputs: `gr.Files` (multiple images), model dropdown (`GPT-5`, `GPT-5-mini`, `Gemini 3`), creativity/temperature slider, optional checkbox for “downscale images”.
   - Outputs: `gr.JSON` for the structured trends, `gr.Markdown` for bullet summary; error banner for validation issues; loading indicator/queue enabled.
   - Add helper text describing acceptable formats and latency expectations; optional “Download JSON” button.
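
A rough layout for step 6 is sketched below. The callback `run_analysis`, the placeholder `fake_analyze`, the slider range, and the default values are assumptions for illustration; the real callback would call the inference pipeline. Gradio is imported inside `build_ui` so the callback can be exercised without the UI dependency.

```python
MODEL_CHOICES = ["GPT-5", "GPT-5-mini", "Gemini 3"]


def fake_analyze(paths, model, temperature, downscale):
    # Placeholder standing in for the real inference pipeline.
    return {"trends": [], "summary": [f"{len(paths)} image(s) sent to {model}"]}


def run_analysis(files, model, temperature, downscale, analyze_fn=fake_analyze):
    """Gradio callback: returns (trends for gr.JSON, summary for gr.Markdown)."""
    if not files:
        return None, "**Error:** upload at least one image."
    # Depending on the Gradio version, uploads arrive as path strings
    # or as file-like objects exposing a .name attribute.
    paths = [f if isinstance(f, str) else f.name for f in files]
    try:
        result = analyze_fn(paths, model, temperature, downscale)
    except Exception as exc:  # surface any pipeline failure in the UI
        return None, f"**Error:** {exc}"
    summary = "\n".join(f"- {line}" for line in result["summary"])
    return result["trends"], summary


def build_ui():
    import gradio as gr  # lazy import keeps run_analysis testable on its own

    with gr.Blocks(title="Micro-Trend Detector") as demo:
        gr.Markdown("Upload one or more garment images, pick a model, and run.")
        files = gr.Files(label="Garment images", file_types=["image"])
        model = gr.Dropdown(choices=MODEL_CHOICES, value="GPT-5-mini", label="Model")
        temperature = gr.Slider(
            minimum=0.0, maximum=1.0, value=0.2, step=0.05,
            label="Creativity (temperature)",
        )
        downscale = gr.Checkbox(value=True, label="Downscale images")
        run = gr.Button("Analyze")
        trends = gr.JSON(label="Trends")
        summary = gr.Markdown()
        run.click(
            run_analysis,
            inputs=[files, model, temperature, downscale],
            outputs=[trends, summary],
        )
    return demo
```

Calling `build_ui().queue().launch()` would start the app with the request queue enabled. Errors are returned through the Markdown output here; a dedicated banner component is an equally reasonable choice.
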
7) **Observability & performance**
   - Log per-request latency, model used, image count/size, and validation outcomes.
   - Default to GPT-5 mini to control cost; allow overrides via settings or UI.
   - Optional image downscaling knob; consider concurrency limits via Gradio queue.
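
The per-request logging in step 7 fits naturally into a context manager, so latency is recorded even when the model call raises. The logger name, the record fields, and the `log_request` helper are hypothetical.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("trend_detector")


@contextmanager
def log_request(model: str, images: list[bytes]):
    """Yield a mutable record; log it with the elapsed time when the block exits."""
    record = {
        "model": model,
        "image_count": len(images),
        "total_bytes": sum(len(image) for image in images),
        "valid": None,  # the caller sets this after schema validation
    }
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["latency_s"] = round(time.perf_counter() - start, 3)
        logger.info("request %s", record)
```

Usage would wrap the model call, for example `with log_request(model, images) as rec:` followed by setting `rec["valid"]` once validation finishes.
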
8) **Packaging & run**
   - Add dependency entries to `requirements.txt` or `pyproject.toml` (gradio, openai>=1.0, google-genai/vertex client, pydantic, pillow).
   - Document `python app.py --settings settings.json` (or env-only) startup, including PORT/HOST env handling for deployment.
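
The startup handling in step 8 could be resolved as in this sketch. Only the `--settings` flag and the `PORT`/`HOST` variable names come from the plan; the `resolve_startup` helper and the fallback values (`0.0.0.0` and Gradio's default port 7860) are assumptions.

```python
import argparse
import os


def resolve_startup(argv=None, env=None):
    """Combine the --settings flag with HOST/PORT environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Micro-trend detector (Gradio)")
    parser.add_argument(
        "--settings",
        default=None,
        help="Path to settings.json; omit for env-only configuration",
    )
    args = parser.parse_args(argv)
    return {
        "settings": args.settings,
        "host": env.get("HOST", "0.0.0.0"),  # assumed default for container deploys
        "port": int(env.get("PORT", "7860")),  # 7860 is Gradio's default port
    }
```

The resulting values map onto `demo.launch(server_name=host, server_port=port)`.
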
9) **Acceptance checklist**
   - Gradio UI renders, accepts multiple images, selects among the three models, and returns validated JSON + bullet summary.
   - Prompt from `docs/problem-statement.md` is used unchanged.
   - Settings follow the `sample_code/settings.json` shape; README and `.env.example` supplied.