# Gradio Micro-Trend Detector — Build Plan
- Use the provided prompt verbatim: The system prompt in `docs/problem-statement.md` must be used as-is for all providers (OpenAI + Gemini). Only attach a minimal user prompt per request.
- Reuse the settings shape: Follow the `sample_code/settings.json` structure for all configurable keys (API keys, model names, reasoning effort, project/location flags).
- Reference samples: Mirror integration patterns shown in `sample_code/llm_client.py` (OpenAI Responses API) and any other helpers in `sample_code/` for payloads, retries, and settings resolution.
## Delivery Steps
### Requirements & schema
- Extract the output JSON contract from `docs/problem-statement.md` and codify it (Pydantic/TypedDict) for validation and downstream parsing.
- Decide on the response envelope: `{ "trends": <validated JSON>, "summary": <bullet list> }` (schema sketch below).
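A minimal sketch of that envelope as Pydantic models. The `TrendItem` fields here are placeholders; the real field names must come from the contract in `docs/problem-statement.md`:

```python
# Hypothetical envelope sketch -- TrendItem fields are placeholders until
# the real contract is extracted from docs/problem-statement.md.
from pydantic import BaseModel, Field


class TrendItem(BaseModel):
    name: str                                  # placeholder field
    confidence: float = Field(ge=0.0, le=1.0)  # placeholder field
    evidence: list[str] = []                   # placeholder field


class TrendResponse(BaseModel):
    """The { "trends": [...], "summary": [...] } envelope."""
    trends: list[TrendItem]
    summary: list[str]
```

Validating with `TrendResponse.model_validate_json(raw)` gives the pipeline a single failure point to hang the repair/reprompt path on.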
### Configuration layer
- Implement a `settings` loader that reads `settings.json` (and env overrides) using the same keys as `sample_code/settings.json` (`OPENAI_API_KEY`, `GEMINI_API_KEY`, `OPENAI_MODEL`, `OPENAI_REASONING_EFFORT`, `GOOGLE_GENAI_USE_VERTEXAI`, `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION`); see the sketch after this list.
- Provide `.env.example` and document required vars in `README`.
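A sketch of that loader. Env-vars-over-file precedence is an assumption here; confirm it against the settings resolution in `sample_code/llm_client.py`:

```python
# Settings loader sketch; keys mirror sample_code/settings.json.
# Assumption: environment variables take precedence over file values.
import json
import os

SETTINGS_KEYS = [
    "OPENAI_API_KEY", "GEMINI_API_KEY", "OPENAI_MODEL",
    "OPENAI_REASONING_EFFORT", "GOOGLE_GENAI_USE_VERTEXAI",
    "GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION",
]


def load_settings(path: str = "settings.json") -> dict:
    settings: dict = {}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            settings.update(json.load(f))
    # Env overrides win over file values.
    for key in SETTINGS_KEYS:
        if key in os.environ:
            settings[key] = os.environ[key]
    return settings
```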
### Model abstraction
- Create a unified `llm_clients.py` with `analyze(images: list[bytes], model: str) -> dict` (sketched below).
- Providers: OpenAI GPT-5 and GPT-5 mini via the Responses API; Gemini 3 vision endpoint with safety params aligned to the sample.
- Shared concerns: timeouts, retries/backoff, logging, optional temperature/max_tokens, deterministic defaults.
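A sketch of the OpenAI path only, with the prompts passed in as extra parameters for illustration. The payload shape follows the Responses API but should be checked against `sample_code/llm_client.py`; the Gemini branch (google-genai client) would dispatch on `model` at the top of the same function:

```python
# OpenAI path of analyze(); Gemini dispatch and retries/timeouts omitted.
import base64
import json

from openai import OpenAI


def analyze(images: list[bytes], model: str, system_prompt: str,
            user_prompt: str) -> dict:
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    content = [{"type": "input_text", "text": user_prompt}]
    for img in images:
        b64 = base64.b64encode(img).decode("ascii")
        content.append({"type": "input_image",
                        "image_url": f"data:image/jpeg;base64,{b64}"})
    response = client.responses.create(
        model=model,
        instructions=system_prompt,  # problem-statement prompt, verbatim
        input=[{"role": "user", "content": content}],
    )
    return json.loads(response.output_text)
```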
### Prompting strategy
- System prompt = the exact content from `docs/problem-statement.md` (no edits).
- User prompt per call: a short instruction to analyze the attached garment image(s) and emit only the specified JSON.
- Enforce "JSON first" responses; consider a post-parse repair/reprompt path if the JSON is invalid (see the sketch after this list).
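A sketch of that repair path: strict parse first, then a regex fallback for the outermost JSON object, then a single reprompt. `reprompt_fn` is a hypothetical callback into the model client, and the reprompt wording is illustrative:

```python
# Parse-or-repair sketch: strict parse, regex fallback, one reprompt that
# feeds the parse error back to the model. reprompt_fn is hypothetical.
import json
import re


def parse_or_repair(raw: str, reprompt_fn) -> dict:
    try:
        return json.loads(raw)
    except json.JSONDecodeError as err:
        match = re.search(r"\{.*\}", raw, re.DOTALL)  # outermost {...}
        if match:
            try:
                return json.loads(match.group(0))
            except json.JSONDecodeError:
                pass
        # Last resort: ask the model to fix its own output, once.
        retry = reprompt_fn(f"Previous output was invalid JSON ({err}). "
                            "Re-emit only the specified JSON.")
        return json.loads(retry)
```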
### Inference pipeline
- Image intake: validate file types, normalize to RGB, optionally downscale/compress for cost and latency (sketched below).
- Call the model abstraction; parse and validate JSON against the schema; if invalid, attempt a regex extraction or automatically reprompt the model with the validation error.
- Derive the bullet-point summary from the validated JSON (or accept a model-provided summary if it is valid).
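A Pillow sketch of the intake step; the 1024-px cap and JPEG quality are assumed defaults, not values from the problem statement:

```python
# Image intake sketch: validate format, convert to RGB, optionally
# downscale, re-encode as JPEG to cut upload size and latency.
import io

from PIL import Image

ALLOWED_FORMATS = {"JPEG", "PNG", "WEBP"}


def normalize_image(data: bytes, max_side: int = 1024) -> bytes:
    img = Image.open(io.BytesIO(data))
    if img.format not in ALLOWED_FORMATS:
        raise ValueError(f"Unsupported image format: {img.format}")
    img = img.convert("RGB")
    if max(img.size) > max_side:
        img.thumbnail((max_side, max_side))  # in-place, keeps aspect ratio
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=85)
    return buf.getvalue()
```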
### Gradio UI
- Inputs: `gr.Files` (multiple images), model dropdown (`GPT-5`, `GPT-5-mini`, `Gemini 3`), creativity/temperature slider, optional "downscale images" checkbox.
- Outputs: `gr.JSON` for the structured trends, `gr.Markdown` for the bullet summary; an error banner for validation issues; loading indicator/queue enabled.
- Add helper text describing acceptable formats and latency expectations; optional "Download JSON" button. (A layout sketch follows this list.)
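A Blocks sketch of that layout; `run_analysis` is a stub standing in for the inference pipeline above:

```python
# UI layout sketch; run_analysis is a stub for the real pipeline.
import gradio as gr


def run_analysis(files, model_name, temperature, downscale):
    # Placeholder: wire to the inference pipeline; returns (trends, summary).
    return {}, "*(summary appears here)*"


with gr.Blocks(title="Micro-Trend Detector") as demo:
    files = gr.Files(label="Garment images", file_types=["image"])
    model_name = gr.Dropdown(["GPT-5", "GPT-5-mini", "Gemini 3"],
                             value="GPT-5-mini", label="Model")
    temperature = gr.Slider(0.0, 1.0, value=0.2, label="Creativity")
    downscale = gr.Checkbox(value=True, label="Downscale images")
    run = gr.Button("Analyze")
    trends_out = gr.JSON(label="Trends")
    summary_out = gr.Markdown()
    run.click(run_analysis,
              inputs=[files, model_name, temperature, downscale],
              outputs=[trends_out, summary_out])

demo.queue().launch()
```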
### Observability & performance
- Log per-request latency, model used, image count/size, and validation outcomes (timing-wrapper sketch after this list).
- Default to GPT-5 mini to control cost; allow overrides via settings or UI.
- Optional image downscaling knob; consider concurrency limits via Gradio queue.
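A sketch of a timing wrapper around the pipeline call; the log field names are illustrative:

```python
# Per-request logging sketch: latency, model, image count/bytes, outcome.
import logging
import time

logger = logging.getLogger("trend_detector")


def timed_analyze(analyze_fn, images: list[bytes], model: str):
    start = time.perf_counter()
    ok = True
    try:
        return analyze_fn(images, model)
    except Exception:
        ok = False
        raise
    finally:
        logger.info("model=%s images=%d bytes=%d latency_ms=%.0f ok=%s",
                    model, len(images), sum(len(i) for i in images),
                    (time.perf_counter() - start) * 1000, ok)
```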
### Packaging & run
- Add `requirements.txt`/`pyproject` entries (gradio, openai>=1.x, google-genai/Vertex client, pydantic, pillow).
- Document `python app.py --settings settings.json` (or env-only) startup, including PORT/HOST env handling for deployment (entry-point sketch below).
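An entry-point sketch: `load_settings` is the loader from the configuration layer, and `build_ui` is a hypothetical factory returning the Blocks app:

```python
# app.py entry-point sketch with --settings flag and HOST/PORT env handling.
import argparse
import os


def main() -> None:
    parser = argparse.ArgumentParser(description="Micro-Trend Detector")
    parser.add_argument("--settings", default="settings.json",
                        help="Path to settings file (env vars override).")
    args = parser.parse_args()
    settings = load_settings(args.settings)  # from the configuration layer
    demo = build_ui(settings)                # hypothetical Blocks factory
    demo.queue().launch(
        server_name=os.environ.get("HOST", "0.0.0.0"),
        server_port=int(os.environ.get("PORT", "7860")),
    )


if __name__ == "__main__":
    main()
```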
### Acceptance checklist
- Gradio UI renders, accepts multiple images, selects among the three models, and returns validated JSON + bullet summary.
- Prompt from `docs/problem-statement.md` is used unchanged.
- Settings follow the `sample_code/settings.json` shape; README and `.env.example` supplied.