metadata
title: Micro Trend Research
emoji: 👗
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: 5.23.2
app_file: app.py
pinned: false

Micro-Trend Detection Gradio UI

Gradio app that ingests garment images, calls GPT-5/GPT-5 mini or Gemini 3 vision models, and returns the micro-trend JSON plus a bullet summary.

Setup

  • Python 3.11+ recommended.
  • Install deps: pip install -r requirements.txt
  • Configure secrets via environment variables, a .env file, or settings.json (keys mirror sample_code/settings.json):
    • OPENAI_API_KEY, GEMINI_API_KEY
    • OPENAI_MODEL (default gpt-5-mini), OPENAI_REASONING_EFFORT
    • GOOGLE_GENAI_USE_VERTEXAI, GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION
  • The system prompt lives in prompts/micro-trend-prompt.md and is loaded automatically.
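The configuration precedence described above can be sketched as follows. This is a minimal illustration, not the actual settings.py: the function name `load_settings`, the `KNOWN_KEYS` tuple, and the choice to ignore empty environment values are all assumptions, and .env handling is left out.

```python
import json
import os
from pathlib import Path

# Keys named in the Setup list. Only the OPENAI_MODEL default comes from
# this README; treating the rest as "unset" is an assumption.
KNOWN_KEYS = (
    "OPENAI_API_KEY",
    "GEMINI_API_KEY",
    "OPENAI_MODEL",
    "OPENAI_REASONING_EFFORT",
    "GOOGLE_GENAI_USE_VERTEXAI",
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
)
DEFAULTS = {"OPENAI_MODEL": "gpt-5-mini"}


def load_settings(path="settings.json", environ=None):
    """Merge defaults, then settings.json (if present), then environment values."""
    environ = os.environ if environ is None else environ
    settings = dict(DEFAULTS)

    file_path = Path(path)
    if file_path.is_file():
        settings.update(json.loads(file_path.read_text(encoding="utf-8")))

    # Environment variables override file values; empty strings are skipped.
    for key in KNOWN_KEYS:
        value = environ.get(key)
        if value:
            settings[key] = value
    return settings
```

Passing `environ` explicitly keeps the helper easy to exercise without touching the real process environment.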

Run

python app.py

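Set the PORT/HOST env vars if you need custom binding. Gradio's own documented variables are GRADIO_SERVER_PORT and GRADIO_SERVER_NAME, so use those if PORT/HOST have no effect.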

How it works

  • app.py builds the Gradio UI (multi-image upload, model dropdown, optional downscale).
  • pipeline.py calls the unified LLM client, extracts/validates the JSON, and derives summary bullets.
  • llm_clients.py wraps the OpenAI Responses API and Gemini 3 vision.
  • schemas.py provides structural validation and a summary helper.
  • settings.py loads config with env overrides.
  • The prompt is read from prompts/micro-trend-prompt.md unchanged.
  • Set ENABLE_GEMINI=false to hide Gemini from the model dropdown (useful on Spaces if Gemini isn’t available).
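The extract, validate, and summarize steps attributed to pipeline.py and schemas.py might look roughly like the sketch below. The function names and the `micro_trends` / `name` fields are hypothetical placeholders; the real schema is whatever schemas.py and the prompt define.

```python
import json

# Hypothetical top-level key; the actual schema lives in schemas.py.
REQUIRED_KEYS = ("micro_trends",)


def extract_json(text):
    """Return the first JSON object embedded in a model reply."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this brace, try the next one
        return obj
    raise ValueError("no JSON object found in model output")


def validate(payload):
    """Structural check only: required keys exist and have the expected type."""
    missing = [key for key in REQUIRED_KEYS if key not in payload]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    if not isinstance(payload["micro_trends"], list):
        raise ValueError("micro_trends must be a list")
    return payload


def summarize(payload):
    """Derive one bullet per trend that carries a (hypothetical) 'name' field."""
    return [
        f"- {trend['name']}"
        for trend in payload["micro_trends"]
        if isinstance(trend, dict) and "name" in trend
    ]
```

Scanning for the first decodable object tolerates replies that wrap the JSON in prose or a code fence, which is a common reason to extract rather than call `json.loads` directly.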

Gemini auth notes

  • Two modes:
    • Vertex (default): set GOOGLE_GENAI_USE_VERTEXAI=true and ensure ADC is available (e.g., gcloud auth application-default login) plus GOOGLE_CLOUD_PROJECT/GOOGLE_CLOUD_LOCATION. Confirm the chosen model exists in your Vertex region.
    • API key (HuggingFace / Studio): set GOOGLE_GENAI_USE_VERTEXAI=false and provide GEMINI_API_KEY.
  • Default Gemini model name is gemini-3-pro-preview (multimodal text-out). Adjust to a region-available model if needed.
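One way to express the two auth modes is a small helper that turns the environment into client keyword arguments. This is a sketch under assumptions: `gemini_client_kwargs` and `_as_bool` are hypothetical names, and it is assumed (not confirmed by this README) that llm_clients.py passes such arguments to `genai.Client(...)` from the google-genai SDK, which accepts `vertexai`, `project`, `location`, and `api_key`.

```python
import os


def _as_bool(value, default):
    """Parse common truthy strings; fall back to the default when unset."""
    if value is None or value.strip() == "":
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}


def gemini_client_kwargs(environ=None):
    """Pick Vertex or API-key auth from the environment (Vertex is the default)."""
    environ = os.environ if environ is None else environ
    use_vertex = _as_bool(environ.get("GOOGLE_GENAI_USE_VERTEXAI"), default=True)

    if use_vertex:
        project = environ.get("GOOGLE_CLOUD_PROJECT")
        location = environ.get("GOOGLE_CLOUD_LOCATION")
        if not project or not location:
            raise RuntimeError(
                "Vertex mode needs GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION"
            )
        # Credentials come from ADC, so no key is passed here.
        return {"vertexai": True, "project": project, "location": location}

    api_key = environ.get("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("API-key mode needs GEMINI_API_KEY")
    return {"api_key": api_key}
```

Failing early with a clear message is a design choice here; it surfaces a missing project, location, or key at startup rather than on the first model call.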

Notes

  • Testing is deferred for now; add unit tests later for schema validation and summary builder.
  • Downscale checkbox reduces images to 1024px for lower cost/latency. If downscale fails, original bytes are used.
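The downscale-with-fallback behavior could be sketched as below, assuming Pillow is the imaging library and that 1024px caps the longest side; neither detail is stated in this README, and `downscale` is a hypothetical name.

```python
import io

MAX_SIDE = 1024  # assumed to cap the longest side, in pixels


def downscale(data, max_side=MAX_SIDE):
    """Return resized image bytes, or the original bytes if anything fails."""
    try:
        # Imported lazily so a missing Pillow install also takes the fallback.
        from PIL import Image

        with Image.open(io.BytesIO(data)) as img:
            if max(img.size) <= max_side:
                return data  # already small enough
            image_format = img.format or "PNG"
            img.thumbnail((max_side, max_side))  # keeps aspect ratio, in place
            out = io.BytesIO()
            img.save(out, format=image_format)
            return out.getvalue()
    except Exception:
        # Mirrors the note above: on failure, use the original bytes.
        return data
```

Catching every exception is deliberate in this sketch, since the stated behavior is to never block an upload because resizing failed.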