---
title: Micro Trend Research
emoji: "👗"
colorFrom: red
colorTo: indigo
sdk: gradio
sdk_version: "5.23.2"
app_file: app.py
pinned: false
---
# Micro-Trend Detection Gradio UI
Gradio app that ingests garment images, calls GPT-5/GPT-5 mini or Gemini 3 vision models, and returns the micro-trend JSON plus a bullet summary.
## Setup
- Python 3.11+ recommended.
- Install deps: `pip install -r requirements.txt`
- Configure secrets via environment variables, a `.env` file, or `settings.json` (keys mirror `sample_code/settings.json`):
  - `OPENAI_API_KEY`, `GEMINI_API_KEY`
  - `OPENAI_MODEL` (default `gpt-5-mini`), `OPENAI_REASONING_EFFORT`
  - `GOOGLE_GENAI_USE_VERTEXAI`, `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION`
- The system prompt lives in `prompts/micro-trend-prompt.md` and is loaded automatically.
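
As an illustration of the layering described above, here is a minimal sketch of a loader that reads `settings.json` and lets environment variables override it. The function name `load_settings` and the exact precedence are assumptions for illustration; the real `settings.py` may differ, and `.env` handling is left out.

```python
import json
import os
from pathlib import Path

# Keys listed in the Setup section; the OPENAI_MODEL default comes from this README.
DEFAULTS = {"OPENAI_MODEL": "gpt-5-mini"}
KEYS = (
    "OPENAI_API_KEY", "GEMINI_API_KEY",
    "OPENAI_MODEL", "OPENAI_REASONING_EFFORT",
    "GOOGLE_GENAI_USE_VERTEXAI", "GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION",
)


def load_settings(path="settings.json", env=None):
    """Merge defaults, then settings.json (if present), then environment variables."""
    env = os.environ if env is None else env
    merged = dict(DEFAULTS)
    file = Path(path)
    if file.is_file():
        merged.update(json.loads(file.read_text(encoding="utf-8")))
    # Assumed precedence: non-empty environment variables win over file values.
    merged.update({key: env[key] for key in KEYS if env.get(key)})
    return merged
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.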
## Run
```bash
python app.py
```
Set the `PORT`/`HOST` env vars if you need custom binding. Gradio's own documented variables for the same purpose are `GRADIO_SERVER_PORT` and `GRADIO_SERVER_NAME`.
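
One way an app could turn `HOST` and `PORT` into launch arguments is sketched below. `resolve_binding` is a hypothetical helper, not necessarily what `app.py` does; the defaults shown are Gradio's usual `127.0.0.1:7860`.

```python
import os


def resolve_binding(env=None, default_host="127.0.0.1", default_port=7860):
    """Return (host, port) from HOST/PORT, falling back to the defaults."""
    env = os.environ if env is None else env
    host = env.get("HOST") or default_host
    raw_port = env.get("PORT")
    try:
        port = int(raw_port) if raw_port else default_port
    except ValueError:
        port = default_port  # ignore a malformed PORT rather than crash
    return host, port


# Usage with Gradio (not executed here; `demo` is your gr.Blocks/gr.Interface):
#   host, port = resolve_binding()
#   demo.launch(server_name=host, server_port=port)
```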
## How it works
- `app.py` builds the Gradio UI (multi-image upload, model dropdown, optional downscale).
- `pipeline.py` calls the unified LLM client, extracts/validates the JSON, and derives summary bullets.
- `llm_clients.py` wraps OpenAI Responses API and Gemini 3 vision.
- `schemas.py` provides structural validation and a summary helper.
- `settings.py` loads config with env overrides.
- The prompt is read from `prompts/micro-trend-prompt.md` and sent unchanged.
- Set `ENABLE_GEMINI=false` to hide Gemini from the model dropdown (useful on Spaces if Gemini isn’t available).
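
The extract/validate/summarize step attributed to `pipeline.py` and `schemas.py` could look roughly like the sketch below. The schema used here (a top-level `trends` list of objects with `name` and `description`) is purely hypothetical, as are the function names; the real micro-trend JSON is defined by the prompt.

```python
import json


def extract_json(text):
    """Return the first JSON object embedded in a model reply, or raise ValueError."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            # Not valid JSON from this brace; try the next one.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


def validate(payload):
    """Structural check against the hypothetical schema described above."""
    trends = payload.get("trends")
    if not isinstance(trends, list):
        raise ValueError("'trends' must be a list")
    for item in trends:
        if not isinstance(item, dict) or not {"name", "description"} <= item.keys():
            raise ValueError("each trend needs 'name' and 'description'")
    return payload


def summarize(payload):
    """Derive one bullet line per trend."""
    return [f"- {t['name']}: {t['description']}" for t in payload["trends"]]
```

Scanning for the first decodable object tolerates replies that wrap the JSON in prose or a code fence.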
## Gemini auth notes
- Two modes:
  - Vertex (default): set `GOOGLE_GENAI_USE_VERTEXAI=true` and ensure ADC is available (e.g., `gcloud auth application-default login`) plus `GOOGLE_CLOUD_PROJECT`/`GOOGLE_CLOUD_LOCATION`. Confirm the chosen model exists in your Vertex region.
  - API key (HuggingFace / Studio): set `GOOGLE_GENAI_USE_VERTEXAI=false` and provide `GEMINI_API_KEY`.
- Default Gemini model name is `gemini-3-pro-preview` (multimodal text-out). Adjust to a region-available model if needed.
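
The mode selection above can be sketched as a small function that picks constructor arguments for the `google-genai` client. `gemini_client_kwargs` and `as_bool` are hypothetical helpers, and the set of accepted truthy strings is an assumption; `llm_clients.py` may handle this differently.

```python
import os

TRUTHY = {"1", "true", "yes", "on"}  # assumed spellings for a "true" flag


def as_bool(value, default):
    """Parse an env-style flag; missing or empty values fall back to the default."""
    if value is None or value.strip() == "":
        return default
    return value.strip().lower() in TRUTHY


def gemini_client_kwargs(env=None):
    """Choose Vertex (default) or API-key arguments for the Gemini client."""
    env = os.environ if env is None else env
    if as_bool(env.get("GOOGLE_GENAI_USE_VERTEXAI"), default=True):
        project = env.get("GOOGLE_CLOUD_PROJECT")
        location = env.get("GOOGLE_CLOUD_LOCATION")
        if not (project and location):
            raise RuntimeError(
                "Vertex mode needs GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION"
            )
        return {"vertexai": True, "project": project, "location": location}
    api_key = env.get("GEMINI_API_KEY")
    if not api_key:
        raise RuntimeError("API-key mode needs GEMINI_API_KEY")
    return {"api_key": api_key}


# Usage (not executed here; requires the google-genai package):
#   from google import genai
#   client = genai.Client(**gemini_client_kwargs())
```

Failing early with a clear message is a design choice here: a missing project or key would otherwise only surface on the first model call.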
## Notes
- Testing is deferred for now; add unit tests later for schema validation and the summary builder.
- The downscale checkbox reduces images to 1024px to lower cost and latency. If downscaling fails, the original bytes are used.
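
The downscale-with-fallback behavior could be implemented along these lines. This is a sketch that assumes Pillow as the imaging library and interprets "1024px" as a cap on the longest side; both are assumptions, and `downscale_or_original` is a hypothetical name.

```python
import io


def downscale_or_original(data, max_side=1024):
    """Shrink an image so its longest side is at most max_side.

    On any failure (Pillow missing, undecodable bytes, save error)
    the original bytes are returned unchanged.
    """
    try:
        from PIL import Image  # Pillow; assumed dependency

        with Image.open(io.BytesIO(data)) as img:
            if max(img.size) <= max_side:
                return data  # already small enough; avoid re-encoding
            fmt = img.format or "PNG"
            img.thumbnail((max_side, max_side))  # in place, keeps aspect ratio
            out = io.BytesIO()
            img.save(out, format=fmt)
            return out.getvalue()
    except Exception:
        return data  # fall back to the original bytes
```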