---
title: LLM Compare
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "6.9.0"
app_file: app.py
pinned: false
---
# LLM Compare

A Gradio web app for side-by-side LLM comparison. Compare your Dify application against reference models from OpenAI, Anthropic, Google Gemini, Qwen, and Yi.

## Features

- **Two-column layout**: Your Dify app on the left, a selectable reference model on the right
- **Multiple providers**: OpenAI (GPT-4o), Anthropic (Claude), Google Gemini, Qwen, Yi
- **Overridable defaults**: Base URL and Model ID auto-fill from environment variables and can be edited per session
- **Evaluation workflow**: Comment on and grade (1–10) each model's response
- **Nickname tracking**: Every evaluation is tagged with the user's nickname
- **Excel export**: Download all evaluation data as `.xlsx`
## Setup

```bash
pip install -r requirements.txt
python app.py
```
## Environment Variables

Set these as **Hugging Face Spaces secrets** (Settings → Repository secrets) to provide defaults.
Users can override Base URL / Model ID in the UI at runtime.

### API Keys (required for each provider you use)

| Variable | Provider |
|---|---|
| `OPENAI_API_KEY` | OpenAI |
| `ANTHROPIC_API_KEY` | Anthropic |
| `GOOGLE_API_KEY` | Google Gemini |
| `DASHSCOPE_API_KEY` | Qwen (DashScope / Alibaba) |
| `YI_API_KEY` | Yi (01.AI) |
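
To see which providers are usable in a given environment, you can check which of the key variables above are set. The following is a minimal sketch and is not taken from `app.py`; the helper name `configured_providers` and the `API_KEY_VARS` mapping are hypothetical.

```python
import os

# Environment variable holding the API key for each provider (from the table above).
API_KEY_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google Gemini": "GOOGLE_API_KEY",
    "Qwen": "DASHSCOPE_API_KEY",
    "Yi": "YI_API_KEY",
}


def configured_providers(env=None):
    """Return the providers whose API key variable is set and non-empty.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    return [name for name, var in API_KEY_VARS.items() if env.get(var, "").strip()]


if __name__ == "__main__":
    print("Providers with keys:", configured_providers())
```

Running this locally before deploying is a cheap way to catch a missing or blank secret.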
### Base URL overrides (optional)

Override the default API endpoint for each provider. Useful for proxies or custom deployments.

| Variable | Default |
|---|---|
| `OPENAI_BASE_URL` | *(uses OpenAI SDK default)* |
| `ANTHROPIC_BASE_URL` | *(uses Anthropic SDK default)* |
| `GOOGLE_BASE_URL` | *(uses Google GenAI SDK default)* |
| `DASHSCOPE_BASE_URL` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `YI_BASE_URL` | `https://api.01.ai/v1` |
### Model ID overrides (optional)

Override the default model ID. Useful for switching to newer model versions without code changes.

| Variable | Default |
|---|---|
| `OPENAI_MODEL_ID` | `gpt-4o` |
| `OPENAI_MINI_MODEL_ID` | `gpt-4o-mini` |
| `ANTHROPIC_MODEL_ID` | `claude-sonnet-4-20250514` |
| `GOOGLE_MODEL_ID` | `gemini-2.0-flash` |
| `DASHSCOPE_MODEL_ID` | `qwen-plus` |
| `YI_MODEL_ID` | `yi-large` |
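
The Base URL and Model ID tables above amount to "use the environment variable if it is set, otherwise fall back to the listed default". The sketch below illustrates that lookup under stated assumptions: the `DEFAULTS` registry and the `resolve_settings` helper are hypothetical names, not the app's actual code, and `OPENAI_MINI_MODEL_ID` is left out for brevity.

```python
import os

# Defaults copied from the tables above.
# A base_url of None means "let the provider's SDK pick its own endpoint".
DEFAULTS = {
    "OPENAI": {"base_url": None, "model_id": "gpt-4o"},
    "ANTHROPIC": {"base_url": None, "model_id": "claude-sonnet-4-20250514"},
    "GOOGLE": {"base_url": None, "model_id": "gemini-2.0-flash"},
    "DASHSCOPE": {
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model_id": "qwen-plus",
    },
    "YI": {"base_url": "https://api.01.ai/v1", "model_id": "yi-large"},
}


def resolve_settings(prefix, env=None):
    """Return the effective base URL and model ID for a provider prefix.

    A set, non-empty <PREFIX>_BASE_URL or <PREFIX>_MODEL_ID wins over the default.
    """
    env = os.environ if env is None else env
    defaults = DEFAULTS[prefix]
    return {
        "base_url": env.get(f"{prefix}_BASE_URL") or defaults["base_url"],
        "model_id": env.get(f"{prefix}_MODEL_ID") or defaults["model_id"],
    }


if __name__ == "__main__":
    for prefix in DEFAULTS:
        print(prefix, resolve_settings(prefix))
```

Treating an empty string the same as an unset variable is a design choice made here, so that a blank secret does not wipe out a working default.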
## How it works

1. Select a reference model from the dropdown. **Base URL** and **Model ID** auto-fill from env vars (or registry defaults)
2. Edit Base URL / Model ID if needed (changes apply to the current session only)
3. Enter your prompt and click **Send to Both**
4. Grade and comment on each response, then click **Submit Evaluation**
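
Steps 1, 2, and 4 above can be sketched as follows. This is an illustration under assumptions, not the app's implementation: `pick_value`, the `Evaluation` class, and its field names are hypothetical. The order of precedence (UI edit, then env var, then registry default) and the 1 to 10 grade range come from this README.

```python
from dataclasses import dataclass


def pick_value(ui_value, env_value, default):
    """Return the first non-blank value: UI edit, then env var, then default."""
    for candidate in (ui_value, env_value, default):
        if candidate and candidate.strip():
            return candidate.strip()
    return None


@dataclass
class Evaluation:
    """One graded response, tagged with the evaluator's nickname."""

    nickname: str
    model: str
    prompt: str
    grade: int
    comment: str = ""

    def __post_init__(self):
        # Reject records that could not be attributed or graded consistently.
        if not self.nickname.strip():
            raise ValueError("nickname is required")
        if not 1 <= self.grade <= 10:
            raise ValueError("grade must be between 1 and 10")


if __name__ == "__main__":
    model_id = pick_value("", None, "gpt-4o")  # nothing edited, no env var set
    record = Evaluation("alice", model_id, "Summarize this text.", grade=8)
    print(record)
```

Validating at construction time keeps out-of-range grades out of whatever is later exported.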
## Deployment

Deploy on Hugging Face Spaces with the Gradio SDK. Set the API keys and any optional overrides as repository secrets under Settings.