metadata
title: LLM Compare
emoji: π
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 6.9.0
app_file: app.py
pinned: false
LLM Compare
A Gradio web app for side-by-side LLM comparison. Compare your Dify application against reference models from OpenAI, Anthropic, Google Gemini, Qwen, and Yi.
Features
- Two-column layout: Your Dify app on the left, a selectable reference model on the right
- Multiple providers: OpenAI (GPT-4o), Anthropic (Claude), Google Gemini, Qwen, Yi
- Overridable defaults: Base URL and Model ID auto-fill from env vars but can be edited per-session
- Evaluation workflow: Comment on and grade (1–10) each model's response
- Nickname tracking: All evaluations are tagged with the user's nickname
- Excel export: Download all evaluation data as `.xlsx`
Setup
pip install -r requirements.txt
python app.py
Environment Variables
Set these as Hugging Face Spaces secrets (Settings → Repository secrets) to provide defaults. Users can override Base URL / Model ID in the UI at runtime.
API Keys (required for each provider you use)
| Variable | Provider |
|---|---|
| `OPENAI_API_KEY` | OpenAI |
| `ANTHROPIC_API_KEY` | Anthropic |
| `GOOGLE_API_KEY` | Google Gemini |
| `DASHSCOPE_API_KEY` | Qwen (DashScope / Alibaba) |
| `YI_API_KEY` | Yi (01.AI) |
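As an illustration of how the key variables above might be consumed, the following sketch lists the providers whose API key is set. The names `API_KEY_VARS` and `configured_providers` are hypothetical and are not taken from `app.py`; only the variable names come from the table.

```python
import os
from typing import Mapping

# Provider label -> API key variable, copied from the table above.
# The dictionary itself is illustrative, not the app's actual registry.
API_KEY_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google Gemini": "GOOGLE_API_KEY",
    "Qwen": "DASHSCOPE_API_KEY",
    "Yi": "YI_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the providers whose key variable holds a non-blank value."""
    return [
        name
        for name, var in API_KEY_VARS.items()
        if env.get(var, "").strip()
    ]
```

Passing a plain dictionary instead of `os.environ` makes the check easy to try without touching real secrets, for example `configured_providers({"OPENAI_API_KEY": "sk-test"})`.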
Base URL overrides (optional)
Override the default API endpoint for each provider. Useful for proxies or custom deployments.
| Variable | Default |
|---|---|
| `OPENAI_BASE_URL` | (uses OpenAI SDK default) |
| `ANTHROPIC_BASE_URL` | (uses Anthropic SDK default) |
| `GOOGLE_BASE_URL` | (uses Google GenAI SDK default) |
| `DASHSCOPE_BASE_URL` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `YI_BASE_URL` | `https://api.01.ai/v1` |
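One way to read the defaults above is as a lookup with an environment override. This is a minimal sketch, assuming that a missing value is represented as `None` so that the provider SDK falls back to its own endpoint; `BASE_URL_DEFAULTS` and `resolve_base_url` are hypothetical names, not the app's own.

```python
import os
from typing import Mapping, Optional

# Defaults copied from the table above. None stands for "use the SDK default",
# which is an assumption about how the app represents that case.
BASE_URL_DEFAULTS: dict[str, Optional[str]] = {
    "OPENAI_BASE_URL": None,
    "ANTHROPIC_BASE_URL": None,
    "GOOGLE_BASE_URL": None,
    "DASHSCOPE_BASE_URL": "https://dashscope.aliyuncs.com/compatible-mode/v1",
    "YI_BASE_URL": "https://api.01.ai/v1",
}


def resolve_base_url(var: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Prefer a non-blank environment value, otherwise the table default."""
    value = env.get(var, "").strip()
    return value or BASE_URL_DEFAULTS[var]
```

With a proxy in place, `resolve_base_url("OPENAI_BASE_URL", {"OPENAI_BASE_URL": "https://proxy.example/v1"})` would return the proxy address (the URL here is a placeholder).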
Model ID overrides (optional)
Override the default model ID. Useful for switching to newer model versions without code changes.
| Variable | Default |
|---|---|
| `OPENAI_MODEL_ID` | `gpt-4o` |
| `OPENAI_MINI_MODEL_ID` | `gpt-4o-mini` |
| `ANTHROPIC_MODEL_ID` | `claude-sonnet-4-20250514` |
| `GOOGLE_MODEL_ID` | `gemini-2.0-flash` |
| `DASHSCOPE_MODEL_ID` | `qwen-plus` |
| `YI_MODEL_ID` | `yi-large` |
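Because the Model ID can also be edited in the UI, three sources are in play: the session value, the environment variable, and the built-in default. The sketch below assumes they are consulted in that order; the function name `resolve_model_id` and the `ui_value` parameter are hypothetical.

```python
import os
from typing import Mapping

# Built-in defaults copied from the table above.
MODEL_ID_DEFAULTS = {
    "OPENAI_MODEL_ID": "gpt-4o",
    "OPENAI_MINI_MODEL_ID": "gpt-4o-mini",
    "ANTHROPIC_MODEL_ID": "claude-sonnet-4-20250514",
    "GOOGLE_MODEL_ID": "gemini-2.0-flash",
    "DASHSCOPE_MODEL_ID": "qwen-plus",
    "YI_MODEL_ID": "yi-large",
}


def resolve_model_id(
    var: str,
    ui_value: str = "",
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the first non-blank value among UI edit, env var, and default."""
    for candidate in (ui_value, env.get(var, "")):
        if candidate.strip():
            return candidate.strip()
    return MODEL_ID_DEFAULTS[var]
```

Keeping the UI value as an argument rather than writing it back to the environment matches the stated behaviour that edits apply to the current session only.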
How it works
- Select a reference model from the dropdown. Base URL and Model ID auto-fill from env vars (or registry defaults)
- Edit Base URL / Model ID if needed (changes apply to current session only)
- Enter your prompt and click Send to Both
- Grade and comment on each response, then Submit Evaluation
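The last step above produces one record per graded response. The following is a minimal sketch of what such a record could look like, assuming a nickname is mandatory and grades are integers from 1 to 10; the `Evaluation` class, its field names, and `to_row` are hypothetical and do not describe the app's actual storage format.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Evaluation:
    """One graded response (illustrative field names)."""

    nickname: str
    prompt: str
    model_label: str  # e.g. "Dify" or the selected reference model
    grade: int
    comment: str = ""

    def __post_init__(self) -> None:
        if not self.nickname.strip():
            raise ValueError("nickname is required")
        # bool is a subclass of int, so reject it explicitly.
        is_int = isinstance(self.grade, int) and not isinstance(self.grade, bool)
        if not is_int or not 1 <= self.grade <= 10:
            raise ValueError("grade must be an integer from 1 to 10")


def to_row(evaluation: Evaluation) -> dict:
    """Flatten a record into a dict, e.g. one spreadsheet row."""
    return asdict(evaluation)
```

Validating at construction time keeps out-of-range grades from ever reaching the exported data.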
Deployment
Deploy on Hugging Face Spaces with the Gradio SDK. Set the API keys and any optional overrides as repository secrets under Settings.