---
title: LLM Compare
emoji: 🔍
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "6.9.0"
app_file: app.py
pinned: false
---
# LLM Compare
A Gradio web app for side-by-side LLM comparison. Compare your Dify application against reference models from OpenAI, Anthropic, Google Gemini, Qwen, and Yi.
## Features
- **Two-column layout**: Your Dify app on the left, a selectable reference model on the right
- **Multiple providers**: OpenAI (GPT-4o), Anthropic (Claude), Google Gemini, Qwen, Yi
- **Overridable defaults**: Base URL and Model ID auto-fill from env vars but can be edited per-session
- **Evaluation workflow**: Comment and grade (1–10) each model's response
- **Nickname tracking**: Every evaluation is tagged with the user's nickname
- **Excel export**: Download all evaluation data as `.xlsx`
## Setup
```bash
pip install -r requirements.txt
python app.py
```
## Environment Variables
Set these as **Hugging Face Spaces secrets** (Settings → Repository secrets) to provide defaults.
Users can override Base URL / Model ID in the UI at runtime.
### API Keys (required for each provider you use)
| Variable | Provider |
|---|---|
| `OPENAI_API_KEY` | OpenAI |
| `ANTHROPIC_API_KEY` | Anthropic |
| `GOOGLE_API_KEY` | Google Gemini |
| `DASHSCOPE_API_KEY` | Qwen (DashScope / Alibaba) |
| `YI_API_KEY` | Yi (01.AI) |
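To see at a glance which providers are usable in a given environment, you can check which of the variables above are set. This is a minimal sketch, not the app's actual code: the `API_KEY_VARS` mapping and the `configured_providers` helper are hypothetical names introduced for illustration.

```python
import os
from typing import Mapping, Optional

# Variable names come from the table above; the mapping itself is illustrative.
API_KEY_VARS = {
    "OpenAI": "OPENAI_API_KEY",
    "Anthropic": "ANTHROPIC_API_KEY",
    "Google Gemini": "GOOGLE_API_KEY",
    "Qwen": "DASHSCOPE_API_KEY",
    "Yi": "YI_API_KEY",
}


def configured_providers(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the providers whose API key variable holds a non-empty value."""
    env = os.environ if env is None else env
    return [
        name
        for name, var in API_KEY_VARS.items()
        if env.get(var, "").strip()  # treat blank or whitespace-only values as unset
    ]


if __name__ == "__main__":
    print("Providers with an API key set:", configured_providers())
```

Passing a plain dict as `env` makes the check easy to try without touching real secrets.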
### Base URL overrides (optional)
Override the default API endpoint for each provider. Useful for proxies or custom deployments.
| Variable | Default |
|---|---|
| `OPENAI_BASE_URL` | *(uses OpenAI SDK default)* |
| `ANTHROPIC_BASE_URL` | *(uses Anthropic SDK default)* |
| `GOOGLE_BASE_URL` | *(uses Google GenAI SDK default)* |
| `DASHSCOPE_BASE_URL` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
| `YI_BASE_URL` | `https://api.01.ai/v1` |
### Model ID overrides (optional)
Override the default model ID. Useful for switching to newer model versions without code changes.
| Variable | Default |
|---|---|
| `OPENAI_MODEL_ID` | `gpt-4o` |
| `OPENAI_MINI_MODEL_ID` | `gpt-4o-mini` |
| `ANTHROPIC_MODEL_ID` | `claude-sonnet-4-20250514` |
| `GOOGLE_MODEL_ID` | `gemini-2.0-flash` |
| `DASHSCOPE_MODEL_ID` | `qwen-plus` |
| `YI_MODEL_ID` | `yi-large` |
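The override behaviour described above (use the environment variable if present, otherwise fall back to the built-in default) can be sketched as follows. This is an assumption about how such a lookup might be written, not the code in `app.py`: the `DEFAULTS` registry layout and the `resolve_defaults` function are hypothetical, and `OPENAI_MINI_MODEL_ID` is left out for brevity.

```python
import os
from typing import Mapping, Optional, Tuple

# Defaults copied from the tables above. None means "let the provider SDK decide".
# The registry layout and key names are assumptions made for this sketch.
DEFAULTS = {
    "OPENAI": {"base_url": None, "model_id": "gpt-4o"},
    "ANTHROPIC": {"base_url": None, "model_id": "claude-sonnet-4-20250514"},
    "GOOGLE": {"base_url": None, "model_id": "gemini-2.0-flash"},
    "DASHSCOPE": {
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model_id": "qwen-plus",
    },
    "YI": {"base_url": "https://api.01.ai/v1", "model_id": "yi-large"},
}


def resolve_defaults(
    prefix: str, env: Optional[Mapping[str, str]] = None
) -> Tuple[Optional[str], str]:
    """Return (base_url, model_id) for a provider prefix such as "YI"."""
    env = os.environ if env is None else env
    fallback = DEFAULTS[prefix]
    # An unset or empty variable falls through to the registry default.
    base_url = env.get(f"{prefix}_BASE_URL") or fallback["base_url"]
    model_id = env.get(f"{prefix}_MODEL_ID") or fallback["model_id"]
    return base_url, model_id


if __name__ == "__main__":
    for prefix in DEFAULTS:
        print(prefix, resolve_defaults(prefix))
```

Values resolved this way would only pre-fill the UI fields; anything the user types in the session would take precedence over them.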
## How it works
1. Select a reference model from the dropdown; **Base URL** and **Model ID** auto-fill from env vars (or registry defaults)
2. Edit Base URL / Model ID if needed (changes apply to current session only)
3. Enter your prompt and click **Send to Both**
4. Grade and comment on each response, then **Submit Evaluation**
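The final step produces one graded record per response. A rough sketch of what such a record could look like is shown here; the `Evaluation` dataclass, its field names, and the `build_rows` helper are hypothetical, and treating an empty nickname as an error is an assumption rather than documented behaviour.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass
class Evaluation:
    """One graded response. Field names are illustrative, not the app's schema."""

    nickname: str
    model: str
    prompt: str
    response: str
    grade: int
    comment: str = ""

    def __post_init__(self) -> None:
        # Assumption: a nickname is required, since every evaluation is tagged with one.
        if not self.nickname.strip():
            raise ValueError("nickname is required")
        # Grades use the 1 to 10 scale described in the feature list.
        if not 1 <= self.grade <= 10:
            raise ValueError("grade must be between 1 and 10")


def build_rows(
    nickname: str, prompt: str, results: Dict[str, Tuple[str, int, str]]
) -> List[dict]:
    """Turn {model: (response, grade, comment)} into one flat row per model."""
    return [
        asdict(Evaluation(nickname, model, prompt, response, grade, comment))
        for model, (response, grade, comment) in results.items()
    ]


if __name__ == "__main__":
    rows = build_rows(
        "alice",
        "Summarise this ticket.",
        {"Dify app": ("Summary A", 8, "clear"), "gpt-4o": ("Summary B", 7, "")},
    )
    for row in rows:
        print(row)
```

Flat dictionaries like these map directly onto spreadsheet rows, which is convenient for an `.xlsx` export.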
## Deployment
Deploy on Hugging Face Spaces using the Gradio SDK. Set the API keys and any optional overrides as repository secrets in Settings.