Update README.md
README.md
CHANGED
@@ -11,12 +11,13 @@ pinned: false

 # LLM Compare

-A Gradio web app for side-by-side LLM comparison. Compare your

 ## Features

-- **Two-column layout**: Your
 - **Multiple providers**: OpenAI (GPT-4o), Anthropic (Claude), Google Gemini, Qwen, Yi
 - **Evaluation workflow**: Comment and grade (1–10) each model's response
 - **Nickname tracking**: All evaluations tagged with user nickname
 - **Excel export**: Download all evaluation data as `.xlsx`
@@ -30,19 +31,52 @@ python app.py

 ## Environment Variables

-Set these as

 | Variable | Provider |
 |---|---|
 | `OPENAI_API_KEY` | OpenAI |
 | `ANTHROPIC_API_KEY` | Anthropic |
 | `GOOGLE_API_KEY` | Google Gemini |
-| `DASHSCOPE_API_KEY` | Qwen (DashScope) |
 | `YI_API_KEY` | Yi (01.AI) |

-

 ## Deployment

-Deploy on HuggingFace Spaces with Gradio SDK. Set the API keys as repository secrets.


 # LLM Compare

+A Gradio web app for side-by-side LLM comparison. Compare your Dify application against reference models from OpenAI, Anthropic, Google Gemini, Qwen, and Yi.

 ## Features

+- **Two-column layout**: Your Dify app on the left, a selectable reference model on the right
 - **Multiple providers**: OpenAI (GPT-4o), Anthropic (Claude), Google Gemini, Qwen, Yi
+- **Overridable defaults**: Base URL and Model ID auto-fill from env vars but can be edited per-session
 - **Evaluation workflow**: Comment and grade (1–10) each model's response
 - **Nickname tracking**: All evaluations tagged with user nickname
 - **Excel export**: Download all evaluation data as `.xlsx`


 ## Environment Variables

+Set these as **Hugging Face Spaces secrets** (Settings → Repository secrets) to provide defaults.
+Users can override Base URL / Model ID in the UI at runtime.
+
+### API Keys (required for each provider you use)

 | Variable | Provider |
 |---|---|
 | `OPENAI_API_KEY` | OpenAI |
 | `ANTHROPIC_API_KEY` | Anthropic |
 | `GOOGLE_API_KEY` | Google Gemini |
+| `DASHSCOPE_API_KEY` | Qwen (DashScope / Alibaba) |
 | `YI_API_KEY` | Yi (01.AI) |

+### Base URL overrides (optional)
+
+Override the default API endpoint for each provider. Useful for proxies or custom deployments.
+
+| Variable | Default |
+|---|---|
+| `OPENAI_BASE_URL` | *(uses OpenAI SDK default)* |
+| `ANTHROPIC_BASE_URL` | *(uses Anthropic SDK default)* |
+| `GOOGLE_BASE_URL` | *(uses Google GenAI SDK default)* |
+| `DASHSCOPE_BASE_URL` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
+| `YI_BASE_URL` | `https://api.01.ai/v1` |
+
+### Model ID overrides (optional)
+
+Override the default model ID. Useful for switching to newer model versions without code changes.
+
+| Variable | Default |
+|---|---|
+| `OPENAI_MODEL_ID` | `gpt-4o` |
+| `OPENAI_MINI_MODEL_ID` | `gpt-4o-mini` |
+| `ANTHROPIC_MODEL_ID` | `claude-sonnet-4-20250514` |
+| `GOOGLE_MODEL_ID` | `gemini-2.0-flash` |
+| `DASHSCOPE_MODEL_ID` | `qwen-plus` |
+| `YI_MODEL_ID` | `yi-large` |
+
+## How it works
+
+1. Select a reference model from the dropdown: **Base URL** and **Model ID** auto-fill from env vars (or registry defaults)
+2. Edit Base URL / Model ID if needed (changes apply to current session only)
+3. Enter your prompt and click **Send to Both**
+4. Grade and comment on each response, then **Submit Evaluation**

 ## Deployment

+Deploy on HuggingFace Spaces with Gradio SDK. Set the API keys and optional overrides as repository secrets in Settings.

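
The updated README describes a three-level lookup for Base URL and Model ID: a value edited in the UI for the current session, then the environment variable, then a registry default. The sketch below shows one way that precedence could be implemented. It is not taken from `app.py`: the `REGISTRY` dict, the `ProviderConfig` class, and the `resolve_config` function are hypothetical names, and only the environment variable names and default values come from the tables in the diff. `OPENAI_MINI_MODEL_ID` is left out for brevity.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Hypothetical registry of defaults, keyed by the env var prefix of each provider.
# The values mirror the README tables; None means "use the SDK's own default".
REGISTRY = {
    "OPENAI": {"base_url": None, "model_id": "gpt-4o"},
    "ANTHROPIC": {"base_url": None, "model_id": "claude-sonnet-4-20250514"},
    "GOOGLE": {"base_url": None, "model_id": "gemini-2.0-flash"},
    "DASHSCOPE": {
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model_id": "qwen-plus",
    },
    "YI": {"base_url": "https://api.01.ai/v1", "model_id": "yi-large"},
}


@dataclass(frozen=True)
class ProviderConfig:
    api_key: Optional[str]   # None if the secret is not configured
    base_url: Optional[str]  # None lets the SDK choose its endpoint
    model_id: str


def resolve_config(
    provider: str,
    env: Mapping[str, str] = os.environ,
    ui_base_url: str = "",
    ui_model_id: str = "",
) -> ProviderConfig:
    """Resolve settings with precedence: UI value > env var > registry default."""
    defaults = REGISTRY[provider]
    base_url = (
        ui_base_url.strip()
        or env.get(f"{provider}_BASE_URL")
        or defaults["base_url"]
    )
    model_id = (
        ui_model_id.strip()
        or env.get(f"{provider}_MODEL_ID")
        or defaults["model_id"]
    )
    return ProviderConfig(
        api_key=env.get(f"{provider}_API_KEY"),
        base_url=base_url,
        model_id=model_id,
    )
```

Passing `env` as a parameter keeps the function easy to test without touching the real process environment. Because the UI values are plain arguments rather than stored state, an override would last only as long as the caller keeps supplying it, which matches the "current session only" behavior the README describes.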