Space: crossingk/llm_compare


crossingk committed on Mar 25

Commit ec2f575 (verified) · 1 parent: 7098a34

Update README.md

Files changed (1)

README.md +40 -6

@@ -11,12 +11,13 @@ pinned: false
 # LLM Compare
-A Gradio web app for side-by-side LLM comparison. Compare your own model (via any OpenAI-compatible API endpoint) against reference models from OpenAI, Anthropic, Google Gemini, Qwen, and Yi.
 ## Features
-- **Two-column layout**: Your custom model on the left, a selectable reference model on the right
 - **Multiple providers**: OpenAI (GPT-4o), Anthropic (Claude), Google Gemini, Qwen, Yi
 - **Evaluation workflow**: Comment and grade (1–10) each model's response
 - **Nickname tracking**: All evaluations tagged with user nickname
 - **Excel export**: Download all evaluation data as `.xlsx`
@@ -30,19 +31,52 @@ python app.py
 ## Environment Variables
-Set these as env vars (or HuggingFace Spaces secrets) to provide default API keys:
 | Variable | Provider |
 |---|---|
 | `OPENAI_API_KEY` | OpenAI |
 | `ANTHROPIC_API_KEY` | Anthropic |
 | `GOOGLE_API_KEY` | Google Gemini |
-| `DASHSCOPE_API_KEY` | Qwen (DashScope) |
 | `YI_API_KEY` | Yi (01.AI) |
-Users can override keys per-session in the UI. Keys are never stored.
 ## Deployment
-Deploy on HuggingFace Spaces with Gradio SDK. Set the API keys as repository secrets.

 # LLM Compare
+A Gradio web app for side-by-side LLM comparison. Compare your Dify application against reference models from OpenAI, Anthropic, Google Gemini, Qwen, and Yi.
 ## Features
+- **Two-column layout**: Your Dify app on the left, a selectable reference model on the right
 - **Multiple providers**: OpenAI (GPT-4o), Anthropic (Claude), Google Gemini, Qwen, Yi
+- **Overridable defaults**: Base URL and Model ID auto-fill from env vars but can be edited per-session
 - **Evaluation workflow**: Comment and grade (1–10) each model's response
 - **Nickname tracking**: All evaluations tagged with user nickname
 - **Excel export**: Download all evaluation data as `.xlsx`
 ## Environment Variables
+Set these as **Hugging Face Spaces secrets** (Settings → Repository secrets) to provide defaults.
+Users can override Base URL / Model ID in the UI at runtime.
+### API Keys (required for each provider you use)
 | Variable | Provider |
 |---|---|
 | `OPENAI_API_KEY` | OpenAI |
 | `ANTHROPIC_API_KEY` | Anthropic |
 | `GOOGLE_API_KEY` | Google Gemini |
+| `DASHSCOPE_API_KEY` | Qwen (DashScope / Alibaba) |
 | `YI_API_KEY` | Yi (01.AI) |
+### Base URL overrides (optional)
+Override the default API endpoint for each provider. Useful for proxies or custom deployments.
+| Variable | Default |
+|---|---|
+| `OPENAI_BASE_URL` | *(uses OpenAI SDK default)* |
+| `ANTHROPIC_BASE_URL` | *(uses Anthropic SDK default)* |
+| `GOOGLE_BASE_URL` | *(uses Google GenAI SDK default)* |
+| `DASHSCOPE_BASE_URL` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
+| `YI_BASE_URL` | `https://api.01.ai/v1` |
+### Model ID overrides (optional)
+Override the default model ID. Useful for switching to newer model versions without code changes.
+| Variable | Default |
+|---|---|
+| `OPENAI_MODEL_ID` | `gpt-4o` |
+| `OPENAI_MINI_MODEL_ID` | `gpt-4o-mini` |
+| `ANTHROPIC_MODEL_ID` | `claude-sonnet-4-20250514` |
+| `GOOGLE_MODEL_ID` | `gemini-2.0-flash` |
+| `DASHSCOPE_MODEL_ID` | `qwen-plus` |
+| `YI_MODEL_ID` | `yi-large` |
+## How it works
+1. Select a reference model from the dropdown — **Base URL** and **Model ID** auto-fill from env vars (or registry defaults)
+2. Edit Base URL / Model ID if needed (changes apply to current session only)
+3. Enter your prompt and click **Send to Both**
+4. Grade and comment on each response, then **Submit Evaluation**
 ## Deployment
+Deploy on HuggingFace Spaces with Gradio SDK. Set the API keys and optional overrides as repository secrets in Settings.
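
The override behaviour described in the updated README (env vars supply defaults, the UI can override them per session) can be sketched as follows. This is a minimal illustration and not the code in `app.py`, which this diff does not show. The `REGISTRY` dict, the `resolve_settings` function, and the precedence order (session value, then env var, then registry default) are assumptions inferred from the README text; the default values are copied from its tables. `OPENAI_MINI_MODEL_ID` is left out for brevity.

```python
import os
from typing import Mapping, Optional

# Hypothetical registry of built-in defaults, copied from the README tables.
# A base_url of None means "let the provider SDK use its own default endpoint".
REGISTRY = {
    "OPENAI": {"base_url": None, "model_id": "gpt-4o"},
    "ANTHROPIC": {"base_url": None, "model_id": "claude-sonnet-4-20250514"},
    "GOOGLE": {"base_url": None, "model_id": "gemini-2.0-flash"},
    "DASHSCOPE": {
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model_id": "qwen-plus",
    },
    "YI": {"base_url": "https://api.01.ai/v1", "model_id": "yi-large"},
}


def resolve_settings(
    provider: str,
    session_base_url: str = "",
    session_model_id: str = "",
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Resolve settings for one provider (assumed precedence:
    session value from the UI > environment variable > registry default)."""
    env = os.environ if env is None else env
    defaults = REGISTRY[provider]

    # An empty or whitespace-only UI field falls through to the env var,
    # and an unset or empty env var falls through to the registry default.
    base_url = (
        session_base_url.strip()
        or env.get(f"{provider}_BASE_URL")
        or defaults["base_url"]
    )
    model_id = (
        session_model_id.strip()
        or env.get(f"{provider}_MODEL_ID")
        or defaults["model_id"]
    )
    return {
        "base_url": base_url,
        "model_id": model_id,
        # The API key is read from the environment only; None if unset.
        "api_key": env.get(f"{provider}_API_KEY"),
    }
```

Passing `env` explicitly keeps the function easy to test; in the Space itself the default `os.environ` would pick up the repository secrets. For example, `resolve_settings("YI", session_model_id="my-model")` would keep the Yi base URL from the environment or registry while using the model ID typed into the UI.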