crossingk committed · Commit ec2f575 · verified · 1 Parent(s): 7098a34

Update README.md

  1. README.md +40 -6
README.md CHANGED
@@ -11,12 +11,13 @@ pinned: false
 
 # LLM Compare
 
-A Gradio web app for side-by-side LLM comparison. Compare your own model (via any OpenAI-compatible API endpoint) against reference models from OpenAI, Anthropic, Google Gemini, Qwen, and Yi.
+A Gradio web app for side-by-side LLM comparison. Compare your Dify application against reference models from OpenAI, Anthropic, Google Gemini, Qwen, and Yi.
 
 ## Features
 
-- **Two-column layout**: Your custom model on the left, a selectable reference model on the right
+- **Two-column layout**: Your Dify app on the left, a selectable reference model on the right
 - **Multiple providers**: OpenAI (GPT-4o), Anthropic (Claude), Google Gemini, Qwen, Yi
+- **Overridable defaults**: Base URL and Model ID auto-fill from env vars but can be edited per-session
 - **Evaluation workflow**: Comment and grade (1–10) each model's response
 - **Nickname tracking**: All evaluations tagged with user nickname
 - **Excel export**: Download all evaluation data as `.xlsx`
@@ -30,19 +31,52 @@ python app.py
 
 ## Environment Variables
 
-Set these as env vars (or HuggingFace Spaces secrets) to provide default API keys:
+Set these as **Hugging Face Spaces secrets** (Settings Repository secrets) to provide defaults.
+Users can override Base URL / Model ID in the UI at runtime.
+
+### API Keys (required for each provider you use)
 
 | Variable | Provider |
 |---|---|
 | `OPENAI_API_KEY` | OpenAI |
 | `ANTHROPIC_API_KEY` | Anthropic |
 | `GOOGLE_API_KEY` | Google Gemini |
-| `DASHSCOPE_API_KEY` | Qwen (DashScope) |
+| `DASHSCOPE_API_KEY` | Qwen (DashScope / Alibaba) |
 | `YI_API_KEY` | Yi (01.AI) |
 
-Users can override keys per-session in the UI. Keys are never stored.
+### Base URL overrides (optional)
+
+Override the default API endpoint for each provider. Useful for proxies or custom deployments.
+
+| Variable | Default |
+|---|---|
+| `OPENAI_BASE_URL` | *(uses OpenAI SDK default)* |
+| `ANTHROPIC_BASE_URL` | *(uses Anthropic SDK default)* |
+| `GOOGLE_BASE_URL` | *(uses Google GenAI SDK default)* |
+| `DASHSCOPE_BASE_URL` | `https://dashscope.aliyuncs.com/compatible-mode/v1` |
+| `YI_BASE_URL` | `https://api.01.ai/v1` |
+
+### Model ID overrides (optional)
+
+Override the default model ID. Useful for switching to newer model versions without code changes.
+
+| Variable | Default |
+|---|---|
+| `OPENAI_MODEL_ID` | `gpt-4o` |
+| `OPENAI_MINI_MODEL_ID` | `gpt-4o-mini` |
+| `ANTHROPIC_MODEL_ID` | `claude-sonnet-4-20250514` |
+| `GOOGLE_MODEL_ID` | `gemini-2.0-flash` |
+| `DASHSCOPE_MODEL_ID` | `qwen-plus` |
+| `YI_MODEL_ID` | `yi-large` |
+
+## How it works
+
+1. Select a reference model from the dropdown — **Base URL** and **Model ID** auto-fill from env vars (or registry defaults)
+2. Edit Base URL / Model ID if needed (changes apply to current session only)
+3. Enter your prompt and click **Send to Both**
+4. Grade and comment on each response, then **Submit Evaluation**
 
 ## Deployment
 
-Deploy on HuggingFace Spaces with Gradio SDK. Set the API keys as repository secrets.
+Deploy on HuggingFace Spaces with Gradio SDK. Set the API keys and optional overrides as repository secrets in Settings.
 
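
Note on the override behaviour described in the new README text: the diff does not include `app.py`, so the lookup logic is not visible here. The following is only a minimal sketch of how the documented precedence (a per-session UI edit, then the env var, then the registry default) could be implemented. `PROVIDER_DEFAULTS` and `resolve_provider` are hypothetical names, not taken from the repository. The default values are copied from the README tables, and `OPENAI_MINI_MODEL_ID` is left out for brevity.

```python
import os

# Hypothetical registry of the defaults documented in the README tables.
# A base_url of None means "let the provider SDK use its own default endpoint".
PROVIDER_DEFAULTS = {
    "OPENAI": {"base_url": None, "model_id": "gpt-4o"},
    "ANTHROPIC": {"base_url": None, "model_id": "claude-sonnet-4-20250514"},
    "GOOGLE": {"base_url": None, "model_id": "gemini-2.0-flash"},
    "DASHSCOPE": {
        "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
        "model_id": "qwen-plus",
    },
    "YI": {"base_url": "https://api.01.ai/v1", "model_id": "yi-large"},
}


def resolve_provider(prefix, env=None, ui_base_url=None, ui_model_id=None):
    """Resolve settings for one provider (assumed precedence: UI > env var > default)."""
    env = os.environ if env is None else env
    defaults = PROVIDER_DEFAULTS[prefix]
    # Empty strings from the UI or the environment fall through to the next source.
    base_url = ui_base_url or env.get(f"{prefix}_BASE_URL") or defaults["base_url"]
    model_id = ui_model_id or env.get(f"{prefix}_MODEL_ID") or defaults["model_id"]
    return {
        "api_key": env.get(f"{prefix}_API_KEY"),  # None if the secret is not set
        "base_url": base_url,
        "model_id": model_id,
    }
```

Under these assumptions, `resolve_provider("YI")` with no secrets set would fall back to the documented Yi endpoint and `yi-large`, while a value typed into the UI would take effect for that call only and leave the environment untouched.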