---
title: Multi Llm Compare
emoji: π
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
license: mit
short_description: Compare outputs of multiple LLMs with price estimations
---
# 🤖 Multi-LLM Comparison Tool

Compare responses from multiple Large Language Models (LLMs) side by side with custom parameters, timing metrics, and cost estimates.
## ✨ Features

- **Multi-Provider Support**: OpenAI, Anthropic, Google (Gemini), Cohere, and Mistral
- **Dynamic Model Selection**: Add multiple models with a simple + button interface
- **Custom Parameters**: Configure temperature, top_p, max_tokens, and provider-specific parameters
- **Performance Metrics**: Track response time for each model
- **Cost Estimation**: Calculate estimated cost per 1000 API calls
- **Parallel Execution**: All models are queried simultaneously for faster results
- **CSV Export**: Export comparison results for further analysis
- **Unique Model Selection**: Prevents duplicate model configurations
## 🚀 Getting Started

### Installation

```bash
pip install -r requirements.txt
```

### Run Locally

```bash
python app.py
```

The application will launch in your browser at http://localhost:7860.
## 📖 How to Use

### 1. Select Models

- Choose a **Provider** (OpenAI, Anthropic, Google, etc.)
- Select a **Model** from the dropdown
- Configure **Parameters**:
  - **Temperature** (0-2): Controls randomness
  - **Top P** (0-1): Nucleus sampling parameter
  - **Max Tokens**: Maximum response length
  - **Top K** (model-specific): Number of top tokens to consider
  - **Frequency/Presence Penalty** (model-specific): Controls token repetition
### 2. Add Models

- Click the **➕ Add Model** button to add the configured model to your comparison
- Repeat to add multiple models with different configurations
- Each configuration must be unique (the same model with different parameters is allowed)
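The uniqueness rule above can be sketched as comparing the full (provider, model, parameters) triple, so the same model is accepted again only when its parameters differ. A minimal sketch with hypothetical helper names (`config_key`, `add_model` are illustrative, not the app's actual functions):

```python
def config_key(provider: str, model: str, params: dict) -> tuple:
    """Hashable identity of a configuration: provider, model, and sorted params."""
    return (provider, model, tuple(sorted(params.items())))

def add_model(selected: list, provider: str, model: str, params: dict) -> list:
    """Append the configuration unless an identical one is already selected."""
    key = config_key(provider, model, params)
    if any(config_key(p, m, pr) == key for p, m, pr in selected):
        return selected  # exact duplicate: same model AND same parameters
    return selected + [(provider, model, params)]
```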
### 3. Enter API Keys

- Provide API keys for the providers you want to use
- Keys are required only for the providers you've selected
- API keys are not stored; they are used only for the current session
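Because keys are required only for selected providers, a validation step can collect the providers in use and flag any that lack a key. A sketch, with a hypothetical `missing_keys` helper (not necessarily how `app.py` implements it):

```python
def missing_keys(selected_models: list, session_keys: dict) -> list:
    """Return providers that appear in the selection but have no key entered."""
    needed = {provider for provider, _model, _params in selected_models}
    return sorted(p for p in needed if not session_keys.get(p, "").strip())
```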
### 4. Run Comparison

- Enter your prompt in the text area
- Click **🚀 Run Comparison** to query all selected models
- Results appear in a table showing:
  - Model name and parameters
  - Response time
  - Estimated cost per 1000 calls
  - Model output
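The parallel run with per-model timing and graceful error handling can be sketched with `asyncio.gather`. The `query_model` stub below stands in for the real provider SDK calls:

```python
import asyncio
import time

async def query_model(provider: str, model: str, prompt: str) -> str:
    """Placeholder for a real provider call; replace with the SDK request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"{model} response"

async def run_comparison(selected: list, prompt: str) -> list:
    """Query every selected model concurrently, timing each call."""
    async def timed(provider: str, model: str) -> dict:
        start = time.perf_counter()
        try:
            output = await query_model(provider, model, prompt)
        except Exception as exc:  # one failing model must not abort the rest
            output = f"Error: {exc}"
        return {
            "model": model,
            "seconds": round(time.perf_counter() - start, 3),
            "output": output,
        }
    # gather() preserves input order, so rows line up with the selection
    return await asyncio.gather(*(timed(p, m) for p, m, _params in selected))
```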
### 5. Export Results

- Click **📥 Export to CSV** to download the results
- The CSV file includes all comparison data with timestamps
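The timestamped export can be sketched with the standard-library `csv` module; the column names here are illustrative, not the app's exact schema:

```python
import csv
import io
from datetime import datetime, timezone

FIELDS = ["timestamp", "model", "seconds", "cost_per_1000", "output"]

def export_csv(rows: list) -> str:
    """Serialize comparison rows to CSV text, stamping each row with the export time."""
    stamp = datetime.now(timezone.utc).isoformat()
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        writer.writerow({"timestamp": stamp, **row})
    return buf.getvalue()
```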
## 💰 Pricing Information

The tool uses current pricing (per 1M tokens) for cost estimation:

### OpenAI

- GPT-4o: $2.50 input / $10.00 output
- GPT-4o-mini: $0.15 input / $0.60 output

### Anthropic

- Claude 3.5 Sonnet: $3.00 input / $15.00 output
- Claude 3.5 Haiku: $0.80 input / $4.00 output

### Google

- Gemini 1.5 Pro: $1.25 input / $5.00 output
- Gemini 1.5 Flash: $0.075 input / $0.30 output

**Note**: Prices are estimates and may vary. Check provider documentation for current rates.
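Converting the per-1M-token prices above into a cost-per-1000-calls figure is a small calculation; the sketch below hard-codes the table's prices (the model keys are illustrative):

```python
# Per-1M-token prices (input, output) in USD, taken from the table above.
PRICES = {
    "gpt-4o": (2.50, 10.00),
    "gpt-4o-mini": (0.15, 0.60),
    "claude-3-5-sonnet": (3.00, 15.00),
    "claude-3-5-haiku": (0.80, 4.00),
    "gemini-1.5-pro": (1.25, 5.00),
    "gemini-1.5-flash": (0.075, 0.30),
}

def cost_per_1000_calls(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of 1000 identical calls with the given token counts."""
    price_in, price_out = PRICES[model]
    per_call = (input_tokens / 1_000_000) * price_in + (output_tokens / 1_000_000) * price_out
    return round(per_call * 1000, 4)
```

For example, a GPT-4o call with 1000 input and 500 output tokens costs $0.0025 + $0.0050 = $0.0075, i.e. $7.50 per 1000 calls.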
## 🔑 API Keys

You need API keys from the providers you want to use:
- OpenAI: https://platform.openai.com/api-keys
- Anthropic: https://console.anthropic.com/
- Google: https://aistudio.google.com/app/api-keys
- Cohere: https://dashboard.cohere.com/api-keys
- Mistral: https://console.mistral.ai/
## 📊 Supported Parameters by Provider
| Parameter | OpenAI | Anthropic | Google | Cohere | Mistral |
|---|---|---|---|---|---|
| Temperature | ✅ | ✅ | ✅ | ✅ | ✅ |
| Top P | ✅ | ✅ | ✅ | ✅ | ✅ |
| Max Tokens | ✅ | ✅ | ✅ | ✅ | ✅ |
| Top K | ❌ | ✅ | ✅ | ✅ | ❌ |
| Frequency Penalty | ✅ | ❌ | ❌ | ✅ | ✅ |
| Presence Penalty | ✅ | ❌ | ❌ | ✅ | ✅ |
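A per-provider support table like the one above is typically used to filter the user's parameters down to what each SDK accepts. A sketch, assuming the support matrix shown (the `SUPPORTED` dict and `build_kwargs` helper are illustrative):

```python
# Which optional parameters each provider accepts (assumed from the table above).
SUPPORTED = {
    "OpenAI": {"temperature", "top_p", "max_tokens", "frequency_penalty", "presence_penalty"},
    "Anthropic": {"temperature", "top_p", "max_tokens", "top_k"},
    "Google": {"temperature", "top_p", "max_tokens", "top_k"},
    "Cohere": {"temperature", "top_p", "max_tokens", "top_k", "frequency_penalty", "presence_penalty"},
    "Mistral": {"temperature", "top_p", "max_tokens", "frequency_penalty", "presence_penalty"},
}

def build_kwargs(provider: str, params: dict) -> dict:
    """Drop parameters the provider does not accept before calling its SDK."""
    return {k: v for k, v in params.items() if k in SUPPORTED[provider]}
```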
## 🛠️ Technical Details

- **Framework**: Gradio 5.x (see `sdk_version` above)
- **Async Support**: All API calls are executed in parallel using asyncio
- **Error Handling**: Graceful error messages for API failures
- **Token Estimation**: Exact for OpenAI/Anthropic (taken from API usage data), approximated for other providers
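For providers whose responses do not include usage data, a common rough heuristic is about four characters per token for English text; this is only an approximation, and the helper name is illustrative:

```python
def estimate_tokens(text: str) -> int:
    """Rough fallback token count: ~4 characters per token for English text.

    Only used when the provider's API response carries no exact usage data.
    """
    return max(1, len(text) // 4)
```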
## 📄 License

MIT License - See LICENSE file for details.

## 🤝 Contributing

Contributions are welcome! Feel free to submit issues or pull requests.

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference