rdisipio
update deps
c9873aa
---
title: Multi-Model LLM Comparison Tool
emoji: πŸ€–
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "6.3.0"
app_file: app.py
pinned: false
---
# Multi-Model LLM Comparison Tool
UX-first interface to choose the best answer among different LLM models.
## Overview
An exploratory tool for comparing responses from multiple Large Language Models (LLMs) side-by-side. Ask a question, select models, and choose which response works best for you. Built with privacy in mind - all data collection is opt-in only.
## Features
- πŸ€– **Multi-Model Comparison**: Query multiple LLMs simultaneously via Groq API
- βš–οΈ **Side-by-Side View**: Compare responses in a clean, readable format
- πŸ“ **Adjustable Output Length**: Choose Short, Medium, or Full responses
- πŸ‘ **Preference Selection**: Mark which response works best for you
- πŸ”’ **Privacy-First**: Opt-in data collection only
- 🎨 **Clean UX**: Minimalist, user-friendly interface
## Supported Models
- GPT-4 (via Llama 3.3 70B on Groq)
- Claude 3 Opus (via Llama 3.1 70B on Groq)
- Gemini Pro (via Mixtral 8x7B)
- Llama 2 70B
- Mistral Large (via Mixtral 8x7B)
*Note: Model names are mapped to available Groq models for demonstration purposes*
## Setup
### Local Development
1. **Clone the repository**
```bash
git clone https://huggingface.co/spaces/rdisipio/multimodel-preference-tool
cd multimodel-preference-tool
```
2. **Install dependencies**
```bash
pip install -r requirements.txt
```
3. **Set up environment variables**
```bash
cp .env.example .env
# Edit .env and add your Groq API key
```
4. **Get your Groq API key**
- Sign up at [Groq Console](https://console.groq.com/)
- Generate an API key
- Add it to your `.env` file
5. **Run the app**
```bash
python app.py
```
### Hugging Face Spaces Deployment
This app is designed to run on Hugging Face Spaces with Gradio.
1. **Set up your Space**
- Create a new Gradio Space on Hugging Face
- Upload `app.py`, `requirements.txt`, and `README.md`
2. **Configure Secrets**
- In your Space settings, add a secret:
- Name: `GROQ_API_KEY`
- Value: Your Groq API key
3. **Deploy**
- Your Space will automatically build and deploy
## Usage
1. **Enter your question** in the text area
2. **Select output length**: Short, Medium, or Full
3. **Choose models** to compare (select at least one)
4. **Click "Compare answers"** to see responses
5. **Review responses** side-by-side
6. **Click "This one works for me"** to record your preference (optional)
## Privacy
- **No data collection by default**: Your questions and responses stay private
- **Opt-in preferences**: Only recorded when you click preference buttons
- **Transparent**: All data handling is visible in the open-source code
- **Local-first**: Run locally for complete privacy
## Project Background
This tool is part of an exploration in open human feedback collection for LLM evaluation. The goal is to create a lightweight, UX-first interface that makes it easy for users to compare model outputs and provide feedback.
For more details, see the project documentation in the repository.
## Built By
Human Feedback Foundation
Linux Foundation AI & Data member
## License
See [LICENSE](LICENSE) file for details.