---
license: mit
tags:
- tutorial
- crazyrouter
- model-comparison
- benchmark
- llm
- evaluation
language:
- en
- zh
---

# ⚖️ AI Model Comparison with Crazyrouter

> Compare GPT-4o vs Claude vs Gemini vs DeepSeek: same prompt, same API, side by side.

One of the biggest advantages of [Crazyrouter](https://crazyrouter.com/?utm_source=huggingface&utm_medium=tutorial&utm_campaign=dev_community) is the ability to test multiple models instantly. No separate accounts, no different SDKs. Just change the model name.

---

## Quick Comparison Script

```python
from openai import OpenAI
import time

client = OpenAI(
    base_url="https://crazyrouter.com/v1",
    api_key="sk-your-crazyrouter-key"
)

MODELS = [
    "gpt-4o",
    "gpt-4o-mini",
    "claude-sonnet-4-20250514",
    "claude-haiku-3.5",
    "gemini-2.0-flash",
    "deepseek-chat",
    "deepseek-reasoner",
]

PROMPT = "Explain the difference between TCP and UDP in exactly 3 sentences."

print(f"Prompt: {PROMPT}\n")
print("=" * 60)

for model in MODELS:
    try:
        start = time.time()
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": PROMPT}],
            max_tokens=200
        )
        elapsed = time.time() - start
        content = response.choices[0].message.content
        tokens = response.usage.total_tokens

        print(f"\n🤖 {model}")
        print(f"⏱️ {elapsed:.2f}s | 📊 {tokens} tokens")
        print(f"💬 {content}")
        print("-" * 60)
    except Exception as e:
        print(f"\n❌ {model}: {e}")
        print("-" * 60)
```
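If you also capture each result while looping, you can rank the models afterwards. A quick sketch, where the `rows` values below are illustrative numbers rather than real measurements:

```python
# Hypothetical captured results: (model, latency_seconds, total_tokens).
rows = [
    ("gpt-4o", 2.10, 180),
    ("gemini-2.0-flash", 0.65, 150),
    ("deepseek-chat", 1.30, 160),
]

# Sort by latency (second field) to list the fastest model first.
for model, latency, tokens in sorted(rows, key=lambda r: r[1]):
    print(f"{model:20s} {latency:.2f}s | {tokens} tokens")
```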

---

## Benchmark: Speed Test

```python
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://crazyrouter.com/v1",
    api_key="sk-your-crazyrouter-key"
)

def benchmark(model, prompt, runs=3):
    times = []
    for _ in range(runs):
        start = time.time()
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            max_tokens=100
        )
        times.append(time.time() - start)
    return sum(times) / len(times)

models = ["gpt-4o-mini", "claude-haiku-3.5", "gemini-2.0-flash", "deepseek-chat"]
prompt = "What is 2+2? Reply with just the number."

print("Speed Benchmark (avg of 3 runs)")
print("=" * 40)
for m in models:
    avg = benchmark(m, prompt)
    print(f"{m:30s} {avg:.2f}s")
```
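The mean alone can be skewed by a single slow run (cold start, retry, network blip). A small standard-library sketch that also reports median and standard deviation, assuming you return the raw `times` list from `benchmark` instead of only the average:

```python
import statistics

def summarize(times):
    """Return (mean, median, stdev) for a list of latencies in seconds."""
    mean = sum(times) / len(times)
    median = statistics.median(times)
    # Sample stdev needs at least two data points.
    stdev = statistics.stdev(times) if len(times) > 1 else 0.0
    return mean, median, stdev

mean, median, stdev = summarize([0.8, 1.1, 0.9])
print(f"mean={mean:.2f}s median={median:.2f}s stdev={stdev:.2f}s")
```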

---

## Coding Comparison

```python
# Reuses the `client` configured in the Quick Comparison Script above.
CODING_PROMPT = """Write a Python function that:
1. Takes a list of integers
2. Returns the longest increasing subsequence
3. Include type hints and a docstring
"""

CODING_MODELS = [
    "gpt-4o",
    "claude-sonnet-4-20250514",
    "deepseek-chat",
    "gemini-2.0-flash",
]

for model in CODING_MODELS:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": CODING_PROMPT}],
        max_tokens=500
    )
    print(f"\n{'='*60}")
    print(f"🤖 {model}")
    print(f"{'='*60}")
    print(response.choices[0].message.content)
```
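Models usually wrap their code in a fenced block with commentary around it. A small helper of our own (not part of any SDK) to pull out just the first fenced block for saving or testing:

```python
import re

def extract_code(reply: str) -> str:
    """Return the first fenced code block in a reply, or the whole reply if none."""
    match = re.search(r"```(?:python)?\n(.*?)```", reply, re.DOTALL)
    return match.group(1).strip() if match else reply.strip()

# Illustrative model reply with surrounding chatter.
sample = "Sure!\n```python\ndef inc(x: int) -> int:\n    return x + 1\n```\nHope that helps."
print(extract_code(sample))
```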

---

## Reasoning Comparison

Test models that support chain-of-thought reasoning:

```python
# Reuses the `client` configured above.
REASONING_PROMPT = """A farmer has 17 sheep. All but 9 die. How many sheep are left?
Think step by step."""

REASONING_MODELS = [
    "gpt-4o",
    "o3-mini",
    "deepseek-reasoner",
    "claude-sonnet-4-20250514",
]

for model in REASONING_MODELS:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": REASONING_PROMPT}],
        max_tokens=300
    )
    print(f"\n🤖 {model}: {response.choices[0].message.content[:200]}")
```
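The riddle's answer is 9 ("all but 9 die"), which makes it easy to score replies automatically. A deliberately crude grader sketch; real evaluation harnesses are far more robust, but a word-boundary match on the last line catches the common case where models state their final answer at the end:

```python
import re

def answer_is_correct(reply: str, expected: str = "9") -> bool:
    """Crude check: does the last non-empty line contain the expected answer?"""
    last_line = reply.strip().splitlines()[-1]
    return bool(re.search(rf"\b{re.escape(expected)}\b", last_line))

print(answer_is_correct("All but 9 die, so the rest survive.\nTherefore 9 sheep are left."))
print(answer_is_correct("17 - 9 = 8\nSo 8 sheep are left."))
```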

---

## Cost Comparison

```python
# Approximate pricing per 1M tokens (input/output); check current rates before relying on these.
PRICING = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
    "claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
    "claude-haiku-3.5": {"input": 0.80, "output": 4.00},
    "gemini-2.0-flash": {"input": 0.10, "output": 0.40},
    "deepseek-chat": {"input": 0.14, "output": 0.28},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimated USD cost for one request; unknown models cost 0."""
    p = PRICING.get(model, {"input": 0, "output": 0})
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: 1000 requests, avg 500 input + 200 output tokens each
requests = 1000
input_tok = 500
output_tok = 200

print(f"Cost estimate for {requests} requests ({input_tok} in / {output_tok} out tokens each):\n")
for model in PRICING:
    cost = requests * estimate_cost(model, input_tok, output_tok)
    print(f"  {model:30s} ${cost:.4f}")
```
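Given a pricing table like the one above, you can also pick the cheapest model for a traffic profile programmatically. A sketch using a two-entry sample table (rates copied from `PRICING` above):

```python
def cheapest(pricing, input_tokens, output_tokens):
    """Return the model name with the lowest estimated cost for this token mix."""
    def cost(p):
        return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000
    return min(pricing, key=lambda m: cost(pricing[m]))

sample = {
    "gemini-2.0-flash": {"input": 0.10, "output": 0.40},
    "deepseek-chat": {"input": 0.14, "output": 0.28},
}
print(cheapest(sample, 500, 200))   # output-heavy mix favors deepseek-chat
print(cheapest(sample, 1000, 0))    # pure-input traffic favors gemini-2.0-flash
```

The winner flips with the input/output ratio, which is why a one-number "cheapest model" claim is rarely meaningful.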

---

## When to Use Which Model

| Use Case | Recommended Model | Why |
|----------|-------------------|-----|
| General chat | `gpt-4o-mini` | Fast, cheap, good quality |
| Complex analysis | `gpt-4o` or `claude-sonnet-4-20250514` | Best reasoning |
| Coding | `deepseek-chat` or `claude-sonnet-4-20250514` | Strong code generation |
| Long documents | `gemini-2.0-flash` | 1M token context |
| Math/Logic | `deepseek-reasoner` or `o3-mini` | Chain-of-thought |
| Budget tasks | `deepseek-chat` | $0.14/1M input |
| Speed critical | `gemini-2.0-flash` | Fastest response |
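The table can be mirrored as a tiny routing helper in application code. The use-case keys below are our own labels, not anything Crazyrouter defines:

```python
# Use-case -> model mapping, following the recommendation table above.
RECOMMENDED = {
    "chat": "gpt-4o-mini",
    "analysis": "gpt-4o",
    "coding": "deepseek-chat",
    "long-documents": "gemini-2.0-flash",
    "math": "deepseek-reasoner",
    "budget": "deepseek-chat",
    "speed": "gemini-2.0-flash",
}

def pick_model(use_case: str) -> str:
    """Map a use-case label to a model, defaulting to the cheap all-rounder."""
    return RECOMMENDED.get(use_case, "gpt-4o-mini")

print(pick_model("coding"))
print(pick_model("something-unlisted"))
```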

---

## Try It Live

🚀 [Crazyrouter Demo on Hugging Face](https://huggingface.co/spaces/xujfcn/Crazyrouter-Demo) - switch models in real time

---

## Links

- 🌐 [Crazyrouter](https://crazyrouter.com/?utm_source=huggingface&utm_medium=tutorial&utm_campaign=dev_community)
- 📖 [Getting Started](https://huggingface.co/xujfcn/Crazyrouter-Getting-Started)
- 🔗 [LangChain Guide](https://huggingface.co/xujfcn/Crazyrouter-LangChain-Guide)
- 💰 [Pricing](https://huggingface.co/spaces/xujfcn/Crazyrouter-Pricing)
- 💬 [Telegram](https://t.me/crazyrouter)
- 🐦 [Twitter @metaviiii](https://twitter.com/metaviiii)