---
license: mit
tags:
- tutorial
- crazyrouter
- model-comparison
- benchmark
- llm
- evaluation
language:
- en
- zh
---

# ⚖️ AI Model Comparison with Crazyrouter

> Compare GPT-4o vs Claude vs Gemini vs DeepSeek — same prompt, same API, side by side.

One of the biggest advantages of [Crazyrouter](https://crazyrouter.com) is the ability to test multiple models instantly. No separate accounts, no different SDKs. Just change the model name.

---

## Quick Comparison Script

```python
from openai import OpenAI
import time

client = OpenAI(
    base_url="https://crazyrouter.com/v1",
    api_key="sk-your-crazyrouter-key"
)

MODELS = [
    "gpt-4o",
    "gpt-4o-mini",
    "claude-sonnet-4-20250514",
    "claude-haiku-3.5",
    "gemini-2.0-flash",
    "deepseek-chat",
    "deepseek-reasoner",
]

PROMPT = "Explain the difference between TCP and UDP in exactly 3 sentences."

print(f"Prompt: {PROMPT}\n")
print("=" * 60)

for model in MODELS:
    try:
        start = time.time()
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": PROMPT}],
            max_tokens=200
        )
        elapsed = time.time() - start
        content = response.choices[0].message.content
        tokens = response.usage.total_tokens

        print(f"\n🤖 {model}")
        print(f"⏱️ {elapsed:.2f}s | 📊 {tokens} tokens")
        print(f"💬 {content}")
        print("-" * 60)
    except Exception as e:
        print(f"\n❌ {model}: {e}")
        print("-" * 60)
```
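If you would rather see the results side by side than scroll through console output, a small formatter can tabulate them. This is a sketch of my own, not part of the script above; the sample numbers are made up:

```python
def format_results(rows):
    """Render (model, seconds, tokens) tuples as a fixed-width comparison table."""
    lines = [f"{'Model':30s} {'Time':>7s} {'Tokens':>7s}"]
    for model, secs, tokens in rows:
        lines.append(f"{model:30s} {secs:6.2f}s {tokens:7d}")
    return "\n".join(lines)

# Feed it tuples collected inside the loop above, e.g.:
print(format_results([("gpt-4o", 1.92, 143), ("deepseek-chat", 2.40, 151)]))
```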

---

## Benchmark: Speed Test

```python
import time
from openai import OpenAI

client = OpenAI(
    base_url="https://crazyrouter.com/v1",
    api_key="sk-your-crazyrouter-key"
)

def benchmark(model, prompt, runs=3):
    times = []
    for _ in range(runs):
        start = time.time()
        client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
            max_tokens=100
        )
        times.append(time.time() - start)
    avg = sum(times) / len(times)
    return avg

models = ["gpt-4o-mini", "claude-haiku-3.5", "gemini-2.0-flash", "deepseek-chat"]
prompt = "What is 2+2? Reply with just the number."

print("Speed Benchmark (avg of 3 runs)")
print("=" * 40)
for m in models:
    avg = benchmark(m, prompt)
    print(f"{m:30s} {avg:.2f}s")
```
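A mean over three runs can be skewed by a single cold start or network hiccup. A median-based variant is more robust; this is a sketch that works with any zero-argument callable, so it is testable without hitting the API:

```python
import statistics
import time

def time_call(fn, runs: int = 3) -> float:
    """Median wall-clock seconds over several runs.

    The median is less sensitive to one slow outlier (cold start,
    network jitter) than the mean used in the script above.
    """
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Wrap the API call in a lambda to benchmark a model, e.g.:
# time_call(lambda: client.chat.completions.create(model=m, messages=msgs, max_tokens=100))
print(f"{time_call(lambda: time.sleep(0.01)):.3f}s")
```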

---

## Coding Comparison

```python
# Reuses the `client` configured in the Quick Comparison Script above
CODING_PROMPT = """Write a Python function that:
1. Takes a list of integers
2. Returns the longest increasing subsequence
3. Include type hints and a docstring
"""

CODING_MODELS = [
    "gpt-4o",
    "claude-sonnet-4-20250514",
    "deepseek-chat",
    "gemini-2.0-flash",
]

for model in CODING_MODELS:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": CODING_PROMPT}],
        max_tokens=500
    )
    print(f"\n{'='*60}")
    print(f"🤖 {model}")
    print(f"{'='*60}")
    print(response.choices[0].message.content)
```
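When comparing the generated functions, it helps to have a known-good reference to check them against. Here is a minimal sketch of the standard O(n log n) approach; note it returns only the length, while the prompt asks for the subsequence itself:

```python
from bisect import bisect_left

def lis_length(nums: list[int]) -> int:
    """Length of the longest strictly increasing subsequence in O(n log n).

    tails[i] holds the smallest possible tail value of an increasing
    subsequence of length i + 1 seen so far.
    """
    tails: list[int] = []
    for x in nums:
        i = bisect_left(tails, x)
        if i == len(tails):
            tails.append(x)
        else:
            tails[i] = x
    return len(tails)

print(lis_length([10, 9, 2, 5, 3, 7, 101, 18]))  # 4  (e.g. 2, 3, 7, 18)
```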

---

## Reasoning Comparison

Test models that support chain-of-thought reasoning:

```python
# Reuses the `client` configured in the Quick Comparison Script above
REASONING_PROMPT = """A farmer has 17 sheep. All but 9 die. How many sheep are left?
Think step by step."""

REASONING_MODELS = [
    "gpt-4o",
    "o3-mini",
    "deepseek-reasoner",
    "claude-sonnet-4-20250514",
]

for model in REASONING_MODELS:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": REASONING_PROMPT}],
        max_tokens=300
    )
    print(f"\n🤖 {model}: {response.choices[0].message.content[:200]}")
```
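The riddle's answer is 9: "all but 9 die" means 9 survive. To score the replies automatically, a crude heuristic (my own, and easily fooled by replies that end on an unrelated number) is to take the last number mentioned as the model's final answer:

```python
import re

def final_answer_is_nine(reply: str) -> bool:
    """Heuristic scorer: treat the last number in the reply as the final answer."""
    numbers = re.findall(r"\d+", reply)
    return bool(numbers) and numbers[-1] == "9"

print(final_answer_is_nine("All but 9 die, so 9 sheep are left."))  # True
print(final_answer_is_nine("17 - 9 = 8 sheep are left."))           # False
```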

---

## Cost Comparison

```python
# Approximate pricing per 1M tokens (input/output)
PRICING = {
    "gpt-4o": {"input": 2.50, "output": 10.00},
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
    "claude-sonnet-4-20250514": {"input": 3.00, "output": 15.00},
    "claude-haiku-3.5": {"input": 0.80, "output": 4.00},
    "gemini-2.0-flash": {"input": 0.10, "output": 0.40},
    "deepseek-chat": {"input": 0.14, "output": 0.28},
}

def estimate_cost(model, input_tokens, output_tokens):
    p = PRICING.get(model, {"input": 0, "output": 0})
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: 1000 requests, avg 500 input + 200 output tokens each
requests = 1000
input_tok = 500
output_tok = 200

print(f"Cost estimate for {requests} requests ({input_tok} in / {output_tok} out tokens each):\n")
for model in PRICING:
    cost = requests * estimate_cost(model, input_tok, output_tok)
    print(f"  {model:30s} ${cost:.4f}")
```
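As a sanity check on the estimator, the `gpt-4o-mini` row can be worked by hand with the prices listed above:

```python
# 500 input tokens at $0.15/1M plus 200 output tokens at $0.60/1M:
# (500 * 0.15 + 200 * 0.60) / 1_000_000 = (75 + 120) / 1_000_000 = $0.000195 per request
per_request = (500 * 0.15 + 200 * 0.60) / 1_000_000
print(f"${1000 * per_request:.4f} for 1000 requests")  # $0.1950
```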

---

## When to Use Which Model

| Use Case | Recommended Model | Why |
|----------|------------------|-----|
| General chat | `gpt-4o-mini` | Fast, cheap, good quality |
| Complex analysis | `gpt-4o` or `claude-sonnet-4-20250514` | Best reasoning |
| Coding | `deepseek-chat` or `claude-sonnet-4-20250514` | Strong code generation |
| Long documents | `gemini-2.0-flash` | 1M token context |
| Math/Logic | `deepseek-reasoner` or `o3-mini` | Chain-of-thought |
| Budget tasks | `deepseek-chat` | $0.14/1M input |
| Speed critical | `gemini-2.0-flash` | Fastest response |

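The table can be encoded as a small routing helper. This is a hypothetical sketch; the category names and the `gpt-4o-mini` fallback are my own choices, not part of any Crazyrouter API:

```python
# Hypothetical mapping from task category to the table's recommended model
RECOMMENDED = {
    "chat": "gpt-4o-mini",
    "analysis": "gpt-4o",
    "coding": "deepseek-chat",
    "long-documents": "gemini-2.0-flash",
    "math": "deepseek-reasoner",
    "budget": "deepseek-chat",
    "speed": "gemini-2.0-flash",
}

def pick_model(task: str) -> str:
    """Map a task category to a recommended model; fall back to gpt-4o-mini."""
    return RECOMMENDED.get(task, "gpt-4o-mini")

print(pick_model("coding"))  # deepseek-chat
print(pick_model("poetry"))  # gpt-4o-mini (fallback)
```

Because everything goes through one API, the returned string can be passed straight into `client.chat.completions.create(model=...)`.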
---

## Try It Live

👉 [Crazyrouter Demo on Hugging Face](https://huggingface.co/spaces/xujfcn/Crazyrouter-Demo) — switch models in real-time

---

## Links

- 🌐 [Crazyrouter](https://crazyrouter.com)
- 📖 [Getting Started](https://huggingface.co/xujfcn/Crazyrouter-Getting-Started)
- 🔗 [LangChain Guide](https://huggingface.co/xujfcn/Crazyrouter-LangChain-Guide)
- 💰 [Pricing](https://huggingface.co/spaces/xujfcn/Crazyrouter-Pricing)
- 💬 [Telegram](https://t.me/crazyrouter)
- 🐦 [Twitter @metaviiii](https://twitter.com/metaviiii)