# Groq API Setup Guide for HuggingClaw

## ⚡ Why Groq?

**Groq is the FASTEST inference engine available** - over 500 tokens/second!

| Feature | Groq | Others |
|---------|------|--------|
| Speed | ⚡⚡⚡⚡⚡ 500+ t/s | ⚡⚡ 50-100 t/s |
| Latency | <100ms | 500ms-2s |
| Free Tier | ✅ Yes, generous | ⚠️ Limited |
| Models | Llama 3/4, Qwen, Kimi, GPT-OSS | Varies |

---

## ⚠️ SECURITY WARNING

**Never share your API key publicly!** If you've shared it:

1. Go to https://console.groq.com/api-keys
2. Delete the compromised key
3. Create a new one
4. Store it securely (password manager, HF Spaces secrets)

---

## Quick Start

### Step 1: Get Your Groq API Key

1. Go to **https://console.groq.com**
2. Sign in or create an account (free)
3. Navigate to **API Keys** in the left sidebar
4. Click **Create API Key**
5. Copy your key (it starts with `gsk_...`)
6. **Keep it secret!**

### Step 2: Configure HuggingFace Spaces

In your Space's **Settings → Repository secrets**, add:

```bash
GROQ_API_KEY=gsk_your-actual-api-key-here
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```

### Step 3: Deploy

Push your changes or redeploy the Space. Groq will be configured automatically.

### Step 4: Use

1. Open the Space URL
2. Enter the gateway token (default: `huggingclaw`)
3. Select "Llama 3.3 70B (Versatile)" from the model dropdown
4. Experience blazing-fast responses! ⚡
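Before deploying, the secret from Step 2 can be sanity-checked locally. A minimal Python sketch (the `looks_like_groq_key` helper is illustrative, not part of HuggingClaw; it only checks the documented `gsk_` prefix):

```python
import os
import re

def looks_like_groq_key(key: str) -> bool:
    # Groq keys start with "gsk_"; the rest of the pattern is a loose guess.
    return bool(re.fullmatch(r"gsk_\S+", key or ""))

# Read the secret the same way the Space would (from the environment).
key = os.environ.get("GROQ_API_KEY", "")
if not looks_like_groq_key(key):
    print("GROQ_API_KEY is missing or malformed - check your Space secrets")
```

This catches the most common mistakes (pasting another provider's key, or copying stray whitespace) before a redeploy.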
---

## Available Models (Verified 2025)

### Chat Models

| Model ID | Name | Context | Speed | Best For |
|----------|------|---------|-------|----------|
| `llama-3.3-70b-versatile` | Llama 3.3 70B | 128K | ⚡⚡⚡⚡ | **Best overall** |
| `llama-3.1-8b-instant` | Llama 3.1 8B | 128K | ⚡⚡⚡⚡⚡ | Ultra-fast |
| `meta-llama/llama-4-maverick-17b-128e-instruct` | Llama 4 Maverick | 128K | ⚡⚡⚡⚡ | Latest Llama 4 |
| `meta-llama/llama-4-scout-17b-16e-instruct` | Llama 4 Scout | 128K | ⚡⚡⚡⚡ | Latest Llama 4 |
| `qwen/qwen3-32b` | Qwen3 32B | 128K | ⚡⚡⚡ | Alibaba model |
| `moonshotai/kimi-k2-instruct` | Kimi K2 | 128K | ⚡⚡⚡ | Moonshot AI |
| `openai/gpt-oss-20b` | GPT-OSS 20B | 128K | ⚡⚡⚡ | OpenAI open-source |
| `allam-2-7b` | Allam-2 7B | 4K | ⚡⚡⚡⚡ | Arabic/English |

### Audio Models

| Model ID | Name | Purpose |
|----------|------|---------|
| `whisper-large-v3-turbo` | Whisper Large V3 Turbo | Speech-to-text |
| `whisper-large-v3` | Whisper Large V3 | Speech-to-text |

### Safety Models

| Model ID | Name | Purpose |
|----------|------|---------|
| `meta-llama/llama-guard-4-12b` | Llama Guard 4 | Content moderation |
| `meta-llama/llama-prompt-guard-2-86m` | Llama Prompt Guard 2 | Prompt injection detection |

---

## Configuration Options

### Basic Setup (Recommended)

```bash
GROQ_API_KEY=gsk_xxxxx
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```

### Multiple Providers

Use Groq as primary with fallbacks:

```bash
# Groq (primary - fastest)
GROQ_API_KEY=gsk_xxxxx

# OpenRouter (fallback - more models)
OPENROUTER_API_KEY=sk-or-v1-xxxxx

# Local Ollama (free backup)
LOCAL_MODEL_ENABLED=true
LOCAL_MODEL_NAME=neuralnexuslab/hacking
```

Priority order:

1. **Groq** (if `GROQ_API_KEY` set) ← Fastest!
2. xAI (if `XAI_API_KEY` set)
3. OpenAI (if `OPENAI_API_KEY` set)
4. OpenRouter (if `OPENROUTER_API_KEY` set)
5.
Local (if `LOCAL_MODEL_ENABLED=true`)

---

## Model Recommendations

### Best for General Use

```bash
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```

- Excellent quality
- 128K context window
- Fast (500+ tokens/s)

### Fastest Responses

```bash
OPENCLAW_DEFAULT_MODEL=groq/llama-3.1-8b-instant
```

- Instant responses
- Good for simple Q&A
- Highest rate limits

### Latest & Greatest

```bash
OPENCLAW_DEFAULT_MODEL=meta-llama/llama-4-maverick-17b-128e-instruct
```

- Llama 4 architecture
- Best reasoning
- Cutting-edge performance

### Long Documents

```bash
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```

- 128K context window
- Can process entire books
- Excellent summarization

---

## Pricing

### Free Tier (Generous!)

| Model | Rate Limit |
|-------|-----------|
| Llama 3.1 8B | ~30 req/min |
| Llama 3.3 70B | ~30 req/min |
| Llama 4 Maverick | ~30 req/min |
| Llama 4 Scout | ~30 req/min |
| Qwen3 32B | ~30 req/min |
| Kimi K2 | ~30 req/min |

**Perfect for personal bots!** Most users never need the paid tier.

### Paid Plans

Check https://groq.com/pricing for enterprise pricing.

---

## Performance Comparison

| Provider | Tokens/sec | Latency | Cost |
|----------|-----------|---------|------|
| **Groq Llama 3.3** | 500+ | <100ms | Free |
| Groq Llama 4 | 400+ | <150ms | Free |
| xAI Grok | 100-200 | 200-500ms | $ |
| OpenAI GPT-4 | 50-100 | 500ms-1s | $$$ |
| Local Ollama | 20-50 | 100-200ms | Free |

---

## Troubleshooting

### "Invalid API key"

1. Verify the key starts with `gsk_`
2. Check for stray spaces or newlines
3. Check the key at https://console.groq.com/api-keys
4.
**Regenerate if compromised**

### "Rate limit exceeded"

- Free tier: ~30 requests/minute
- Use `llama-3.1-8b-instant` for higher limits
- Add delays between requests
- Consider a paid plan for heavy usage

### "Model not found"

- Use the exact model ID from the table above
- Check that the model is active in the Groq console
- Some models may be region-restricted

### Slow Responses

- Groq should respond in <100ms
- Check your internet connection
- HF Spaces region matters (US = fastest)

---

## Example: WhatsApp Bot with Groq

```bash
# HF Spaces secrets
GROQ_API_KEY=gsk_xxxxx
HF_TOKEN=hf_xxxxx
AUTO_CREATE_DATASET=true

# WhatsApp (configure in Control UI)
WHATSAPP_PHONE=+1234567890
WHATSAPP_CODE=ABC123
```

Result: an **ultra-fast** WhatsApp AI bot! ⚡

---

## API Reference

### Test Your Key

```bash
curl https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer gsk_xxxxx"
```

### Chat Completion

```bash
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gsk_xxxxx" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```

---

## Best Practices

### 1. Choose the Right Model

- **Chat**: `llama-3.3-70b-versatile`
- **Fast Q&A**: `llama-3.1-8b-instant`
- **Complex tasks**: `meta-llama/llama-4-maverick-17b-128e-instruct`
- **Long docs**: `llama-3.3-70b-versatile` (128K context)

### 2. Monitor Usage

Check https://console.groq.com/usage

### 3. Secure Your Key

- Never commit it to git
- Use HF Spaces secrets
- Rotate keys periodically

### 4. Set Up Alerts

Configure usage alerts in the Groq console.

---

## Next Steps

1. ✅ **Get an API key** from https://console.groq.com
2. ✅ **Set `GROQ_API_KEY`** in HF Spaces secrets
3. ✅ **Deploy** and test in the Control UI
4. ✅ **Configure** WhatsApp/Telegram channels
5. 🎉 Enjoy **sub-second** AI responses!

---

## Speed Test

After setup, test Groq's speed:

```
1. Open the Control UI
2. Select "Llama 3.3 70B (Versatile)"
3. Send: "Write a 100-word story about a robot"
4.
Watch it generate in <0.5 seconds! ⚡⚡⚡
```

---

## Support

- **Groq Docs**: https://console.groq.com/docs
- **API Status**: https://status.groq.com
- **HuggingClaw**: https://github.com/openclaw/openclaw/issues

---

## Available via OpenAI-Compatible API

All Groq models work via the OpenAI-compatible endpoint:

```bash
OPENAI_API_KEY=gsk_xxxxx
OPENAI_BASE_URL=https://api.groq.com/openai/v1
```

This allows using Groq with any OpenAI-compatible client!
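As a sketch of what such a client does under the hood, the chat-completion curl call from the API Reference can be reproduced with only the Python standard library (the `chat_request` helper and the placeholder key are illustrative; real code would typically point an OpenAI SDK at the two variables above):

```python
import json
import os
import urllib.request

BASE_URL = "https://api.groq.com/openai/v1"  # Groq's OpenAI-compatible endpoint

def chat_request(api_key: str, model: str, user_message: str) -> urllib.request.Request:
    # Same JSON shape as the curl example in the API Reference section.
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
    )

req = chat_request(os.environ.get("GROQ_API_KEY", "gsk_xxxxx"),
                   "llama-3.3-70b-versatile", "Hello!")
if os.environ.get("GROQ_API_KEY"):  # only send when a real key is configured
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the endpoint speaks the OpenAI wire format, only the base URL and the bearer token differ from a stock OpenAI integration.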