# Groq API Setup Guide for HuggingClaw

## ⚑ Why Groq?

**Groq is one of the fastest inference providers available** - serving open models at 500+ tokens/second on its custom LPU hardware!

| Feature | Groq | Others |
|---------|------|--------|
| Speed | ⚑⚑⚑⚑⚑ 500+ t/s | ⚑⚑ 50-100 t/s |
| Latency | <100ms | 500ms-2s |
| Free Tier | βœ… Yes, generous | ⚠️ Limited |
| Models | Llama 3/4, Qwen, Kimi, GPT-OSS | Varies |

---

## ⚠️ SECURITY WARNING

**Never share your API key publicly!** If you've shared it:

1. Go to https://console.groq.com/api-keys
2. Delete the compromised key
3. Create a new one
4. Store it securely (password manager, HF Spaces secrets)

---

## Quick Start

### Step 1: Get Your Groq API Key

1. Go to **https://console.groq.com**
2. Sign in or create account (free)
3. Navigate to **API Keys** in left sidebar
4. Click **Create API Key**
5. Copy your key (starts with `gsk_...`)
6. **Keep it secret!**

### Step 2: Configure HuggingFace Spaces

In your Space **Settings β†’ Repository secrets**, add:

```bash
GROQ_API_KEY=gsk_your-actual-api-key-here
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```

### Step 3: Deploy

Push changes or redeploy the Space. Groq will be automatically configured.

### Step 4: Use

1. Open Space URL
2. Enter gateway token (default: `huggingclaw`)
3. Select "Llama 3.3 70B (Versatile)" from model dropdown
4. Experience blazing fast responses! ⚑

---

## Available Models (Verified 2025)

### Chat Models

| Model ID | Name | Context | Speed | Best For |
|----------|------|---------|-------|----------|
| `llama-3.3-70b-versatile` | Llama 3.3 70B | 128K | ⚑⚑⚑⚑ | **Best overall** |
| `llama-3.1-8b-instant` | Llama 3.1 8B | 128K | ⚑⚑⚑⚑⚑ | Ultra-fast |
| `meta-llama/llama-4-maverick-17b-128e-instruct` | Llama 4 Maverick | 128K | ⚑⚑⚑⚑ | Latest Llama 4 |
| `meta-llama/llama-4-scout-17b-16e-instruct` | Llama 4 Scout | 128K | ⚑⚑⚑⚑ | Latest Llama 4 |
| `qwen/qwen3-32b` | Qwen3 32B | 128K | ⚑⚑⚑ | Alibaba model |
| `moonshotai/kimi-k2-instruct` | Kimi K2 | 128K | ⚑⚑⚑ | Moonshot AI |
| `openai/gpt-oss-20b` | GPT-OSS 20B | 128K | ⚑⚑⚑ | OpenAI open-source |
| `allam-2-7b` | Allam-2 7B | 4K | ⚑⚑⚑⚑ | Arabic/English |

### Audio Models

| Model ID | Name | Purpose |
|----------|------|---------|
| `whisper-large-v3-turbo` | Whisper Large V3 Turbo | Speech-to-text |
| `whisper-large-v3` | Whisper Large V3 | Speech-to-text |

### Safety Models

| Model ID | Name | Purpose |
|----------|------|---------|
| `meta-llama/llama-guard-4-12b` | Llama Guard 4 | Content moderation |
| `meta-llama/llama-prompt-guard-2-86m` | Llama Prompt Guard 2 | Prompt injection detection |
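
You can confirm which of these models your key can reach by querying the `/models` endpoint. The sketch below uses only the standard library and assumes the OpenAI-style response shape (`{"data": [{"id": ...}]}`); the helper and its "skip audio/safety" filter are illustrative, not part of HuggingClaw:

```python
import json
import os
import urllib.request

def chat_model_ids(models_response):
    """Pull model IDs out of a /models response, skipping audio/safety models."""
    skip = ("whisper", "guard")
    return sorted(
        m["id"] for m in models_response["data"]
        if not any(s in m["id"] for s in skip)
    )

if __name__ == "__main__" and "GROQ_API_KEY" in os.environ:
    req = urllib.request.Request(
        "https://api.groq.com/openai/v1/models",
        headers={"Authorization": f"Bearer {os.environ['GROQ_API_KEY']}"},
    )
    with urllib.request.urlopen(req) as resp:
        print(chat_model_ids(json.load(resp)))
```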

---

## Configuration Options

### Basic Setup (Recommended)

```bash
GROQ_API_KEY=gsk_xxxxx
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```

### Multiple Providers

Use Groq as primary with fallbacks:

```bash
# Groq (primary - fastest)
GROQ_API_KEY=gsk_xxxxx

# OpenRouter (fallback - more models)
OPENROUTER_API_KEY=sk-or-v1-xxxxx

# Local Ollama (free backup)
LOCAL_MODEL_ENABLED=true
LOCAL_MODEL_NAME=neuralnexuslab/hacking
```

Priority order:
1. **Groq** (if `GROQ_API_KEY` set) ← Fastest!
2. xAI (if `XAI_API_KEY` set)
3. OpenAI (if `OPENAI_API_KEY` set)
4. OpenRouter (if `OPENROUTER_API_KEY` set)
5. Local (if `LOCAL_MODEL_ENABLED=true`)
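
A minimal sketch of that priority logic, assuming the gateway simply picks the first provider whose key is set in the environment (the variable names come from the list above; the function itself is illustrative, not HuggingClaw's actual implementation):

```python
import os

# Priority order mirrors the list above: Groq first, local last.
PROVIDER_KEYS = [
    ("groq", "GROQ_API_KEY"),
    ("xai", "XAI_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]

def pick_provider(env=os.environ):
    """Return the first configured provider, falling back to local."""
    for name, key in PROVIDER_KEYS:
        if env.get(key):
            return name
    if env.get("LOCAL_MODEL_ENABLED") == "true":
        return "local"
    return None
```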

---

## Model Recommendations

### Best for General Use
```bash
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```
- Excellent quality
- 128K context window
- Fast (500+ tokens/s)

### Fastest Responses
```bash
OPENCLAW_DEFAULT_MODEL=groq/llama-3.1-8b-instant
```
- Instant responses
- Good for simple Q&A
- Highest rate limits

### Latest & Greatest
```bash
OPENCLAW_DEFAULT_MODEL=groq/meta-llama/llama-4-maverick-17b-128e-instruct
```
- Llama 4 architecture
- Best reasoning
- Cutting-edge performance

### Long Documents
```bash
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```
- 128K context window
- Can process entire books
- Excellent summarization

---

## Pricing

### Free Tier (Generous!)

| Model | Rate Limit |
|-------|-----------|
| Llama 3.1 8B | ~30 req/min |
| Llama 3.3 70B | ~30 req/min |
| Llama 4 Maverick | ~30 req/min |
| Llama 4 Scout | ~30 req/min |
| Qwen3 32B | ~30 req/min |
| Kimi K2 | ~30 req/min |

**Perfect for personal bots!** Most users never need the paid tier.

### Paid Plans

Check https://groq.com/pricing for enterprise pricing.

---

## Performance Comparison

| Provider | Tokens/sec | Latency | Cost |
|----------|-----------|---------|------|
| **Groq Llama 3.3** | 500+ | <100ms | Free |
| Groq Llama 4 | 400+ | <150ms | Free |
| xAI Grok | 100-200 | 200-500ms | $ |
| OpenAI GPT-4 | 50-100 | 500ms-1s | $$$ |
| Local Ollama | 20-50 | 100-200ms | Free |

---

## Troubleshooting

### "Invalid API key"

1. Verify the key starts with `gsk_`
2. Check for stray spaces or newlines
3. Check the key at https://console.groq.com/api-keys
4. **Regenerate it if compromised**

### "Rate limit exceeded"

- Free tier: ~30 requests/minute
- Use `llama-3.1-8b-instant` for higher limits
- Add delays between requests
- Consider paid plan for heavy usage
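
One way to stay under the ~30 req/min cap is a simple retry with exponential backoff. This is a generic sketch, not part of HuggingClaw; `call` stands in for whatever function actually makes the API request, and the error check assumes your request code raises on HTTP 429:

```python
import time

def with_backoff(call, retries=4, base_delay=2.0, sleep=time.sleep):
    """Retry `call` on rate-limit errors, doubling the delay each time."""
    for attempt in range(retries):
        try:
            return call()
        except RuntimeError as exc:  # e.g. raised on an HTTP 429 response
            if "rate limit" not in str(exc).lower() or attempt == retries - 1:
                raise
            sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
```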

### "Model not found"

- Use exact model ID from table above
- Check model is active in Groq console
- Some models may be region-restricted

### Slow Responses

- Groq's first token should typically arrive in under 100ms
- Check internet connection
- HF Spaces region matters (US = fastest)

---

## Example: WhatsApp Bot with Groq

```bash
# HF Spaces secrets
GROQ_API_KEY=gsk_xxxxx
HF_TOKEN=hf_xxxxx
AUTO_CREATE_DATASET=true

# WhatsApp (configure in Control UI)
WHATSAPP_PHONE=+1234567890
WHATSAPP_CODE=ABC123
```

Result: **Ultra-fast** WhatsApp AI bot! ⚑

---

## API Reference

### Test Your Key

```bash
curl https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer gsk_xxxxx"
```

### Chat Completion

```bash
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gsk_xxxxx" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```
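
The same request in Python, using only the standard library. The payload builder is split out so the request can be inspected without sending anything; the network call only runs when `GROQ_API_KEY` is set in the environment:

```python
import json
import os
import urllib.request

API_URL = "https://api.groq.com/openai/v1/chat/completions"

def build_chat_request(api_key, model, user_message):
    """Build the HTTP request for a single-turn chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
    )

if __name__ == "__main__" and "GROQ_API_KEY" in os.environ:
    req = build_chat_request(os.environ["GROQ_API_KEY"],
                             "llama-3.3-70b-versatile", "Hello!")
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```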

---

## Best Practices

### 1. Choose Right Model

- **Chat**: `llama-3.3-70b-versatile`
- **Fast Q&A**: `llama-3.1-8b-instant`
- **Complex tasks**: `meta-llama/llama-4-maverick-17b-128e-instruct`
- **Long docs**: `llama-3.3-70b-versatile` (128K context)

### 2. Monitor Usage

Check https://console.groq.com/usage

### 3. Secure Your Key

- Never commit to git
- Use HF Spaces secrets
- Rotate keys periodically
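
In application code, read the key from the environment instead of hard-coding it. A tiny sketch (the `gsk_` prefix check is just a sanity guard, not an official validation rule, and the helper name is illustrative):

```python
import os

def load_groq_key(env=os.environ):
    """Read the Groq key from the environment and sanity-check its shape."""
    key = env.get("GROQ_API_KEY", "").strip()
    if not key.startswith("gsk_"):
        raise RuntimeError(
            "GROQ_API_KEY missing or malformed; set it in your Space secrets"
        )
    return key
```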

### 4. Set Up Alerts

Configure usage alerts in Groq console.

---

## Next Steps

1. βœ… **Get API key** from https://console.groq.com
2. βœ… **Set `GROQ_API_KEY`** in HF Spaces secrets
3. βœ… **Deploy** and test in Control UI
4. βœ… **Configure** WhatsApp/Telegram channels
5. πŸŽ‰ Enjoy **sub-second** AI responses!

---

## Speed Test

After setup, test Groq's speed:

1. Open the Control UI
2. Select "Llama 3.3 70B (Versatile)"
3. Send: "Write a 100-word story about a robot"
4. Watch it generate in <0.5 seconds! ⚑⚑⚑

---

## Support

- **Groq Docs**: https://console.groq.com/docs
- **API Status**: https://status.groq.com
- **HuggingClaw**: https://github.com/openclaw/openclaw/issues

---

## Available via OpenAI-Compatible API

All Groq models work via OpenAI-compatible endpoint:

```bash
OPENAI_API_KEY=gsk_xxxxx
OPENAI_BASE_URL=https://api.groq.com/openai/v1
```

This allows using Groq with any OpenAI-compatible client!