
# Groq API Setup Guide for HuggingClaw

## ⚡ Why Groq?

Groq is among the fastest inference engines available, serving 500+ tokens/second!

| Feature   | Groq                            | Others           |
|-----------|---------------------------------|------------------|
| Speed     | ⚡⚡⚡⚡⚡ 500+ t/s               | ⚡⚡ 50-100 t/s   |
| Latency   | <100ms                          | 500ms-2s         |
| Free Tier | ✅ Yes, generous                | ⚠️ Limited       |
| Models    | Llama 3/4, Qwen, Kimi, GPT-OSS  | Varies           |

## ⚠️ SECURITY WARNING

Never share your API key publicly! If you've shared it:

  1. Go to https://console.groq.com/api-keys
  2. Delete the compromised key
  3. Create a new one
  4. Store it securely (password manager, HF Spaces secrets)

## Quick Start

### Step 1: Get Your Groq API Key

  1. Go to https://console.groq.com
  2. Sign in or create account (free)
  3. Navigate to API Keys in left sidebar
  4. Click Create API Key
  5. Copy your key (starts with gsk_...)
  6. Keep it secret!
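Before wiring the key into a deployment, a quick shape check can catch copy-paste mistakes. This is a hypothetical helper: the `gsk_` prefix comes from the steps above, but the minimum-length threshold is an illustrative assumption, not a documented Groq guarantee.

```python
def looks_like_groq_key(key: str) -> bool:
    """Loose sanity check for a Groq API key: correct prefix,
    no surrounding whitespace, and a plausible length."""
    return (
        key == key.strip()          # no stray spaces or newlines
        and key.startswith("gsk_")  # Groq keys start with gsk_
        and len(key) > 20           # assumed minimum; real keys are longer
    )

print(looks_like_groq_key("gsk_" + "x" * 40))  # True
print(looks_like_groq_key("gsk_abc123\n"))     # False -- trailing newline
```

A check like this is worth running right after pasting the key into a secret store, since a trailing newline is the most common cause of "Invalid API key" errors.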

### Step 2: Configure HuggingFace Spaces

In your Space Settings → Repository secrets, add:

```
GROQ_API_KEY=gsk_your-actual-api-key-here
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```

### Step 3: Deploy

Push changes or redeploy the Space. Groq will be automatically configured.

### Step 4: Use

  1. Open Space URL
  2. Enter gateway token (default: huggingclaw)
  3. Select "Llama 3.3 70B (Versatile)" from model dropdown
  4. Experience blazing fast responses! ⚡

## Available Models (Verified 2025)

### Chat Models

| Model ID | Name | Context | Speed | Best For |
|----------|------|---------|-------|----------|
| `llama-3.3-70b-versatile` | Llama 3.3 70B | 128K | ⚡⚡⚡⚡ | Best overall |
| `llama-3.1-8b-instant` | Llama 3.1 8B | 128K | ⚡⚡⚡⚡⚡ | Ultra-fast |
| `meta-llama/llama-4-maverick-17b-128e-instruct` | Llama 4 Maverick | 128K | ⚡⚡⚡⚡ | Latest Llama 4 |
| `meta-llama/llama-4-scout-17b-16e-instruct` | Llama 4 Scout | 128K | ⚡⚡⚡⚡ | Latest Llama 4 |
| `qwen/qwen3-32b` | Qwen3 32B | 128K | ⚡⚡⚡ | Alibaba model |
| `moonshotai/kimi-k2-instruct` | Kimi K2 | 128K | ⚡⚡⚡ | Moonshot AI |
| `openai/gpt-oss-20b` | GPT-OSS 20B | 128K | ⚡⚡⚡ | OpenAI open-source |
| `allam-2-7b` | Allam-2 7B | 4K | ⚡⚡⚡⚡ | Arabic/English |
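The table above can be condensed into a small lookup helper that maps a use case to a model ID. This is a hypothetical convenience function, not part of HuggingClaw; only the model IDs themselves come from the table.

```python
# Map common use cases to Groq model IDs from the table above.
CHAT_MODELS = {
    "general": "llama-3.3-70b-versatile",
    "fast": "llama-3.1-8b-instant",
    "reasoning": "meta-llama/llama-4-maverick-17b-128e-instruct",
    "long_context": "llama-3.3-70b-versatile",
    "arabic": "allam-2-7b",
}

def pick_model(use_case: str) -> str:
    """Return a Groq model ID for a use case, defaulting to the versatile 70B."""
    return CHAT_MODELS.get(use_case, "llama-3.3-70b-versatile")

print(pick_model("fast"))     # llama-3.1-8b-instant
print(pick_model("unknown"))  # llama-3.3-70b-versatile (default)
```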

### Audio Models

| Model ID | Name | Purpose |
|----------|------|---------|
| `whisper-large-v3-turbo` | Whisper Large V3 Turbo | Speech-to-text |
| `whisper-large-v3` | Whisper Large V3 | Speech-to-text |

### Safety Models

| Model ID | Name | Purpose |
|----------|------|---------|
| `meta-llama/llama-guard-4-12b` | Llama Guard 4 | Content moderation |
| `meta-llama/llama-prompt-guard-2-86m` | Llama Prompt Guard 2 | Prompt injection detection |
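A moderation call is just a chat request that targets a guard model ID from the table above. The sketch below only builds the request payload; it assumes the guard models are reachable through the same chat-completions endpoint as the chat models, and nothing is sent over the network here.

```python
import json

def moderation_payload(user_text: str) -> dict:
    """Build a chat-completions payload asking Llama Guard to
    classify a user message (model ID from the table above)."""
    return {
        "model": "meta-llama/llama-guard-4-12b",
        "messages": [{"role": "user", "content": user_text}],
    }

payload = moderation_payload("How do I reset my password?")
print(json.dumps(payload, indent=2))
```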

## Configuration Options

### Basic Setup (Recommended)

```
GROQ_API_KEY=gsk_xxxxx
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```

### Multiple Providers

Use Groq as primary with fallbacks:

```
# Groq (primary - fastest)
GROQ_API_KEY=gsk_xxxxx

# OpenRouter (fallback - more models)
OPENROUTER_API_KEY=sk-or-v1-xxxxx

# Local Ollama (free backup)
LOCAL_MODEL_ENABLED=true
LOCAL_MODEL_NAME=neuralnexuslab/hacking
```

Priority order:

  1. Groq (if GROQ_API_KEY set) ← Fastest!
  2. xAI (if XAI_API_KEY set)
  3. OpenAI (if OPENAI_API_KEY set)
  4. OpenRouter (if OPENROUTER_API_KEY set)
  5. Local (if LOCAL_MODEL_ENABLED=true)
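The priority order above amounts to a first-match scan over the environment. The helper below is a hypothetical sketch of that logic; the environment variable names are the ones used in this guide, but the function itself is not HuggingClaw's actual implementation.

```python
# Providers in priority order, paired with the env var that enables each.
PROVIDER_PRIORITY = [
    ("groq", "GROQ_API_KEY"),
    ("xai", "XAI_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
]

def resolve_provider(env: dict) -> str:
    """Return the first configured provider, falling back to the
    local model when LOCAL_MODEL_ENABLED=true, else 'none'."""
    for name, var in PROVIDER_PRIORITY:
        if env.get(var):
            return name
    if env.get("LOCAL_MODEL_ENABLED", "").lower() == "true":
        return "local"
    return "none"

print(resolve_provider({"OPENROUTER_API_KEY": "sk-or-v1-xxxxx"}))  # openrouter
print(resolve_provider({"GROQ_API_KEY": "gsk_xxxxx",
                        "OPENAI_API_KEY": "sk-xxxxx"}))            # groq
```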

## Model Recommendations

### Best for General Use

```
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```
  • Excellent quality
  • 128K context window
  • Fast (500+ tokens/s)

### Fastest Responses

```
OPENCLAW_DEFAULT_MODEL=groq/llama-3.1-8b-instant
```
  • Instant responses
  • Good for simple Q&A
  • Highest rate limits

### Latest & Greatest

```
OPENCLAW_DEFAULT_MODEL=groq/meta-llama/llama-4-maverick-17b-128e-instruct
```

  • Llama 4 architecture
  • Best reasoning
  • Cutting-edge performance

### Long Documents

```
OPENCLAW_DEFAULT_MODEL=groq/llama-3.3-70b-versatile
```
  • 128K context window
  • Can process entire books
  • Excellent summarization

## Pricing

### Free Tier (Generous!)

| Model | Rate Limit |
|-------|------------|
| Llama 3.1 8B | ~30 req/min |
| Llama 3.3 70B | ~30 req/min |
| Llama 4 Maverick | ~30 req/min |
| Llama 4 Scout | ~30 req/min |
| Qwen3 32B | ~30 req/min |
| Kimi K2 | ~30 req/min |

Perfect for personal bots! Most users never need paid tier.
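A client-side pacer is an easy way to stay under the ~30 requests/minute ceiling. This is a minimal sketch under the limits listed above; actual limits vary by model and account, and the injectable clock/sleep parameters exist only to make the class testable.

```python
import time

class RequestPacer:
    """Enforce a minimum gap between requests so a bot stays under
    a requests-per-minute ceiling (e.g. 30 rpm -> 2s minimum gap)."""

    def __init__(self, max_per_minute=30, clock=time.monotonic, sleep=time.sleep):
        self.min_gap = 60.0 / max_per_minute
        self.clock = clock
        self.sleep = sleep
        self.last = None

    def wait(self):
        """Block until it is safe to send the next request."""
        now = self.clock()
        if self.last is not None:
            remaining = self.min_gap - (now - self.last)
            if remaining > 0:
                self.sleep(remaining)
        self.last = self.clock()
```

Call `pacer.wait()` immediately before each API request; the first call returns instantly and later calls sleep only for whatever remains of the 2-second gap.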

### Paid Plans

Check https://groq.com/pricing for enterprise pricing.


## Performance Comparison

| Provider | Tokens/sec | Latency | Cost |
|----------|------------|---------|------|
| Groq Llama 3.3 | 500+ | <100ms | Free |
| Groq Llama 4 | 400+ | <150ms | Free |
| xAI Grok | 100-200 | 200-500ms | $ |
| OpenAI GPT-4 | 50-100 | 500ms-1s | $$$ |
| Local Ollama | 20-50 | 100-200ms | Free |

## Troubleshooting

### "Invalid API key"

  1. Verify key starts with gsk_
  2. No spaces or newlines
  3. Check key at https://console.groq.com/api-keys
  4. Regenerate if compromised

### "Rate limit exceeded"

  • Free tier: ~30 requests/minute
  • Use llama-3.1-8b-instant for higher limits
  • Add delays between requests
  • Consider paid plan for heavy usage
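"Add delays between requests" can be automated with exponential backoff. This is a generic sketch: `send` stands in for whatever function performs the API call, and `RateLimitError` for however your client signals a 429 response; neither name comes from a real library here.

```python
import time

class RateLimitError(Exception):
    """Stand-in for a 429 'rate limit exceeded' response."""

def with_backoff(send, retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send(), retrying on RateLimitError with delays of
    base_delay * 2**attempt (1s, 2s, 4s, ...). Re-raises after
    the final attempt fails."""
    for attempt in range(retries):
        try:
            return send()
        except RateLimitError:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
```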

### "Model not found"

  • Use exact model ID from table above
  • Check model is active in Groq console
  • Some models may be region-restricted

### Slow Responses

  • Groq should be <100ms
  • Check internet connection
  • HF Spaces region matters (US = fastest)

## Example: WhatsApp Bot with Groq

```
# HF Spaces secrets
GROQ_API_KEY=gsk_xxxxx
HF_TOKEN=hf_xxxxx
AUTO_CREATE_DATASET=true

# WhatsApp (configure in Control UI)
WHATSAPP_PHONE=+1234567890
WHATSAPP_CODE=ABC123
```

Result: Ultra-fast WhatsApp AI bot! ⚡


## API Reference

### Test Your Key

```bash
curl https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer gsk_xxxxx"
```

### Chat Completion

```bash
curl https://api.groq.com/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer gsk_xxxxx" \
  -d '{
    "model": "llama-3.3-70b-versatile",
    "messages": [
      {"role": "user", "content": "Hello!"}
    ]
  }'
```
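The same chat-completion request can be built from Python using only the standard library. This sketch constructs the request object without sending it; substitute a real key and call `urllib.request.urlopen(req)` to execute it.

```python
import json
import urllib.request

def chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build the same chat-completion request as the curl example above."""
    body = json.dumps({
        "model": "llama-3.3-70b-versatile",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        "https://api.groq.com/openai/v1/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = chat_request("gsk_xxxxx", "Hello!")
print(req.full_url)
# To send: resp = urllib.request.urlopen(req); json.load(resp)
```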

## Best Practices

### 1. Choose the Right Model

  • Chat: llama-3.3-70b-versatile
  • Fast Q&A: llama-3.1-8b-instant
  • Complex tasks: meta-llama/llama-4-maverick-17b-128e-instruct
  • Long docs: llama-3.3-70b-versatile (128K context)

### 2. Monitor Usage

Check https://console.groq.com/usage

### 3. Secure Your Key

  • Never commit to git
  • Use HF Spaces secrets
  • Rotate keys periodically

### 4. Set Up Alerts

Configure usage alerts in Groq console.


## Next Steps

  1. Get API key from https://console.groq.com
  2. Set GROQ_API_KEY in HF Spaces secrets
  3. Deploy and test in Control UI
  4. Configure WhatsApp/Telegram channels
  5. 🎉 Enjoy sub-second AI responses!

## Speed Test

After setup, test Groq's speed:

1. Open Control UI
2. Select "Llama 3.3 70B (Versatile)"
3. Send: "Write a 100-word story about a robot"
4. Watch it generate in <0.5 seconds! ⚡⚡⚡
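Throughput is easy to compute from any timed run like the one above. A sketch, with one stated assumption: a 100-word story is roughly 130 output tokens, which is a rule-of-thumb estimate, not a measured value.

```python
def tokens_per_second(num_tokens: int, elapsed_s: float) -> float:
    """Throughput for a completion of num_tokens generated in elapsed_s seconds."""
    return num_tokens / elapsed_s

# e.g. ~130 tokens (about a 100-word story) generated in 0.26s:
print(round(tokens_per_second(130, 0.26)))  # 500
```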

## Available via OpenAI-Compatible API

All Groq models work via the OpenAI-compatible endpoint:

```
OPENAI_API_KEY=gsk_xxxxx
OPENAI_BASE_URL=https://api.groq.com/openai/v1
```

This allows using Groq with any OpenAI-compatible client!