---
title: LLM Proxy
emoji: 🔀
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---
# AI Proxy – LLM Gateway
A transparent relay proxy that forwards API requests from Claude Code and Gemini CLI in environments where the upstream APIs are blocked (e.g. behind corporate firewalls). The proxy performs an auth swap, applies header hygiene, and supports SSE streaming.
```
Claude Code → AI Proxy → api.anthropic.com
Gemini CLI  → AI Proxy → generativelanguage.googleapis.com
```
## Features
- Multi-provider relay – Anthropic and Google Gemini via a single proxy
- 1:1 transparent relay – no request/response body modification
- SSE streaming – chunk-by-chunk forwarding, zero buffering
- Auth swap – client authenticates with a shared token; server injects the real API key
- Header hygiene – strips hop-by-hop headers, `Authorization`, and client-sent API keys
- Rate limiting – per-IP, with configurable window and max
- Defensive headers – `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`
- Graceful shutdown – finishes in-flight streams before exiting
- Hugging Face Spaces ready – Docker configuration pre-set for HF Spaces deployment
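The auth-swap and header-hygiene steps can be sketched as a single pure function. This is an illustrative sketch, not the proxy's actual code: the function name, the exact header lists, and the use of `x-api-key` for the upstream credential are assumptions.

```javascript
// Hypothetical sketch: drop hop-by-hop headers and client-sent credentials,
// then attach the real provider key server-side (the "auth swap").
const HOP_BY_HOP = new Set([
  'connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization',
  'te', 'trailer', 'transfer-encoding', 'upgrade', 'host',
]);
const CLIENT_CREDENTIALS = new Set(['authorization', 'x-api-key', 'x-goog-api-key']);

function sanitizeHeaders(incoming, upstreamApiKey) {
  const out = {};
  for (const [name, value] of Object.entries(incoming)) {
    const key = name.toLowerCase();
    // Never forward connection-scoped headers or the client's own credentials.
    if (HOP_BY_HOP.has(key) || CLIENT_CREDENTIALS.has(key)) continue;
    out[key] = value;
  }
  // Auth swap: the real API key only ever exists server-side.
  out['x-api-key'] = upstreamApiKey;
  return out;
}
```

Because the client's shared token is stripped and replaced here, it never reaches the upstream provider, and the real key never reaches the client.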
## Quick Start (Local)
```shell
# 1. Clone & install
git clone <your-repo-url> && cd ai-proxy
npm install

# 2. Configure – copy and edit
cp .env.example .env
# Set PROXY_AUTH_TOKEN and at least one provider key
# (ANTHROPIC_API_KEY and/or GEMINI_API_KEY)

# 3. Run
npm run dev
```
Health check: `curl http://localhost:7860/health`
## Environment Variables
| Variable | Required | Default | Description |
|---|---|---|---|
| `PROXY_AUTH_TOKEN` | Yes | – | Shared secret for client authentication |
| `ANTHROPIC_API_KEY` | – | – | Anthropic API key (enables Anthropic relay when set) |
| `PORT` | – | `7860` | Server port |
| `HOST` | – | `0.0.0.0` | Server bind address |
| `LOG_LEVEL` | – | `info` | `trace` \| `debug` \| `info` \| `warn` \| `error` |
| `RATE_LIMIT_MAX` | – | `100` | Requests per time window per IP |
| `RATE_LIMIT_WINDOW_MS` | – | `60000` | Rate limit window (ms) |
| `BODY_LIMIT` | – | `5242880` | Max request body size (bytes, 5 MB) |
| `CORS_ORIGIN` | – | (disabled) | CORS origin (e.g. `*` or `https://example.com`) |
| `ANTHROPIC_BASE_URL` | – | `https://api.anthropic.com` | Upstream Anthropic URL |
| `UPSTREAM_TIMEOUT_MS` | – | `300000` | Upstream request timeout (5 min) |
| `GEMINI_API_KEY` | – | – | Gemini API key (enables Gemini relay when set) |
| `GEMINI_BASE_URL` | – | `https://generativelanguage.googleapis.com` | Upstream Gemini URL |
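The `RATE_LIMIT_MAX` / `RATE_LIMIT_WINDOW_MS` pair describes a fixed-window, per-IP counter. A minimal sketch of that behavior, under the assumption that the proxy uses a simple fixed window (the factory name and in-memory `Map` are illustrative, not the server's actual implementation):

```javascript
// Fixed-window per-IP rate limiter matching RATE_LIMIT_MAX / RATE_LIMIT_WINDOW_MS.
function createRateLimiter({ max = 100, windowMs = 60000 } = {}) {
  const hits = new Map(); // ip -> { count, windowStart }
  return function allow(ip, now = Date.now()) {
    const entry = hits.get(ip);
    if (!entry || now - entry.windowStart >= windowMs) {
      // First request in a fresh window for this IP.
      hits.set(ip, { count: 1, windowStart: now });
      return true;
    }
    entry.count += 1;
    return entry.count <= max; // false -> caller responds 429
  };
}
```

With the defaults above, each IP gets 100 requests per rolling 60-second window before the proxy starts rejecting.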
## API Endpoints
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | `/health` | No | Health check → `{"status":"ok"}` |
| POST | `/v1/messages` | Yes | Anthropic chat completions (relayed 1:1) |
| POST | `/v1/messages/count_tokens` | Yes | Anthropic token counting (relayed 1:1) |
| POST | `/v1beta/models/{model}:generateContent` | Yes | Gemini content generation (relayed 1:1) |
| POST | `/v1beta/models/{model}:streamGenerateContent` | Yes | Gemini streaming generation (relayed 1:1) |

All other routes return 404. Non-POST methods on API routes return 405.
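The Gemini paths embed the model name and action in a single segment (`{model}:generateContent`). A sketch of how such paths can be matched, assuming a plain regex router (this is illustrative; the server's actual routing may differ):

```javascript
// Matches /v1beta/models/{model}:generateContent and :streamGenerateContent.
const GEMINI_ROUTE = /^\/v1beta\/models\/([^/:]+):(generateContent|streamGenerateContent)$/;

function matchGeminiRoute(path) {
  const m = GEMINI_ROUTE.exec(path);
  // Returns the model name and action, or null for non-Gemini paths (-> 404).
  return m ? { model: m[1], action: m[2] } : null;
}
```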
## Docker

### Local (docker compose)

```shell
cp .env.example .env
# Edit .env with your keys
docker compose up --build
```
### Hugging Face Spaces

1. Create a new Space on huggingface.co/new-space:
   - SDK: Docker
   - Visibility: Private (recommended – this proxy handles API keys)
2. Push this repository to the Space:

   ```shell
   git remote add hf https://huggingface.co/spaces/<YOUR_USER>/<SPACE_NAME>
   git push hf main
   ```

3. Configure secrets in Space Settings → Repository secrets:
   - `PROXY_AUTH_TOKEN` – your chosen shared secret
   - `ANTHROPIC_API_KEY` – your Anthropic key (at least one provider required)
   - `GEMINI_API_KEY` – your Gemini API key (at least one provider required)

The Space will build and deploy automatically. Your proxy URL will be:

```
https://<YOUR_USER>-<SPACE_NAME>.hf.space
```

Note: HF Spaces secrets become environment variables at runtime. The Dockerfile already defaults to port 7860 and runs as uid 1000, as required by the platform.
## Claude Code Client Configuration
### Option 1: Environment Variables
```shell
export ANTHROPIC_BASE_URL=https://your-server.example.com
export ANTHROPIC_AUTH_TOKEN=your-proxy-auth-token
claude
```
### Option 2: Persistent (settings.json)
```jsonc
// ~/.claude/settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-server.example.com",
    "ANTHROPIC_AUTH_TOKEN": "your-proxy-auth-token"
  }
}
```
### Option 3: Managed Settings (Enterprise)
```jsonc
// macOS: /Library/Application Support/ClaudeCode/managed-settings.json
// Linux: /etc/claude-code/managed-settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-server.example.com"
  }
}
```
## Gemini CLI Client Configuration
Configure Gemini CLI to use the proxy by setting the base URL and API key:
```shell
export GOOGLE_GEMINI_BASE_URL=https://your-server.example.com
export GEMINI_API_KEY=your-proxy-auth-token
gemini
```
Note: Use the same `PROXY_AUTH_TOKEN` value as `GEMINI_API_KEY` on the client side. The proxy accepts it via the `x-goog-api-key` header, validates it as the proxy auth token, and replaces it with the real Gemini API key before forwarding upstream.
Important: Authenticate Gemini CLI via API key (not Google login). If you have a cached Google session, run `gemini --clear-credentials` first; otherwise the CLI may ignore the base URL override.
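The Gemini-side auth swap described above can be sketched as a small validate-and-replace step. The function name and reject-with-`null` convention are assumptions for illustration, not the proxy's actual code:

```javascript
// Validate the client's shared token in x-goog-api-key, then substitute the
// real GEMINI_API_KEY before forwarding upstream.
function swapGeminiKey(headers, proxyToken, realKey) {
  if (headers['x-goog-api-key'] !== proxyToken) {
    return null; // reject: the caller would answer 401
  }
  return { ...headers, 'x-goog-api-key': realKey };
}
```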
## Test the Connection

```shell
# Health check
curl https://your-server.example.com/health

# Test Anthropic relay
curl -X POST https://your-server.example.com/v1/messages \
  -H "Authorization: Bearer your-proxy-auth-token" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "Hi"}]
  }'

# Test Gemini relay
curl -X POST https://your-server.example.com/v1beta/models/gemini-2.0-flash:generateContent \
  -H "Authorization: Bearer your-proxy-auth-token" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Hi"}]}]
  }'
```
## Tech Stack

## License

MIT