---
title: LLM Proxy
emoji: 🔀
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# AI Proxy – LLM Gateway

A transparent relay proxy that forwards API requests from **Claude Code** and **Gemini CLI** through environments where the upstream APIs are blocked (e.g. corporate firewalls). The proxy performs auth swap, header hygiene, and supports SSE streaming.

```
Claude Code → AI Proxy → api.anthropic.com
Gemini CLI  → AI Proxy → generativelanguage.googleapis.com
```

## Features

- **Multi-provider relay** – Anthropic and Google Gemini via a single proxy
- **1:1 transparent relay** – no request/response body modification
- **SSE streaming** – chunk-by-chunk forwarding, zero buffering
- **Auth swap** – client authenticates with a shared token; the server injects the real API key
- **Header hygiene** – strips hop-by-hop headers, authorization, and client-sent API keys
- **Rate limiting** – per-IP, configurable window and max
- **Defensive headers** – `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`
- **Graceful shutdown** – finishes in-flight streams before exiting
- **Hugging Face Spaces ready** – Docker configuration pre-set for HF Spaces deployment

## Quick Start (Local)

```bash
# 1. Clone & install
git clone && cd ai-proxy
npm install

# 2. Configure – copy and edit
cp .env.example .env
# Set PROXY_AUTH_TOKEN and at least one provider key
# (ANTHROPIC_API_KEY and/or GEMINI_API_KEY)

# 3. Run
npm run dev
```

Health check: `curl http://localhost:7860/health`

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| `PROXY_AUTH_TOKEN` | Yes | – | Shared secret for client authentication |
| `ANTHROPIC_API_KEY` | – | – | Anthropic API key (enables the Anthropic relay when set) |
| `PORT` | – | `7860` | Server port |
| `HOST` | – | `0.0.0.0` | Server bind address |
| `LOG_LEVEL` | – | `info` | `trace` \| `debug` \| `info` \| `warn` \| `error` |
| `RATE_LIMIT_MAX` | – | `100` | Requests per window, per IP |
| `RATE_LIMIT_WINDOW_MS` | – | `60000` | Rate-limit window (ms) |
| `BODY_LIMIT` | – | `5242880` | Max request body size (bytes; 5 MB) |
| `CORS_ORIGIN` | – | *(disabled)* | CORS origin (e.g. `*` or `https://example.com`) |
| `ANTHROPIC_BASE_URL` | – | `https://api.anthropic.com` | Upstream Anthropic URL |
| `UPSTREAM_TIMEOUT_MS` | – | `300000` | Upstream request timeout (5 min) |
| `GEMINI_API_KEY` | – | – | Gemini API key (enables the Gemini relay when set) |
| `GEMINI_BASE_URL` | – | `https://generativelanguage.googleapis.com` | Upstream Gemini URL |

## API Endpoints

| Method | Path | Auth | Description |
|---|---|---|---|
| `GET` | `/health` | No | Health check → `{"status":"ok"}` |
| `POST` | `/v1/messages` | Yes | Anthropic Messages API (relayed 1:1) |
| `POST` | `/v1/messages/count_tokens` | Yes | Anthropic token counting (relayed 1:1) |
| `POST` | `/v1beta/models/{model}:generateContent` | Yes | Gemini content generation (relayed 1:1) |
| `POST` | `/v1beta/models/{model}:streamGenerateContent` | Yes | Gemini streaming generation (relayed 1:1) |

All other routes return `404`. Non-POST methods on API routes return `405`.

## Docker

### Local (docker compose)

```bash
cp .env.example .env
# Edit .env with your keys
docker compose up --build
```

### Hugging Face Spaces
1. Create a new Space on [huggingface.co/new-space](https://huggingface.co/new-space):
   - **SDK**: Docker
   - **Visibility**: Private (recommended, since the Space holds your API keys)
2. Push this repository to the Space:
   ```bash
   git remote add hf https://huggingface.co/spaces//
   git push hf main
   ```
3. Configure **Secrets** under Space Settings → Repository secrets:
   - `PROXY_AUTH_TOKEN` = your chosen shared secret
   - `ANTHROPIC_API_KEY` = your Anthropic API key *(at least one provider required)*
   - `GEMINI_API_KEY` = your Gemini API key *(at least one provider required)*
4. The Space will build and deploy automatically. Your proxy URL will be:
   ```
   https://-.hf.space
   ```

> **Note:** HF Spaces secrets become environment variables at runtime. The Dockerfile already defaults to port 7860 and runs as UID 1000, as required by the platform.

## Claude Code Client Configuration

### Option 1: Environment Variables

```bash
export ANTHROPIC_BASE_URL=https://your-server.example.com
export ANTHROPIC_AUTH_TOKEN=your-proxy-auth-token
claude
```

### Option 2: Persistent (settings.json)

```json
// ~/.claude/settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-server.example.com",
    "ANTHROPIC_AUTH_TOKEN": "your-proxy-auth-token"
  }
}
```

### Option 3: Managed Settings (Enterprise)

```json
// macOS: /Library/Application Support/ClaudeCode/managed-settings.json
// Linux: /etc/claude-code/managed-settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-server.example.com"
  }
}
```

## Gemini CLI Client Configuration

Configure Gemini CLI to use the proxy by setting the base URL and API key:

```bash
export GOOGLE_GEMINI_BASE_URL=https://your-server.example.com
export GEMINI_API_KEY=your-proxy-auth-token
gemini
```

> **Note:** On the client, set `GEMINI_API_KEY` to the same value as the proxy's `PROXY_AUTH_TOKEN`. The proxy accepts it via the `x-goog-api-key` header, validates it as the proxy auth token, and replaces it with the real Gemini API key before forwarding upstream.
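The auth swap and header hygiene described in the note above can be sketched in TypeScript. This is an illustrative sketch only, not the project's actual source: the names `sanitizeHeaders` and `HOP_BY_HOP`, and the exact set of stripped headers, are assumptions based on the Features list.

```typescript
// Headers that only apply to a single hop and must not be forwarded upstream.
const HOP_BY_HOP = new Set([
  "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
  "te", "trailer", "transfer-encoding", "upgrade", "host",
]);

/**
 * Validate the client's shared token and build the upstream header set.
 * Returns null when authentication fails (the caller would respond 401).
 * Hypothetical helper; not the proxy's real implementation.
 */
function sanitizeHeaders(
  incoming: Record<string, string>,
  proxyToken: string,
  realApiKey: string,
): Record<string, string> | null {
  // The client may present the token via x-goog-api-key (Gemini CLI)
  // or Authorization: Bearer <token> (Claude Code).
  const presented =
    incoming["x-goog-api-key"] ??
    incoming["authorization"]?.replace(/^Bearer\s+/i, "");
  if (presented !== proxyToken) return null;

  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(incoming)) {
    const key = name.toLowerCase();
    // Strip hop-by-hop headers and any client-sent credentials.
    if (HOP_BY_HOP.has(key)) continue;
    if (key === "authorization" || key === "x-goog-api-key" || key === "x-api-key") continue;
    out[key] = value;
  }
  // Inject the real upstream key before forwarding.
  out["x-goog-api-key"] = realApiKey;
  return out;
}
```

The swap means the real API key never leaves the server: clients only ever hold the shared `PROXY_AUTH_TOKEN`.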
> **Important:** Authenticate Gemini CLI via an API key (not Google login). If you have a cached Google session, run `gemini --clear-credentials` first; otherwise the CLI may ignore the base-URL override.

### Test the Connection

```bash
# Health check
curl https://your-server.example.com/health

# Test the Anthropic relay
curl -X POST https://your-server.example.com/v1/messages \
  -H "Authorization: Bearer your-proxy-auth-token" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "Hi"}]
  }'

# Test the Gemini relay
curl -X POST https://your-server.example.com/v1beta/models/gemini-2.0-flash:generateContent \
  -H "Authorization: Bearer your-proxy-auth-token" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Hi"}]}]
  }'
```

## Tech Stack

- **Runtime:** Node.js >= 20
- **Framework:** [Fastify](https://fastify.dev/) 5
- **HTTP Client:** [undici](https://undici.nodejs.org/)
- **Language:** TypeScript (strict mode)

## License

MIT
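As a closing illustration, the zero-buffering SSE forwarding listed under Features can be sketched with the Web Streams API available in Node.js >= 18. This is a simplified sketch, not the project's actual source; `relayStream` is a hypothetical name, and the real proxy pipes the undici response through Fastify's reply instead.

```typescript
/**
 * Forward an upstream body to the client chunk-by-chunk.
 * Each chunk is written the moment it arrives (no whole-body buffering),
 * and awaiting the write lets the client's backpressure propagate upstream.
 */
async function relayStream(
  upstream: ReadableStream<Uint8Array>,
  write: (chunk: Uint8Array) => Promise<void>,
): Promise<void> {
  const reader = upstream.getReader();
  try {
    for (;;) {
      const { done, value } = await reader.read();
      if (done || value === undefined) break; // upstream closed: end the response
      await write(value);                     // forward immediately
    }
  } finally {
    reader.releaseLock();
  }
}
```

Because no chunk is held back, clients see SSE events (e.g. Anthropic streaming deltas) at the same pace the upstream emits them.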