---
title: LLM Proxy
emoji: 🔀
colorFrom: blue
colorTo: purple
sdk: docker
app_port: 7860
pinned: false
---

# AI Proxy – LLM Gateway

A transparent relay proxy that forwards API requests from **Claude Code** and **Gemini CLI** through environments where the upstream APIs are blocked (e.g. corporate firewalls). The proxy performs auth swapping and header hygiene, and supports SSE streaming.

```
Claude Code → AI Proxy → api.anthropic.com
Gemini CLI  → AI Proxy → generativelanguage.googleapis.com
```

## Features

- **Multi-provider relay** – Anthropic and Google Gemini through a single proxy
- **1:1 transparent relay** – no request/response body modification
- **SSE streaming** – chunk-by-chunk forwarding, zero buffering
- **Auth swap** – the client authenticates with a shared token; the server injects the real API key
- **Header hygiene** – strips hop-by-hop headers, authorization, and client-sent API keys
- **Rate limiting** – per IP, with configurable window and maximum
- **Defensive headers** – `X-Content-Type-Options: nosniff`, `X-Frame-Options: DENY`
- **Graceful shutdown** – finishes in-flight streams before exiting
- **Hugging Face Spaces ready** – Docker configuration pre-set for HF Spaces deployment

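The auth-swap and header-hygiene steps can be sketched as a small pure function. This is an illustrative sketch, not the project's actual code; the function name and the exact header lists are assumptions:

```typescript
// Headers that must not be forwarded upstream (hop-by-hop per RFC 9110, plus host).
const HOP_BY_HOP = new Set([
  "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
  "te", "trailer", "transfer-encoding", "upgrade", "host",
]);

// Client-supplied credentials that are stripped and replaced server-side.
const CLIENT_AUTH = new Set(["authorization", "x-api-key", "x-goog-api-key"]);

function prepareUpstreamHeaders(
  incoming: Record<string, string>,
  realApiKey: string,
): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [name, value] of Object.entries(incoming)) {
    const key = name.toLowerCase();
    if (HOP_BY_HOP.has(key) || CLIENT_AUTH.has(key)) continue; // header hygiene
    out[key] = value;
  }
  out["x-api-key"] = realApiKey; // auth swap: real key replaces the shared token
  return out;
}
```

The client never sees the real provider key; it only ever holds the shared `PROXY_AUTH_TOKEN`.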
## Quick Start (Local)

```bash
# 1. Clone & install
git clone <your-repo-url> && cd ai-proxy
npm install

# 2. Configure – copy and edit
cp .env.example .env
# Set PROXY_AUTH_TOKEN and at least one provider key
# (ANTHROPIC_API_KEY and/or GEMINI_API_KEY)

# 3. Run
npm run dev
```

Health check: `curl http://localhost:7860/health`

## Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| `PROXY_AUTH_TOKEN` | Yes | – | Shared secret for client authentication |
| `ANTHROPIC_API_KEY` | – | – | Anthropic API key (enables the Anthropic relay when set) |
| `ANTHROPIC_BASE_URL` | – | `https://api.anthropic.com` | Upstream Anthropic URL |
| `GEMINI_API_KEY` | – | – | Gemini API key (enables the Gemini relay when set) |
| `GEMINI_BASE_URL` | – | `https://generativelanguage.googleapis.com` | Upstream Gemini URL |
| `PORT` | – | `7860` | Server port |
| `HOST` | – | `0.0.0.0` | Server bind address |
| `LOG_LEVEL` | – | `info` | `trace` \| `debug` \| `info` \| `warn` \| `error` |
| `RATE_LIMIT_MAX` | – | `100` | Requests per window per IP |
| `RATE_LIMIT_WINDOW_MS` | – | `60000` | Rate-limit window in ms (1 min) |
| `BODY_LIMIT` | – | `5242880` | Max request body size in bytes (5 MB) |
| `CORS_ORIGIN` | – | *(disabled)* | CORS origin (e.g. `*` or `https://example.com`) |
| `UPSTREAM_TIMEOUT_MS` | – | `300000` | Upstream request timeout in ms (5 min) |

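For reference, a minimal `.env` that enables both relays might look like this (all values are placeholders):

```ini
PROXY_AUTH_TOKEN=choose-a-long-random-string
ANTHROPIC_API_KEY=sk-ant-your-key
GEMINI_API_KEY=your-gemini-key
```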
## API Endpoints

| Method | Path | Auth | Description |
|---|---|---|---|
| `GET` | `/health` | No | Health check → `{"status":"ok"}` |
| `POST` | `/v1/messages` | Yes | Anthropic chat completions (relayed 1:1) |
| `POST` | `/v1/messages/count_tokens` | Yes | Anthropic token counting (relayed 1:1) |
| `POST` | `/v1beta/models/{model}:generateContent` | Yes | Gemini content generation (relayed 1:1) |
| `POST` | `/v1beta/models/{model}:streamGenerateContent` | Yes | Gemini streaming generation (relayed 1:1) |

All other routes return `404`. Non-POST methods on API routes return `405`.

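Authenticated routes are also rate limited per client IP. A minimal fixed-window sketch of that check (illustrative only, not the project's actual implementation), using the `RATE_LIMIT_MAX` and `RATE_LIMIT_WINDOW_MS` defaults:

```typescript
// Fixed-window, per-IP rate limiter sketch: up to `max` requests are allowed
// per `windowMs` window for each client IP.
type Window = { start: number; count: number };

class RateLimiter {
  private windows = new Map<string, Window>();

  constructor(
    private max = 100,         // RATE_LIMIT_MAX
    private windowMs = 60_000, // RATE_LIMIT_WINDOW_MS
  ) {}

  allow(ip: string, now = Date.now()): boolean {
    const w = this.windows.get(ip);
    if (!w || now - w.start >= this.windowMs) {
      // First request from this IP, or the window expired: start a new window.
      this.windows.set(ip, { start: now, count: 1 });
      return true;
    }
    w.count++;
    return w.count <= this.max;
  }
}
```

Requests over the limit would be rejected before any upstream call is made.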
## Docker

### Local (docker compose)

```bash
cp .env.example .env
# Edit .env with your keys
docker compose up --build
```

### Hugging Face Spaces

1. Create a new Space on [huggingface.co/new-space](https://huggingface.co/new-space):
   - **SDK**: Docker
   - **Visibility**: Private (recommended, since this Space handles API keys)

2. Push this repository to the Space:
   ```bash
   git remote add hf https://huggingface.co/spaces/<YOUR_USER>/<SPACE_NAME>
   git push hf main
   ```

3. Configure **Secrets** in Space Settings → Repository secrets:
   - `PROXY_AUTH_TOKEN` = your chosen shared secret
   - `ANTHROPIC_API_KEY` = your Anthropic API key *(at least one provider key required)*
   - `GEMINI_API_KEY` = your Gemini API key *(at least one provider key required)*

4. The Space builds and deploys automatically. Your proxy URL will be:
   ```
   https://<YOUR_USER>-<SPACE_NAME>.hf.space
   ```

> **Note:** HF Spaces secrets become environment variables at runtime. The Dockerfile already defaults to port 7860 and runs as uid 1000, as required by the platform.

## Claude Code Client Configuration

### Option 1: Environment Variables

```bash
export ANTHROPIC_BASE_URL=https://your-server.example.com
export ANTHROPIC_AUTH_TOKEN=your-proxy-auth-token
claude
```

### Option 2: Persistent (settings.json)

```json
// ~/.claude/settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-server.example.com",
    "ANTHROPIC_AUTH_TOKEN": "your-proxy-auth-token"
  }
}
```

### Option 3: Managed Settings (Enterprise)

```json
// macOS: /Library/Application Support/ClaudeCode/managed-settings.json
// Linux: /etc/claude-code/managed-settings.json
{
  "env": {
    "ANTHROPIC_BASE_URL": "https://your-server.example.com"
  }
}
```

## Gemini CLI Client Configuration

Configure Gemini CLI to use the proxy by setting the base URL and API key:

```bash
export GOOGLE_GEMINI_BASE_URL=https://your-server.example.com
export GEMINI_API_KEY=your-proxy-auth-token
gemini
```

> **Note:** Use your `PROXY_AUTH_TOKEN` value as `GEMINI_API_KEY` on the client side. The proxy receives it via the `x-goog-api-key` header, validates it as the proxy auth token, and replaces it with the real Gemini API key before forwarding upstream.

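That validate-and-replace step can be sketched as follows (an illustrative function under assumed names, not the project's actual code):

```typescript
// The client's x-goog-api-key must equal the proxy's shared secret
// (PROXY_AUTH_TOKEN); on success it is replaced with the real Gemini key.
function swapGeminiKey(
  headers: Record<string, string>,
  proxyToken: string, // PROXY_AUTH_TOKEN
  realKey: string,    // GEMINI_API_KEY
): Record<string, string> | null {
  if (headers["x-goog-api-key"] !== proxyToken) return null; // reject (401), never forwarded
  return { ...headers, "x-goog-api-key": realKey };          // forward upstream
}
```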
**Important:** Authenticate Gemini CLI via API key (not Google login). If you have a cached Google session, run `gemini --clear-credentials` first; otherwise the CLI may ignore the base URL override.

### Test the Connection

```bash
# Health check
curl https://your-server.example.com/health

# Test the Anthropic relay
curl -X POST https://your-server.example.com/v1/messages \
  -H "Authorization: Bearer your-proxy-auth-token" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 100,
    "messages": [{"role": "user", "content": "Hi"}]
  }'

# Test the Gemini relay (the proxy token is sent as the API key)
curl -X POST https://your-server.example.com/v1beta/models/gemini-2.0-flash:generateContent \
  -H "x-goog-api-key: your-proxy-auth-token" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Hi"}]}]
  }'
```

## Tech Stack

- **Runtime:** Node.js >= 20
- **Framework:** [Fastify](https://fastify.dev/) 5
- **HTTP Client:** [undici](https://undici.nodejs.org/)
- **Language:** TypeScript (strict mode)

## License

MIT
|