---
title: Claude Code Proxy
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: "3.14"
python_version: "3.14"
app_file: server.py
pinned: false
---
# 🤖 Free Claude Code

**Use Claude Code with free NVIDIA NIM models through a lightweight proxy.**

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
[![Python 3.14](https://img.shields.io/badge/python-3.14-3776ab.svg?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/downloads/)
[![uv](https://img.shields.io/badge/uv-spawn-ffc21c.svg?style=for-the-badge)](https://github.com/astral-sh/uv)
[![Code style: Ruff](https://img.shields.io/badge/code%20formatting-ruff-f5a623.svg?style=for-the-badge)](https://github.com/astral-sh/ruff)
## The Problem

Claude Code costs $100+/month for API access. This project lets you run it with **free NVIDIA NIM models** instead.

## The Solution

A FastAPI proxy that translates Claude Code's Anthropic API calls into NVIDIA NIM's OpenAI-compatible endpoint. No changes to Claude Code itself are needed.

```
┌─────────────────┐      Anthropic API      ┌──────────────────┐
│   Claude Code   │ ──────────────────────▶ │   Free Claude    │
│   (Official)    │                         │   Code           │
│                 │ ◀────────────────────── │   Proxy          │
└─────────────────┘      SSE Streaming      │   (:8082)        │
                                            └────────┬─────────┘
                                                     │ OpenAI Chat API
                                                     ▼
                                            ┌──────────────────┐
                                            │   NVIDIA NIM     │
                                            │  (Free Models)   │
                                            └──────────────────┘
```

## Features

- **Drop-in replacement** for Claude Code's Anthropic API
- **7 free NVIDIA NIM models** available via auto-routing
- **Automatic failover** - switches to the next model if one hits a rate limit
- **Multi-model support** - use different models for different tasks
- **Local optimizations** - fast-path for common probes (saves API calls)
- **Streaming** - real-time responses via SSE
- **Tool support** - Claude Code tools work with NIM models
- **Thinking blocks** - reasoning support where models provide it
- **Discord/Telegram bots** - remote Claude Code sessions
- **Voice notes** - transcribe voice messages with Whisper

## Quick Start (Cloud - No Setup)

The easiest way to use this project is on **HuggingFace Spaces** (free tier available).

### 1. Deploy to HuggingFace Spaces

Or manually:

1. Go to [huggingface.co/spaces/Yash030/claude-code-proxy](https://huggingface.co/spaces/Yash030/claude-code-proxy)
2. Duplicate the space
3. Set your secrets in the Space settings:
   - `NVIDIA_NIM_API_KEY` - Your NVIDIA API key
   - `ANTHROPIC_AUTH_TOKEN` - Your auth token (any secret)

### 2. Get NVIDIA API Key

Get a free key at [build.nvidia.com/settings/api-keys](https://build.nvidia.com/settings/api-keys).
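Once you have a key, you can sanity-check it before wiring anything up. The sketch below is a hypothetical helper (not part of this repo) that lists the models visible to your key via NIM's OpenAI-compatible endpoint; the `integrate.api.nvidia.com` base URL is the usual NIM endpoint, but confirm it against your NVIDIA dashboard:

```python
import os
import urllib.request

# Assumed NIM OpenAI-compatible base URL; verify in your NVIDIA dashboard.
NIM_BASE = "https://integrate.api.nvidia.com/v1"


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build an authenticated GET /v1/models request (stdlib only)."""
    return urllib.request.Request(
        f"{NIM_BASE}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )


if "NVIDIA_NIM_API_KEY" in os.environ:
    req = build_models_request(os.environ["NVIDIA_NIM_API_KEY"])
    with urllib.request.urlopen(req, timeout=10) as resp:
        # HTTP 200 means the key is accepted; 401 raises HTTPError
        print("key OK" if resp.status == 200 else resp.status)
```

A 401 response (raised as `urllib.error.HTTPError`) means the key is wrong or expired.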
### 3. Connect Claude Code

```bash
# Use your HuggingFace Space URL (ends with .hf.space)
export ANTHROPIC_AUTH_TOKEN="your-secret-token"
export ANTHROPIC_BASE_URL="https://your-space-name.hf.space"
claude
```

That's it! Claude Code will now use free NVIDIA NIM models.

## Quick Start (Local)

### 1. Install Requirements

```bash
# Install Claude Code
curl -LsSf https://download.anthropic.com/install.sh | sh

# Install uv (fast Python package manager)
curl -LsSf https://astral.sh/uv/install.sh | sh
uv python install 3.14
```

### 2. Clone and Configure

```bash
git clone https://github.com/Yashwant00CR7/claude-code-nvidia.git
cd claude-code-nvidia
cp .env.example .env
```

Edit `.env`:

```dotenv
NVIDIA_NIM_API_KEY="nvapi-your-key"
ANTHROPIC_AUTH_TOKEN="freecc"
MODEL="nvidia_nim/z-ai/glm4.7"
```

### 3. Start the Proxy

```bash
uv sync
uv run uvicorn server:app --host 0.0.0.0 --port 8082
```

### 4. Run Claude Code

```bash
export ANTHROPIC_AUTH_TOKEN="freecc"
export ANTHROPIC_BASE_URL="http://localhost:8082"
claude
```

## Available Models

The proxy automatically routes to these models in order:

| Model | Best For | Speed |
|-------|----------|-------|
| `qwen3-coder-480b` | Code generation | Fast |
| `glm4.7` | General purpose | Fast |
| `step-3.5-flash` | Fast responses | Very Fast |
| `mistral-large-3` | Reasoning | Medium |
| `dracarys-llama-3.1-70b` | Complex tasks | Medium |
| `seed-oss-36b` | Balanced | Fast |
| `mistral-nemotron` | Thinking tasks | Medium |

## How Auto-Routing Works

When you use the `auto` model, the proxy:

1. **Tries models in order** of speed and reliability
2. **Skips rate-limited models** - a pre-flight check runs before each request
3. **Fails over fast** - if one model times out, it immediately tries the next
4. **Wastes no API calls** - common probes are handled locally

```
Request: "Write a function"
        ↓
Is model 1 rate-limited? → Yes → Skip
Is model 2 rate-limited? → No  → Try
        ↓
Model 2 responds?  → Yes → Stream response
Model 2 times out? → Try model 3 → Success!
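The failover loop above can be sketched as follows. This is a simplified, hypothetical version for illustration only: `route`, `call_model`, and the error types are stand-ins, and the real router (`api/model_router.py`) also handles streaming and provider-specific rate-limit headers:

```python
import time

# model -> unix timestamp when it becomes usable again
rate_limited_until: dict[str, float] = {}


def route(prompt: str, priority: list[str], call_model) -> str:
    """Try models in priority order, skipping rate-limited ones."""
    last_error = None
    for model in priority:
        # Pre-flight check: skip models known to be rate-limited.
        if rate_limited_until.get(model, 0) > time.time():
            continue
        try:
            return call_model(model, prompt)
        except TimeoutError as exc:
            last_error = exc            # fast failover: try the next model
        except RuntimeError as exc:     # stand-in for an HTTP 429 response
            rate_limited_until[model] = time.time() + 60
            last_error = exc
    raise RuntimeError("all models failed") from last_error
```

The key design point is that rate-limit state is remembered across requests, so a limited model is skipped without spending an API call on it.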
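Under the hood, the translation described in The Solution reduces to reshaping an Anthropic-style `/v1/messages` body into OpenAI chat format. A minimal sketch (`anthropic_to_openai` is a hypothetical name; the real converter in `core/anthropic/` also handles tool calls, images, and thinking blocks):

```python
def anthropic_to_openai(body: dict) -> dict:
    """Convert a minimal Anthropic /v1/messages body to OpenAI chat format.

    Sketch only: real requests also carry tools and richer content blocks.
    """
    messages = []
    # Anthropic puts the system prompt in a top-level field;
    # OpenAI expects it as the first message.
    if body.get("system"):
        messages.append({"role": "system", "content": body["system"]})
    for msg in body.get("messages", []):
        content = msg["content"]
        if isinstance(content, list):  # Anthropic content blocks
            content = "".join(b.get("text", "") for b in content)
        messages.append({"role": msg["role"], "content": content})
    return {
        "model": body["model"],
        "messages": messages,
        "max_tokens": body.get("max_tokens", 1024),
        "stream": body.get("stream", False),
    }
```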
```

## Environment Variables

### Required

```dotenv
NVIDIA_NIM_API_KEY="nvapi-your-key"   # From build.nvidia.com
ANTHROPIC_AUTH_TOKEN="your-secret"    # Any secret you choose
```

### Optional

```dotenv
MODEL="nvidia_nim/z-ai/glm4.7"          # Default model
MODEL_OPUS="nvidia_nim/qwen/qwen3-..."  # Model for Opus requests
MODEL_SONNET="nvidia_nim/z-ai/glm4.7"   # Model for Sonnet requests
MODEL_HAIKU="nvidia_nim/z-ai/glm4.7"    # Model for Haiku requests

# Auto-routing order (comma-separated)
AUTO_MODEL_PRIORITY="nvidia_nim/qwen/...,nvidia_nim/z-ai/..."

# Thinking support
ENABLE_MODEL_THINKING=true              # Enable reasoning blocks
```

## IDE Integration

### VS Code Extension

Add to `.vscode/settings.json`:

```json
{
  "claudeCode.environmentVariables": [
    { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
    { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
  ]
}
```

### JetBrains ACP

Edit `~/.jetbrains/acp.json`:

```json
{
  "env": {
    "ANTHROPIC_BASE_URL": "http://localhost:8082",
    "ANTHROPIC_AUTH_TOKEN": "freecc"
  }
}
```

### Remote/SSH

For remote development, deploy to HuggingFace Spaces and use:

```bash
export ANTHROPIC_BASE_URL="https://your-space.hf.space"
```

## Deployment Options

### HuggingFace Spaces (Recommended for Cloud)

**Free tier includes:**

- 2 vCPU
- Community support
- Automatic HTTPS
- Git-based deployment

**Setup:**

1. Fork [the space](https://huggingface.co/spaces/Yash030/claude-code-proxy)
2. Add `NVIDIA_NIM_API_KEY` to Space secrets
3. Access at `https://your-space.hf.space`

### Railway (Easy Deploy)

1. Connect the GitHub repo
2. Set environment variables
3. Deploy with auto-scaling

### Render (Free Tier)

1. Create a Web Service
2. Connect GitHub
3. Set build command: `uv sync`
4. Set start command: `uv run uvicorn server:app --host 0.0.0.0 --port $PORT`

### Fly.io (Global Edge)

```bash
fly launch
fly secrets set NVIDIA_NIM_API_KEY="nvapi-..."
fly deploy
```

### Local/Docker

```bash
docker build -t free-claude-code .
docker run -p 8082:8082 \
  -e NVIDIA_NIM_API_KEY="nvapi-..." \
  -e ANTHROPIC_AUTH_TOKEN="freecc" \
  free-claude-code
```

## Architecture

```
api/
├── routes.py                  # FastAPI endpoints
├── services.py                # Request handling & failover
├── model_router.py            # Model resolution
├── detection.py               # Request type detection
└── optimization_handlers.py   # Fast-path responses
core/
├── anthropic/                 # SSE, token counting, tool parsing
└── task_detector.py           # Task capability detection
providers/
├── openai_compat.py           # Base OpenAI transport
├── nvidia_nim/                # NVIDIA NIM provider
└── rate_limit.py              # Rate limiting
messaging/
├── discord.py                 # Discord bot wrapper
└── telegram.py                # Telegram bot wrapper
```

## Troubleshooting

### "undefined ... input_tokens" error

- Update to the latest version: `git pull`
- Check that `ANTHROPIC_BASE_URL` does not end with `/v1`

### Provider disconnects during streaming

- Reduce `PROVIDER_MAX_CONCURRENCY`
- Increase `HTTP_READ_TIMEOUT`
- Check NVIDIA NIM status at [status.nvidia.com](https://status.nvidia.com)

### Model not responding

- Check that your NVIDIA API key is valid
- Verify you haven't hit rate limits
- Try a different model

### VS Code extension shows login

- Reload the extension after setting the env vars
- Confirm the environment variables are set correctly

## Contributing

1. Fork the repo
2. Create a feature branch
3. Run checks: `uv run ruff format && uv run ruff check && uv run ty check`
4. Submit a PR

## License

MIT License - see [LICENSE](LICENSE)

## Links

- [GitHub](https://github.com/Yashwant00CR7/claude-code-nvidia)
- [HuggingFace Space](https://huggingface.co/spaces/Yash030/claude-code-proxy)
- [NVIDIA NIM](https://build.nvidia.com)
- [Claude Code](https://github.com/anthropics/claude-code)