Spaces:

Yash030
/

claude-code-proxy

Yash030 committed 19 days ago

Commit

02f434f

1 Parent(s): 0157ac7

Add README.md for build

Files changed (1)

README.md +500 -10

@@ -1,10 +1,500 @@
----
-title: Claude Code Proxy
-emoji: 🐨
-colorFrom: green
-colorTo: green
-sdk: docker
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+<div align="center">
+# 🤖 Free Claude Code
+Use Claude Code CLI, VS Code, JetBrains ACP, or chat bots through your own Anthropic-compatible proxy.
+[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
+[![Python 3.14](https://img.shields.io/badge/python-3.14-3776ab.svg?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/downloads/)
+[![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json&style=for-the-badge)](https://github.com/astral-sh/uv)
+[![Tested with Pytest](https://img.shields.io/badge/testing-Pytest-00c0ff.svg?style=for-the-badge)](https://github.com/Alishahryar1/free-claude-code/actions/workflows/tests.yml)
+[![Type checking: Ty](https://img.shields.io/badge/type%20checking-ty-ffcc00.svg?style=for-the-badge)](https://pypi.org/project/ty/)
+[![Code style: Ruff](https://img.shields.io/badge/code%20formatting-ruff-f5a623.svg?style=for-the-badge)](https://github.com/astral-sh/ruff)
+[![Logging: Loguru](https://img.shields.io/badge/logging-loguru-4ecdc4.svg?style=for-the-badge)](https://github.com/Delgan/loguru)
+Free Claude Code routes Anthropic Messages API traffic from Claude Code to NVIDIA NIM. It keeps Claude Code's client-side protocol stable while letting you use NVIDIA's free models.
+[Quick Start](#quick-start) · [Providers](#choose-a-provider) · [Clients](#connect-claude-code) · [Troubleshooting](#troubleshooting) · [Development](#development)
+</div>
+<div align="center">
+  <img src="pic.png" alt="Free Claude Code in action" width="700">
+</div>
+## What You Get
+- Drop-in proxy for Claude Code's Anthropic API calls.
+- NVIDIA NIM provider backend with free models.
+- Per-model routing: send Opus, Sonnet, Haiku, and fallback traffic to different NVIDIA NIM models.
+- Native Claude Code `/model` picker support through the proxy's `/v1/models` endpoint.
+- Streaming, tool use, reasoning/thinking block handling, and local request optimizations.
+- Optional Discord or Telegram bot wrapper for remote coding sessions.
+- Optional voice-note transcription through local Whisper or NVIDIA NIM.
+## Quick Start
+### 1. Install Requirements
+Install [Claude Code](https://github.com/anthropics/claude-code), then install `uv` and Python 3.14.
+macOS/Linux:
+```bash
+curl -LsSf https://astral.sh/uv/install.sh | sh
+uv self update
+uv python install 3.14
+```
+Windows PowerShell:
+```powershell
+powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
+uv self update
+uv python install 3.14
+```
+### 2. Clone And Configure
+```bash
+git clone https://github.com/Alishahryar1/free-claude-code.git
+cd free-claude-code
+cp .env.example .env
+```
+PowerShell uses:
+```powershell
+Copy-Item .env.example .env
+```
+Edit `.env` and choose one provider. For the default NVIDIA NIM path:
+```dotenv
+NVIDIA_NIM_API_KEY="nvapi-your-key"
+MODEL="nvidia_nim/z-ai/glm4.7"
+ANTHROPIC_AUTH_TOKEN="freecc"
+```
+Use any local secret for `ANTHROPIC_AUTH_TOKEN`; Claude Code will send the same value back to this proxy. Leave it empty only for local/private testing.
+### 3. Start The Proxy
+```bash
+uv run uvicorn server:app --host 0.0.0.0 --port 8082
+```
+Package install alternative:
+```bash
+uv tool install git+https://github.com/Alishahryar1/free-claude-code.git
+fcc-init
+free-claude-code
+```
+`fcc-init` creates `~/.config/free-claude-code/.env` from the bundled template.
+### 4. Run Claude Code
+Point `ANTHROPIC_BASE_URL` at the proxy root. Do not append `/v1`.
+PowerShell:
+```powershell
+$env:ANTHROPIC_AUTH_TOKEN="freecc"; $env:ANTHROPIC_BASE_URL="http://localhost:8082"; claude
+```
+Bash:
+```bash
+ANTHROPIC_AUTH_TOKEN="freecc" ANTHROPIC_BASE_URL="http://localhost:8082" claude
+```
+## Choose A Provider
+Model values use this format:
+```text
+provider_id/model/name
+```
+`MODEL` is the fallback. `MODEL_OPUS`, `MODEL_SONNET`, and `MODEL_HAIKU` override routing for requests that Claude Code sends for those tiers.
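As an illustration of the `provider_id/model/name` format, here is a minimal sketch that splits a value at the first slash only, so model names that contain slashes stay intact. `ModelRef` and `parse_model_value` are hypothetical names used for this example; they are not taken from the project's code.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    """Hypothetical container for the two halves of a model value."""

    provider_id: str
    model_name: str


def parse_model_value(value: str) -> ModelRef:
    """Split 'provider_id/model/name' at the first slash only."""
    provider_id, sep, model_name = value.strip().partition("/")
    if not sep or not provider_id or not model_name:
        raise ValueError(f"expected 'provider_id/model/name', got {value!r}")
    return ModelRef(provider_id, model_name)


print(parse_model_value("nvidia_nim/z-ai/glm4.7"))
# ModelRef(provider_id='nvidia_nim', model_name='z-ai/glm4.7')
```

Splitting only once matters because upstream model IDs such as `z-ai/glm4.7` carry their own slash.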
+| Provider | Prefix | Transport | Key | Default base URL |
+| --- | --- | --- | --- | --- |
+| <img src="https://cdn.simpleicons.org/nvidia/76B900" alt="" width="18" height="18"> NVIDIA NIM | `nvidia_nim/...` | OpenAI chat translation | `NVIDIA_NIM_API_KEY` | `https://integrate.api.nvidia.com/v1` |
+| <img src="https://cdn.simpleicons.org/groq/F55036" alt="" width="18" height="18"> Groq | `groq/...` | OpenAI chat translation | `GROQ_API_KEY` | `https://api.groq.com/openai/v1` |
+| <img src="https://cdn.simpleicons.org/cerebras/313131" alt="" width="18" height="18"> Cerebras | `cerebras/...` | OpenAI chat translation | `CEREBRAS_API_KEY` | `https://api.cerebras.ai/v1` |
+<details>
+<summary><img src="https://cdn.simpleicons.org/nvidia/76B900" alt="" width="18" height="18"> <b>NVIDIA NIM</b></summary>
+Get a key at [build.nvidia.com/settings/api-keys](https://build.nvidia.com/settings/api-keys).
+```dotenv
+NVIDIA_NIM_API_KEY="nvapi-your-key"
+MODEL="nvidia_nim/z-ai/glm4.7"
+```
+Popular examples:
+- `nvidia_nim/qwen/qwen3-coder-480b-a35b-instruct`
+- `nvidia_nim/mistralai/mistral-large-3-675b-instruct-2512`
+- `nvidia_nim/z-ai/glm4.7`
+</details>
+<details>
+<summary><img src="https://cdn.simpleicons.org/groq/F55036" alt="" width="18" height="18"> <b>Groq</b></summary>
+Get a key at [console.groq.com/keys](https://console.groq.com/keys).
+```dotenv
+GROQ_API_KEY="gsk_..."
+MODEL="groq/openai/gpt-oss-120b"
+```
+Popular examples:
+- `groq/openai/gpt-oss-120b` (Best overall for Claude Code)
+- `groq/openai/gpt-oss-20b` (Ultra-low latency)
+- `groq/llama-3.3-70b-versatile`
+</details>
+<details>
+<summary><img src="https://cdn.simpleicons.org/cerebras/313131" alt="" width="18" height="18"> <b>Cerebras</b></summary>
+Get a key at [cloud.cerebras.ai](https://cloud.cerebras.ai/).
+```dotenv
+CEREBRAS_API_KEY="csk_..."
+MODEL="cerebras/gpt-oss-120b"
+```
+Popular examples:
+- `cerebras/gpt-oss-120b` (~3000 tok/s - Fastest reasoning)
+- `cerebras/qwen-3-235b`
+- `cerebras/llama3.1-8b`
+</details>
+## Connect Claude Code
+### Claude Code CLI
+```bash
+ANTHROPIC_AUTH_TOKEN="freecc" ANTHROPIC_BASE_URL="http://localhost:8082" claude
+```
+### VS Code Extension
+Open Settings, search for `claudeCode.environmentVariables`, choose **Edit in settings.json**, and add:
+```json
+"claudeCode.environmentVariables": [
+  { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
+  { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
+]
+```
+Reload the extension. If the extension shows a login screen, choose the Anthropic Console path once; the local proxy still handles model traffic after the environment variables are active.
+### JetBrains ACP
+Edit the installed Claude ACP config:
+- Windows: `C:\Users\%USERNAME%\AppData\Roaming\JetBrains\acp-agents\installed.json`
+- Linux/macOS: `~/.jetbrains/acp.json`
+Set the environment for `acp.registry.claude-acp`:
+```json
+"env": {
+  "ANTHROPIC_BASE_URL": "http://localhost:8082",
+  "ANTHROPIC_AUTH_TOKEN": "freecc"
+}
+```
+Restart the IDE after changing the file.
+### Model Picker
+Claude Code 2.1.126 or later reads this proxy's `/v1/models` endpoint when `ANTHROPIC_BASE_URL` points at the proxy. Start Claude Code normally, run `/model`, and choose any discovered provider model.
+<div align="center">
+  <img src="cc-model-picker.png" alt="Claude Code model picker showing gateway models" width="700">
+</div>
+The proxy lists models for configured provider keys and referenced local providers. Picker-safe IDs are routed back to the real provider/model automatically, so no `.env` edit or separate launcher script is needed after startup.
+Each provider model also has a `(no thinking)` picker variant. Use it when a model does not support Claude Code thinking or fails with adaptive-thinking requests. It routes to the same upstream model while asking Claude Code to send a non-thinking request.
+## Optional Integrations
+### Discord And Telegram Bots
+The bot wrapper runs Claude Code sessions remotely, streams progress, supports reply-based conversation branches, and can stop or clear tasks.
+Discord minimum config:
+```dotenv
+MESSAGING_PLATFORM="discord"
+DISCORD_BOT_TOKEN="your-discord-bot-token"
+ALLOWED_DISCORD_CHANNELS="123456789"
+CLAUDE_WORKSPACE="./agent_workspace"
+ALLOWED_DIR="C:/Users/yourname/projects"
+```
+Create the bot in the [Discord Developer Portal](https://discord.com/developers/applications), enable Message Content Intent, and invite it with read/send/history permissions.
+Telegram minimum config:
+```dotenv
+MESSAGING_PLATFORM="telegram"
+TELEGRAM_BOT_TOKEN="123456789:ABC..."
+ALLOWED_TELEGRAM_USER_ID="your-user-id"
+CLAUDE_WORKSPACE="./agent_workspace"
+ALLOWED_DIR="C:/Users/yourname/projects"
+```
+Get a token from [@BotFather](https://t.me/BotFather) and your user ID from [@userinfobot](https://t.me/userinfobot).
+Useful commands:
+- `/stop` cancels a task; reply to a task message to stop only that branch.
+- `/clear` resets sessions; reply to clear one branch.
+- `/stats` shows session state.
+### Voice Notes
+Voice notes work on Discord and Telegram. Choose one backend:
+```bash
+uv sync --extra voice_local                  # local Whisper
+uv sync --extra voice                        # NVIDIA NIM transcription
+uv sync --extra voice --extra voice_local    # both backends
+```
+```dotenv
+VOICE_NOTE_ENABLED=true
+WHISPER_DEVICE="cpu"          # cpu | cuda | nvidia_nim
+WHISPER_MODEL="base"
+HF_TOKEN=""
+```
+Use `WHISPER_DEVICE="nvidia_nim"` with the `voice` extra and `NVIDIA_NIM_API_KEY` for NVIDIA-hosted transcription.
+## Configuration Reference
+[`.env.example`](.env.example) is the canonical list of variables. The sections below are the ones most users change.
+### Model Routing
+```dotenv
+MODEL="nvidia_nim/z-ai/glm4.7"
+MODEL_OPUS=
+MODEL_SONNET=
+MODEL_HAIKU=
+ENABLE_MODEL_THINKING=true
+ENABLE_OPUS_THINKING=
+ENABLE_SONNET_THINKING=
+ENABLE_HAIKU_THINKING=
+```
+Blank per-tier values inherit the fallback. Blank thinking overrides inherit `ENABLE_MODEL_THINKING`.
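The inheritance rules above can be sketched as follows. This is an illustrative sketch, not the proxy's routing code: `resolve_model` and `thinking_enabled` are hypothetical helpers, and detecting the tier by a substring match on the requested model name is an assumption.

```python
def resolve_model(requested: str, env: dict[str, str]) -> str:
    """Return the configured model for a requested Claude model name."""
    name = requested.lower()
    for tier in ("opus", "sonnet", "haiku"):
        # Assumption: the tier is found by substring match on the name.
        if tier in name:
            override = env.get(f"MODEL_{tier.upper()}", "").strip()
            if override:
                return override
            break  # blank per-tier value: inherit the fallback
    return env["MODEL"]


def thinking_enabled(tier: str, env: dict[str, str]) -> bool:
    """Blank per-tier thinking overrides inherit ENABLE_MODEL_THINKING."""
    raw = env.get(f"ENABLE_{tier.upper()}_THINKING", "").strip()
    if not raw:
        raw = env.get("ENABLE_MODEL_THINKING", "").strip()
    return raw.lower() == "true"


env = {
    "MODEL": "nvidia_nim/z-ai/glm4.7",
    "MODEL_OPUS": "",
    "MODEL_HAIKU": "groq/openai/gpt-oss-20b",
    "ENABLE_MODEL_THINKING": "true",
    "ENABLE_HAIKU_THINKING": "false",
}
# "claude-opus-example" and "claude-haiku-example" are made-up request names.
print(resolve_model("claude-opus-example", env))   # nvidia_nim/z-ai/glm4.7
print(resolve_model("claude-haiku-example", env))  # groq/openai/gpt-oss-20b
print(thinking_enabled("opus", env))               # True
print(thinking_enabled("haiku", env))              # False
```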
+### Provider Keys And URLs
+```dotenv
+NVIDIA_NIM_API_KEY=""
+```
+Proxy settings are per provider:
+```dotenv
+NVIDIA_NIM_PROXY=""
+```
+### Rate Limits And Timeouts
+```dotenv
+PROVIDER_RATE_LIMIT=1
+PROVIDER_RATE_WINDOW=3
+PROVIDER_MAX_CONCURRENCY=5
+HTTP_READ_TIMEOUT=120
+HTTP_WRITE_TIMEOUT=10
+HTTP_CONNECT_TIMEOUT=10
+```
+Use lower limits for free hosted providers; local providers can usually tolerate higher concurrency if the machine can handle it.
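To make the two rate-limit settings concrete, here is a minimal sliding-window sketch. It assumes that `PROVIDER_RATE_LIMIT` means "requests allowed per window" and that `PROVIDER_RATE_WINDOW` is measured in seconds; the README does not state the units, and `SlidingWindowLimiter` is a hypothetical class, not the proxy's limiter.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` acquisitions in any `window`-second span."""

    def __init__(self, limit: int, window: float, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the limiter can be tested
        self._stamps = deque()  # timestamps of recent acquisitions

    def try_acquire(self) -> bool:
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False


# Mirrors PROVIDER_RATE_LIMIT=1 and PROVIDER_RATE_WINDOW=3 under the
# assumptions stated above.
limiter = SlidingWindowLimiter(limit=1, window=3)
print(limiter.try_acquire())  # True
print(limiter.try_acquire())  # False (second call inside the same window)
```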
+### Security And Diagnostics
+```dotenv
+ANTHROPIC_AUTH_TOKEN=
+LOG_RAW_API_PAYLOADS=false
+LOG_RAW_SSE_EVENTS=false
+LOG_API_ERROR_TRACEBACKS=false
+LOG_RAW_MESSAGING_CONTENT=false
+LOG_RAW_CLI_DIAGNOSTICS=false
+LOG_MESSAGING_ERROR_DETAILS=false
+```
+Raw logging flags can expose prompts, tool arguments, paths, and model output. Keep them off unless you are debugging locally.
+### Local Web Tools
+```dotenv
+ENABLE_WEB_SERVER_TOOLS=true
+WEB_FETCH_ALLOWED_SCHEMES=http,https
+WEB_FETCH_ALLOW_PRIVATE_NETWORKS=false
+```
+These tools perform outbound HTTP from the proxy. Keep private-network access disabled unless you are in a controlled lab environment.
+## Troubleshooting
+### Major Fixes (May 2026)
+#### 1. Model Visibility And Caching Issues
+The Claude CLI often caches model lists, causing local proxy models to disappear.
+- **Fix:** We implemented a "Multi-Model Advertisement" feature. The `MODEL` environment variable now supports a comma-separated list.
+- **Action:** Set `MODEL="model1,model2,model3"` in your `.env`. The proxy will force the CLI to display all of them by registering them as primary models.
+#### 2. The "Amnesia/Thinking" Loop
+In `auto` mode, the proxy would sometimes switch models in the middle of a "Thinking" block when that block took too long, causing the CLI to repeat the same thought endlessly.
+- **Fix:** Implemented "Sticky Sessions" in `api/services.py`. Once a model yields its first event (including thinking blocks), the proxy commits to that model for the duration of the turn. Fallbacks only occur if the model fails to start entirely.
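The commit-after-first-event idea can be sketched with plain generators. This is an illustration under assumptions, not the code in `api/services.py`: `stream_with_sticky_fallback` and the `(model, start_stream)` candidate shape are hypothetical.

```python
from collections.abc import Callable, Iterable, Iterator


def stream_with_sticky_fallback(
    candidates: Iterable[tuple[str, Callable[[], Iterator[str]]]],
) -> Iterator[tuple[str, str]]:
    """Try models in order; commit to the first one that yields an event."""
    last_error = None
    for model, start_stream in candidates:
        try:
            stream = iter(start_stream())
            first = next(stream)
        except StopIteration:
            # The model produced nothing at all: treat as "failed to start".
            last_error = RuntimeError(f"{model} produced no events")
            continue
        except Exception as exc:
            # Failure before the first event: fall back to the next model.
            last_error = exc
            continue
        # From here on the turn is sticky: later errors propagate to the
        # caller instead of triggering a switch to another model.
        yield model, first
        for event in stream:
            yield model, event
        return
    raise RuntimeError("no model could start the turn") from last_error


def unavailable():
    raise ConnectionError("upstream down")
    yield  # unreachable; makes this function a generator


def healthy():
    yield "thinking"
    yield "text"


print(list(stream_with_sticky_fallback([("a", unavailable), ("b", healthy)])))
# [('b', 'thinking'), ('b', 'text')]
```

Because the fallback decision is made only before the first event, a slow thinking block can no longer cause a mid-turn model switch in this sketch.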
+#### 3. NVIDIA NIM Fallback Sync
+- **Fix:** `AUTO_MODEL_PRIORITY` and `NVIDIA_NIM_FALLBACK_MODELS` are now kept synchronized to provide maximum coverage.
+### Claude Code says `undefined ... input_tokens`, `$.speed`, or malformed response
+Update to the latest commit first. Older versions could emit invalid usage metadata in streaming responses. Then check:
+- `ANTHROPIC_BASE_URL` is `http://localhost:8082`, not `http://localhost:8082/v1`.
+- The proxy is returning Server-Sent Events for `/v1/messages`.
+- `server.log` contains no upstream 400/500 response before the malformed-response error.
+### Provider disconnects during streaming
+Errors like `incomplete chunked read`, `server disconnected`, or a peer closing the body usually come from the upstream provider or gateway. Reduce concurrency, raise timeouts, or retry later.
+### Tool calls work on one model but not another
+Tool support is model and provider dependent. Some OpenAI-compatible models emit malformed tool-call deltas, omit tool names, or return tool calls as plain text. Try another model or provider before assuming the proxy is broken.
+### The VS Code extension still shows a login screen
+Confirm the extension environment variables are set, then reload the extension or restart VS Code. The browser login flow may still appear once; the local proxy is used when `ANTHROPIC_BASE_URL` is active in the extension process.
+## How It Works
+```text
+Claude Code CLI / IDE
+        |
+        | Anthropic Messages API
+        v
+Free Claude Code proxy (:8082)
+        |
+        | provider-specific request/stream adapter
+        v
+NVIDIA NIM
+```
+Important pieces:
+- FastAPI exposes Anthropic-compatible routes such as `/v1/messages`, `/v1/messages/count_tokens`, and `/v1/models`.
+- Model routing resolves the Claude model name to `MODEL_OPUS`, `MODEL_SONNET`, `MODEL_HAIKU`, or `MODEL`.
+- NVIDIA NIM uses OpenAI chat streaming translated into Anthropic SSE.
+- The proxy normalizes thinking blocks, tool calls, token usage metadata, and provider errors into the shape Claude Code expects.
+- Request optimizations answer trivial Claude Code probes locally to save latency and quota.
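As a rough illustration of the OpenAI-to-Anthropic stream translation mentioned above, the sketch below converts a single OpenAI chat-completion text chunk into one Anthropic `content_block_delta` SSE frame. It covers text deltas only. A complete stream also needs `message_start`, `content_block_start`, `content_block_stop`, `message_delta`, and `message_stop` events, plus tool-call and thinking handling. `sse` and `text_delta_to_anthropic_sse` are hypothetical helpers, not the proxy's adapter.

```python
import json


def sse(event: str, payload: dict) -> str:
    """Format one Server-Sent Events frame."""
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


def text_delta_to_anthropic_sse(openai_chunk: dict, index: int = 0):
    """Translate an OpenAI streaming chunk's text delta, or return None."""
    choices = openai_chunk.get("choices") or []
    if not choices:
        return None
    text = (choices[0].get("delta") or {}).get("content")
    if not text:
        return None  # role-only or empty chunks carry no text to forward
    return sse(
        "content_block_delta",
        {
            "type": "content_block_delta",
            "index": index,
            "delta": {"type": "text_delta", "text": text},
        },
    )


chunk = {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}
print(text_delta_to_anthropic_sse(chunk), end="")
```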
+## Development
+### Project Structure
+```text
+free-claude-code/
+├── server.py              # ASGI entry point
+├── api/                   # FastAPI routes, service layer, routing, optimizations
+├── core/                  # Shared Anthropic protocol helpers and SSE utilities
+├── providers/             # Provider transports, registry, rate limiting
+├── messaging/             # Discord/Telegram adapters, sessions, voice
+├── cli/                   # Package entry points and Claude process management
+├── config/                # Settings, provider catalog, logging
+└── tests/                 # Unit and contract tests
+```
+### Commands
+```bash
+uv run ruff format
+uv run ruff check
+uv run ty check
+uv run pytest
+```
+Run them in that order before pushing. CI enforces the same checks.
+### Package Scripts
+`pyproject.toml` installs:
+- `free-claude-code`: starts the proxy with configured host and port.
+- `fcc-init`: creates the user config template at `~/.config/free-claude-code/.env`.
+### Extending
+- Add messaging platforms by implementing the `MessagingPlatform` interface in `messaging/`.
+- Extend NVIDIA NIM provider functionality by modifying `providers/nvidia_nim/`.
+## Contributing
+- Report bugs and feature requests in [Issues](https://github.com/Alishahryar1/free-claude-code/issues).
+- Keep changes small and covered by focused tests.
+- Do not open Docker integration PRs.
+- Do not open PRs for README changes; open an issue instead.
+- Run the full check sequence before opening a pull request.
+- The unparenthesized `except X, Y:` syntax is brought back in the final Python 3.14 release (not in the 3.14 alphas). Keep this in mind before opening PRs.
+## NVIDIA Qwen integration
+You can run a simple NVIDIA Qwen streaming example, `nvidia_integration.py`, which uses an OpenAI-compatible client.
+- Install the dependency:
+```bash
+pip install -r requirements.txt
+```
+- Set your NVIDIA API key (do NOT commit keys). PowerShell, current session only:
+```powershell
+$env:NV_API_KEY = "nvapi-<YOUR_KEY>"
+python nvidia_integration.py "Write a short Python script that prints Hello"
+```
+Persisted (Windows):
+```powershell
+setx NV_API_KEY "nvapi-<YOUR_KEY>"
+# open a new shell to use the persisted variable
+```
+Linux/macOS:
+```bash
+export NV_API_KEY="nvapi-<YOUR_KEY>"
+python nvidia_integration.py "Write a short Python script that prints Hello"
+```
+The example `nvidia_integration.py` streams completions from `https://integrate.api.nvidia.com/v1` using the `qwen/qwen3-coder-480b-a35b-instruct` model. Replace `<YOUR_KEY>` with your actual NVIDIA API key. Never share or commit your API keys.
+## License
+MIT License. See [LICENSE](LICENSE) for details.