Yash030 committed on
Commit
02f434f
·
1 Parent(s): 0157ac7

Add README.md for build

Files changed (1)
  1. README.md +500 -10
README.md CHANGED
@@ -1,10 +1,500 @@
1
- ---
2
- title: Claude Code Proxy
3
- emoji: 🐨
4
- colorFrom: green
5
- colorTo: green
6
- sdk: docker
7
- pinned: false
8
- ---
9
-
10
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
+ <div align="center">
2
+
3
+ # 🤖 Free Claude Code
4
+
5
+ Use Claude Code CLI, VS Code, JetBrains ACP, or chat bots through your own Anthropic-compatible proxy.
6
+
7
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg?style=for-the-badge)](https://opensource.org/licenses/MIT)
8
+ [![Python 3.14](https://img.shields.io/badge/python-3.14-3776ab.svg?style=for-the-badge&logo=python&logoColor=white)](https://www.python.org/downloads/)
9
+ [![uv](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/uv/main/assets/badge/v0.json&style=for-the-badge)](https://github.com/astral-sh/uv)
10
+ [![Tested with Pytest](https://img.shields.io/badge/testing-Pytest-00c0ff.svg?style=for-the-badge)](https://github.com/Alishahryar1/free-claude-code/actions/workflows/tests.yml)
11
+ [![Type checking: Ty](https://img.shields.io/badge/type%20checking-ty-ffcc00.svg?style=for-the-badge)](https://pypi.org/project/ty/)
12
+ [![Code style: Ruff](https://img.shields.io/badge/code%20formatting-ruff-f5a623.svg?style=for-the-badge)](https://github.com/astral-sh/ruff)
13
+ [![Logging: Loguru](https://img.shields.io/badge/logging-loguru-4ecdc4.svg?style=for-the-badge)](https://github.com/Delgan/loguru)
14
+
15
+ Free Claude Code routes Anthropic Messages API traffic from Claude Code to NVIDIA NIM. It keeps Claude Code's client-side protocol stable while letting you use NVIDIA's free models.
16
+
17
+ [Quick Start](#quick-start) · [Providers](#choose-a-provider) · [Clients](#connect-claude-code) · [Troubleshooting](#troubleshooting) · [Development](#development)
18
+
19
+ </div>
20
+
21
+ <div align="center">
22
+ <img src="pic.png" alt="Free Claude Code in action" width="700">
23
+ </div>
24
+
25
+ ## What You Get
26
+
27
+ - Drop-in proxy for Claude Code's Anthropic API calls.
28
+ - NVIDIA NIM provider backend with free models.
29
+ - Per-model routing: send Opus, Sonnet, Haiku, and fallback traffic to different NVIDIA NIM models.
30
+ - Native Claude Code `/model` picker support through the proxy's `/v1/models` endpoint.
31
+ - Streaming, tool use, reasoning/thinking block handling, and local request optimizations.
32
+ - Optional Discord or Telegram bot wrapper for remote coding sessions.
33
+ - Optional voice-note transcription through local Whisper or NVIDIA NIM.
34
+
35
+ ## Quick Start
36
+
37
+ ### 1. Install Requirements
38
+
39
+ Install [Claude Code](https://github.com/anthropics/claude-code), then install `uv` and Python 3.14.
40
+
41
+ macOS/Linux:
42
+
43
+ ```bash
44
+ curl -LsSf https://astral.sh/uv/install.sh | sh
45
+ uv self update
46
+ uv python install 3.14
47
+ ```
48
+
49
+ Windows PowerShell:
50
+
51
+ ```powershell
52
+ powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
53
+ uv self update
54
+ uv python install 3.14
55
+ ```
56
+
57
+ ### 2. Clone And Configure
58
+
59
+ ```bash
60
+ git clone https://github.com/Alishahryar1/free-claude-code.git
61
+ cd free-claude-code
62
+ cp .env.example .env
63
+ ```
64
+
65
+ PowerShell uses:
66
+
67
+ ```powershell
68
+ Copy-Item .env.example .env
69
+ ```
70
+
71
+ Edit `.env` and choose one provider. For the default NVIDIA NIM path:
72
+
73
+ ```dotenv
74
+ NVIDIA_NIM_API_KEY="nvapi-your-key"
75
+ MODEL="nvidia_nim/z-ai/glm4.7"
76
+ ANTHROPIC_AUTH_TOKEN="freecc"
77
+ ```
78
+
79
+ Use any local secret for `ANTHROPIC_AUTH_TOKEN`; Claude Code will send the same value back to this proxy. Leave it empty only for local/private testing.
80
+
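As a rough illustration of how a proxy could check the shared secret described above, here is a minimal sketch. It assumes the token arrives in an `Authorization: Bearer ...` header; `is_authorized` is a hypothetical helper written for this example, not the project's actual code.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True when a request may proceed (hypothetical helper)."""
    if not expected_token:
        # Empty ANTHROPIC_AUTH_TOKEN: auth is disabled, for local/private testing only.
        return True
    if not authorization_header:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        # Assumption: only the Bearer scheme is accepted in this sketch.
        return False
    # Constant-time comparison avoids leaking the secret through timing.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())
```

With `ANTHROPIC_AUTH_TOKEN="freecc"`, a request carrying `Authorization: Bearer freecc` passes and any other value is rejected.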
81
+ ### 3. Start The Proxy
82
+
83
+ ```bash
84
+ uv run uvicorn server:app --host 0.0.0.0 --port 8082
85
+ ```
86
+
87
+ Package install alternative:
88
+
89
+ ```bash
90
+ uv tool install git+https://github.com/Alishahryar1/free-claude-code.git
91
+ fcc-init
92
+ free-claude-code
93
+ ```
94
+
95
+ `fcc-init` creates `~/.config/free-claude-code/.env` from the bundled template.
96
+
97
+ ### 4. Run Claude Code
98
+
99
+ Point `ANTHROPIC_BASE_URL` at the proxy root. Do not append `/v1`.
100
+
101
+ PowerShell:
102
+
103
+ ```powershell
104
+ $env:ANTHROPIC_AUTH_TOKEN="freecc"; $env:ANTHROPIC_BASE_URL="http://localhost:8082"; claude
105
+ ```
106
+
107
+ Bash:
108
+
109
+ ```bash
110
+ ANTHROPIC_AUTH_TOKEN="freecc" ANTHROPIC_BASE_URL="http://localhost:8082" claude
111
+ ```
112
+
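To sanity-check the base URL before launching Claude Code, a small script along these lines may help. It is a sketch only: it assumes the proxy accepts the token as a Bearer credential, and `build_models_request` and `check_proxy` are hypothetical names. It builds a request for the `/v1/models` route and rejects base URLs that already end in `/v1`.

```python
import urllib.request


def build_models_request(base_url: str, auth_token: str) -> urllib.request.Request:
    """Build a GET request for the proxy's /v1/models endpoint."""
    root = base_url.rstrip("/")
    if root.endswith("/v1"):
        # ANTHROPIC_BASE_URL must be the proxy root; the client appends /v1 itself.
        raise ValueError("base URL must not end with /v1")
    request = urllib.request.Request(f"{root}/v1/models", method="GET")
    # Assumption: the proxy reads the token from a Bearer Authorization header.
    request.add_header("Authorization", f"Bearer {auth_token}")
    return request


def check_proxy(base_url: str, auth_token: str) -> int:
    """Send the request and return the HTTP status code (needs a running proxy)."""
    with urllib.request.urlopen(build_models_request(base_url, auth_token), timeout=10) as response:
        return response.status
```

Calling `check_proxy("http://localhost:8082", "freecc")` while the proxy is running should return a success status if the URL and token match your `.env`.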
113
+ ## Choose A Provider
114
+
115
+ Model values use this format:
116
+
117
+ ```text
118
+ provider_id/model/name
119
+ ```
120
+
121
+ `MODEL` is the fallback. `MODEL_OPUS`, `MODEL_SONNET`, and `MODEL_HAIKU` override routing for requests that Claude Code sends for those tiers.
122
+
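Because the model part can itself contain slashes, the value has to be split on the first `/` only. The sketch below shows one way to do that; `ModelRef` and `parse_model_ref` are hypothetical names used for illustration and may not match the project's own parser.

```python
from typing import NamedTuple


class ModelRef(NamedTuple):
    provider_id: str
    model_name: str


def parse_model_ref(value: str) -> ModelRef:
    """Split 'provider_id/model/name' into provider and model on the first slash."""
    provider_id, separator, model_name = value.strip().partition("/")
    if not separator or not provider_id or not model_name:
        raise ValueError(f"expected 'provider_id/model/name', got {value!r}")
    return ModelRef(provider_id, model_name)
```

For example, `nvidia_nim/z-ai/glm4.7` splits into provider `nvidia_nim` and model `z-ai/glm4.7`.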
123
+ | Provider | Prefix | Transport | Key | Default base URL |
124
+ | --- | --- | --- | --- | --- |
125
+ | <img src="https://cdn.simpleicons.org/nvidia/76B900" alt="" width="18" height="18"> NVIDIA NIM | `nvidia_nim/...` | OpenAI chat translation | `NVIDIA_NIM_API_KEY` | `https://integrate.api.nvidia.com/v1` |
126
+ | <img src="https://cdn.simpleicons.org/groq/F55036" alt="" width="18" height="18"> Groq | `groq/...` | OpenAI chat translation | `GROQ_API_KEY` | `https://api.groq.com/openai/v1` |
127
+ | <img src="https://cdn.simpleicons.org/cerebras/313131" alt="" width="18" height="18"> Cerebras | `cerebras/...` | OpenAI chat translation | `CEREBRAS_API_KEY` | `https://api.cerebras.ai/v1` |
128
+
129
+ <details>
130
+ <summary><img src="https://cdn.simpleicons.org/nvidia/76B900" alt="" width="18" height="18"> <b>NVIDIA NIM</b></summary>
131
+
132
+ Get a key at [build.nvidia.com/settings/api-keys](https://build.nvidia.com/settings/api-keys).
133
+
134
+ ```dotenv
135
+ NVIDIA_NIM_API_KEY="nvapi-your-key"
136
+ MODEL="nvidia_nim/z-ai/glm4.7"
137
+ ```
138
+
139
+ Popular examples:
140
+
141
+ - `nvidia_nim/qwen/qwen3-coder-480b-a35b-instruct`
142
+ - `nvidia_nim/mistralai/mistral-large-3-675b-instruct-2512`
143
+ - `nvidia_nim/z-ai/glm4.7`
144
+
145
+ </details>
146
+
147
+ <details>
148
+ <summary><img src="https://cdn.simpleicons.org/groq/F55036" alt="" width="18" height="18"> <b>Groq</b></summary>
149
+
150
+ Get a key at [console.groq.com/keys](https://console.groq.com/keys).
151
+
152
+ ```dotenv
153
+ GROQ_API_KEY="gsk_..."
154
+ MODEL="groq/openai/gpt-oss-120b"
155
+ ```
156
+
157
+ Popular examples:
158
+
159
+ - `groq/openai/gpt-oss-120b` (Best overall for Claude Code)
160
+ - `groq/openai/gpt-oss-20b` (Ultra-low latency)
161
+ - `groq/llama-3.3-70b-versatile`
162
+
163
+ </details>
164
+
165
+ <details>
166
+ <summary><img src="https://cdn.simpleicons.org/cerebras/313131" alt="" width="18" height="18"> <b>Cerebras</b></summary>
167
+
168
+ Get a key at [cloud.cerebras.ai](https://cloud.cerebras.ai/).
169
+
170
+ ```dotenv
171
+ CEREBRAS_API_KEY="csk_..."
172
+ MODEL="cerebras/gpt-oss-120b"
173
+ ```
174
+
175
+ Popular examples:
176
+
177
+ - `cerebras/gpt-oss-120b` (~3000 tok/s - Fastest reasoning)
178
+ - `cerebras/qwen-3-235b`
179
+ - `cerebras/llama3.1-8b`
180
+
181
+ </details>
182
+
183
+ ## Connect Claude Code
184
+
185
+ ### Claude Code CLI
186
+
187
+ ```bash
188
+ ANTHROPIC_AUTH_TOKEN="freecc" ANTHROPIC_BASE_URL="http://localhost:8082" claude
189
+ ```
190
+
191
+ ### VS Code Extension
192
+
193
+ Open Settings, search for `claude-code.environmentVariables`, choose **Edit in settings.json**, and add:
194
+
195
+ ```json
196
+ "claudeCode.environmentVariables": [
197
+ { "name": "ANTHROPIC_BASE_URL", "value": "http://localhost:8082" },
198
+ { "name": "ANTHROPIC_AUTH_TOKEN", "value": "freecc" }
199
+ ]
200
+ ```
201
+
202
+ Reload the extension. If the extension shows a login screen, choose the Anthropic Console path once; the local proxy still handles model traffic after the environment variables are active.
203
+
204
+ ### JetBrains ACP
205
+
206
+ Edit the installed Claude ACP config:
207
+
208
+ - Windows: `C:\Users\%USERNAME%\AppData\Roaming\JetBrains\acp-agents\installed.json`
209
+ - Linux/macOS: `~/.jetbrains/acp.json`
210
+
211
+ Set the environment for `acp.registry.claude-acp`:
212
+
213
+ ```json
214
+ "env": {
215
+ "ANTHROPIC_BASE_URL": "http://localhost:8082",
216
+ "ANTHROPIC_AUTH_TOKEN": "freecc"
217
+ }
218
+ ```
219
+
220
+ Restart the IDE after changing the file.
221
+
222
+ ### Model Picker
223
+
224
+ Claude Code 2.1.126 or later reads this proxy's `/v1/models` endpoint when `ANTHROPIC_BASE_URL` points at the proxy. Start Claude Code normally, run `/model`, and choose any discovered provider model.
225
+
226
+ <div align="center">
227
+ <img src="cc-model-picker.png" alt="Claude Code model picker showing gateway models" width="700">
228
+ </div>
229
+
230
+ The proxy lists models for configured provider keys and referenced local providers. Picker-safe IDs are routed back to the real provider/model automatically, so no `.env` edit or separate launcher script is needed after startup.
231
+
232
+ Each provider model also has a `(no thinking)` picker variant. Use it when a model does not support Claude Code thinking or fails with adaptive-thinking requests. It routes to the same upstream model while asking Claude Code to send a non-thinking request.
233
+
234
+ ## Optional Integrations
235
+
236
+ ### Discord And Telegram Bots
237
+
238
+ The bot wrapper runs Claude Code sessions remotely, streams progress, supports reply-based conversation branches, and can stop or clear tasks.
239
+
240
+ Discord minimum config:
241
+
242
+ ```dotenv
243
+ MESSAGING_PLATFORM="discord"
244
+ DISCORD_BOT_TOKEN="your-discord-bot-token"
245
+ ALLOWED_DISCORD_CHANNELS="123456789"
246
+ CLAUDE_WORKSPACE="./agent_workspace"
247
+ ALLOWED_DIR="C:/Users/yourname/projects"
248
+ ```
249
+
250
+ Create the bot in the [Discord Developer Portal](https://discord.com/developers/applications), enable Message Content Intent, and invite it with read/send/history permissions.
251
+
252
+ Telegram minimum config:
253
+
254
+ ```dotenv
255
+ MESSAGING_PLATFORM="telegram"
256
+ TELEGRAM_BOT_TOKEN="123456789:ABC..."
257
+ ALLOWED_TELEGRAM_USER_ID="your-user-id"
258
+ CLAUDE_WORKSPACE="./agent_workspace"
259
+ ALLOWED_DIR="C:/Users/yourname/projects"
260
+ ```
261
+
262
+ Get a token from [@BotFather](https://t.me/BotFather) and your user ID from [@userinfobot](https://t.me/userinfobot).
263
+
264
+ Useful commands:
265
+
266
+ - `/stop` cancels a task; reply to a task message to stop only that branch.
267
+ - `/clear` resets sessions; reply to clear one branch.
268
+ - `/stats` shows session state.
269
+
270
+ ### Voice Notes
271
+
272
+ Voice notes work on Discord and Telegram. Choose one backend:
273
+
274
+ ```bash
275
+ uv sync --extra voice_local                # local Whisper (cpu or cuda)
276
+ uv sync --extra voice                      # NVIDIA NIM hosted transcription
277
+ uv sync --extra voice --extra voice_local  # both backends
278
+ ```
279
+
280
+ ```dotenv
281
+ VOICE_NOTE_ENABLED=true
282
+ WHISPER_DEVICE="cpu" # cpu | cuda | nvidia_nim
283
+ WHISPER_MODEL="base"
284
+ HF_TOKEN=""
285
+ ```
286
+
287
+ Use `WHISPER_DEVICE="nvidia_nim"` with the `voice` extra and `NVIDIA_NIM_API_KEY` for NVIDIA-hosted transcription.
288
+
289
+ ## Configuration Reference
290
+
291
+ [`.env.example`](.env.example) is the canonical list of variables. The sections below are the ones most users change.
292
+
293
+ ### Model Routing
294
+
295
+ ```dotenv
296
+ MODEL="nvidia_nim/z-ai/glm4.7"
297
+ MODEL_OPUS=
298
+ MODEL_SONNET=
299
+ MODEL_HAIKU=
300
+ ENABLE_MODEL_THINKING=true
301
+ ENABLE_OPUS_THINKING=
302
+ ENABLE_SONNET_THINKING=
303
+ ENABLE_HAIKU_THINKING=
304
+ ```
305
+
306
+ Blank per-tier values inherit the fallback. Blank thinking overrides inherit `ENABLE_MODEL_THINKING`.
307
+
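The inheritance rule can be pictured with a short sketch. It assumes the tier is detected by looking for `opus`, `sonnet`, or `haiku` in the model name Claude Code sends; that matching rule and the `resolve_model` helper are assumptions for illustration, not the proxy's confirmed logic.

```python
from typing import Mapping


def resolve_model(claude_model: str, env: Mapping[str, str]) -> str:
    """Pick the upstream model for a Claude model name, falling back to MODEL."""
    name = claude_model.lower()
    for tier in ("opus", "sonnet", "haiku"):
        if tier in name:
            override = env.get(f"MODEL_{tier.upper()}", "").strip()
            if override:
                return override
            break  # blank per-tier value: inherit the fallback
    return env["MODEL"]
```

With `MODEL_OPUS` set and `MODEL_SONNET` blank, Opus requests go to the override while Sonnet and Haiku requests use `MODEL`.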
308
+ ### Provider Keys And URLs
309
+
310
+ ```dotenv
311
+ NVIDIA_NIM_API_KEY=""
312
+ ```
313
+
314
+ Proxy settings are per provider:
315
+
316
+ ```dotenv
317
+ NVIDIA_NIM_PROXY=""
318
+ ```
319
+
320
+ ### Rate Limits And Timeouts
321
+
322
+ ```dotenv
323
+ PROVIDER_RATE_LIMIT=1
324
+ PROVIDER_RATE_WINDOW=3
325
+ PROVIDER_MAX_CONCURRENCY=5
326
+ HTTP_READ_TIMEOUT=120
327
+ HTTP_WRITE_TIMEOUT=10
328
+ HTTP_CONNECT_TIMEOUT=10
329
+ ```
330
+
331
+ Use lower limits for free hosted providers; local providers can usually tolerate higher concurrency if the machine can handle it.
332
+
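One plausible reading of these settings is "at most `PROVIDER_RATE_LIMIT` requests per `PROVIDER_RATE_WINDOW` seconds". The sketch below implements that interpretation as a sliding-window limiter. The semantics and the `SlidingWindowLimiter` class are assumptions for illustration; the proxy's own limiter may behave differently.

```python
import time
from collections import deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` acquisitions within any `window`-second span."""

    def __init__(self, limit: int, window: float, clock: Callable[[], float] = time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self._stamps = deque()

    def try_acquire(self) -> bool:
        now = self.clock()
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) < self.limit:
            self._stamps.append(now)
            return True
        return False
```

With the defaults above (`limit=1`, `window=3`), a second request inside the same three seconds is refused and would have to wait or be retried.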
333
+ ### Security And Diagnostics
334
+
335
+ ```dotenv
336
+ ANTHROPIC_AUTH_TOKEN=
337
+ LOG_RAW_API_PAYLOADS=false
338
+ LOG_RAW_SSE_EVENTS=false
339
+ LOG_API_ERROR_TRACEBACKS=false
340
+ LOG_RAW_MESSAGING_CONTENT=false
341
+ LOG_RAW_CLI_DIAGNOSTICS=false
342
+ LOG_MESSAGING_ERROR_DETAILS=false
343
+ ```
344
+
345
+ Raw logging flags can expose prompts, tool arguments, paths, and model output. Keep them off unless you are debugging locally.
346
+
347
+ ### Local Web Tools
348
+
349
+ ```dotenv
350
+ ENABLE_WEB_SERVER_TOOLS=true
351
+ WEB_FETCH_ALLOWED_SCHEMES=http,https
352
+ WEB_FETCH_ALLOW_PRIVATE_NETWORKS=false
353
+ ```
354
+
355
+ These tools perform outbound HTTP from the proxy. Keep private-network access disabled unless you are in a controlled lab environment.
356
+
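To make the risk concrete, here is a minimal sketch of the kind of check these settings imply. It only inspects literal IP addresses and the name `localhost`; a real guard would also resolve hostnames and test every resolved address. `is_fetch_allowed` is a hypothetical helper, not the proxy's implementation.

```python
import ipaddress
from typing import Iterable
from urllib.parse import urlsplit


def is_fetch_allowed(
    url: str,
    allowed_schemes: Iterable[str] = ("http", "https"),
    allow_private_networks: bool = False,
) -> bool:
    """Decide whether an outbound fetch should be attempted (simplified)."""
    parts = urlsplit(url)
    if parts.scheme not in set(allowed_schemes):
        return False
    host = parts.hostname
    if not host:
        return False
    if allow_private_networks:
        return True
    if host.lower() == "localhost":
        return False
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal. This sketch lets hostnames through unresolved,
        # which a production check should NOT do.
        return True
    return not (address.is_private or address.is_loopback or address.is_link_local)
```

Under the defaults, `http://10.0.0.5/admin` and `http://169.254.169.254/` are refused, while a public address such as `https://8.8.8.8/` is allowed.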
357
+ ## Troubleshooting
358
+
359
+ ### Major Fixes (May 2026)
360
+
361
+ #### 1. Model Visibility And Caching Issues
362
+ The Claude CLI often caches its model list, which can make models served by the local proxy disappear from the picker.
363
+ - **Fix:** The `MODEL` environment variable now accepts a comma-separated list ("Multi-Model Advertisement").
364
+ - **Action:** Set `MODEL="model1,model2,model3"` in your `.env`. The proxy registers each entry as a primary model, which forces the CLI to display all of them.
365
+
366
+ #### 2. The "Amnesia/Thinking" Loop
367
+ In `auto` mode, the proxy could switch models in the middle of a thinking block that took too long, which made the CLI repeat the same thought endlessly.
368
+ - **Fix:** `api/services.py` now uses "Sticky Sessions". Once a model yields its first event (including thinking blocks), the proxy commits to that model for the rest of the turn. Fallback happens only if a model fails to start at all.
369
+
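The sticky behavior can be sketched as a generator that tries models in order and commits to one as soon as it produces a first event. This is an illustrative outline under stated assumptions, not the code in `api/services.py`: `stream_with_sticky_fallback` and the `open_stream` callback are hypothetical names.

```python
from typing import Callable, Iterable, Iterator, List, Optional


def stream_with_sticky_fallback(
    models: List[str],
    open_stream: Callable[[str], Iterable[str]],
) -> Iterator[str]:
    """Yield events from the first model that starts; never switch afterwards."""
    last_error: Optional[BaseException] = None
    for model in models:
        try:
            events = iter(open_stream(model))
            first = next(events)  # StopIteration (empty stream) counts as a failed start
        except Exception as exc:
            last_error = exc
            continue  # the model never started, so falling back is safe
        yield first
        # Committed: later errors propagate instead of triggering another model.
        yield from events
        return
    raise RuntimeError("no model produced a first event") from last_error
```

The key design point is that fallback is only possible before the first event is yielded; once the client has seen output from one model, a mid-stream failure surfaces as an error rather than a silent model switch.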
370
+ #### 3. NVIDIA NIM Fallback Sync
371
+ `AUTO_MODEL_PRIORITY` and `NVIDIA_NIM_FALLBACK_MODELS` are now kept synchronized to maximize fallback coverage.
372
+
373
+ ### Claude Code says `undefined ... input_tokens`, `$.speed`, or malformed response
374
+
375
+
376
+ Update to the latest commit first. Older versions could emit invalid usage metadata in streaming responses. Then check:
377
+
378
+ - `ANTHROPIC_BASE_URL` is `http://localhost:8082`, not `http://localhost:8082/v1`.
379
+ - The proxy is returning Server-Sent Events for `/v1/messages`.
380
+ - `server.log` contains no upstream 400/500 response before the malformed-response error.
381
+
382
+
383
+ ### Provider disconnects during streaming
384
+
385
+ Errors like `incomplete chunked read`, `server disconnected`, or a peer closing the body usually come from the upstream provider or gateway. Reduce concurrency, raise timeouts, or retry later.
386
+
387
+ ### Tool calls work on one model but not another
388
+
389
+ Tool support is model and provider dependent. Some OpenAI-compatible models emit malformed tool-call deltas, omit tool names, or return tool calls as plain text. Try another model or provider before assuming the proxy is broken.
390
+
391
+ ### The VS Code extension still shows a login screen
392
+
393
+ Confirm the extension environment variables are set, then reload the extension or restart VS Code. The browser login flow may still appear once; the local proxy is used when `ANTHROPIC_BASE_URL` is active in the extension process.
394
+
395
+ ## How It Works
396
+
397
+ ```text
398
+ Claude Code CLI / IDE
399
+ |
400
+ | Anthropic Messages API
401
+ v
402
+ Free Claude Code proxy (:8082)
403
+ |
404
+ | provider-specific request/stream adapter
405
+ v
406
+ NVIDIA NIM
407
+ ```
408
+
409
+ Important pieces:
410
+
411
+ - FastAPI exposes Anthropic-compatible routes such as `/v1/messages`, `/v1/messages/count_tokens`, and `/v1/models`.
412
+ - Model routing resolves the Claude model name to `MODEL_OPUS`, `MODEL_SONNET`, `MODEL_HAIKU`, or `MODEL`.
413
+ - NVIDIA NIM uses OpenAI chat streaming translated into Anthropic SSE.
414
+ - The proxy normalizes thinking blocks, tool calls, token usage metadata, and provider errors into the shape Claude Code expects.
415
+ - Request optimizations answer trivial Claude Code probes locally to save latency and quota.
416
+
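The translation step in the list above can be illustrated for the simplest case, a plain text delta. The sketch maps one OpenAI-style streaming chunk to one Anthropic-style `content_block_delta` SSE frame. It ignores tool calls, thinking blocks, and the surrounding start/stop events, and `openai_chunk_to_anthropic_sse` is a hypothetical name rather than a function from this repository.

```python
import json
from typing import Optional


def openai_chunk_to_anthropic_sse(chunk: dict, index: int = 0) -> Optional[str]:
    """Convert one OpenAI chat chunk with text content into an Anthropic SSE frame."""
    choices = chunk.get("choices") or []
    if not choices:
        return None
    text = (choices[0].get("delta") or {}).get("content")
    if not text:
        return None  # nothing to forward, e.g. a role-only or empty delta
    payload = {
        "type": "content_block_delta",
        "index": index,
        "delta": {"type": "text_delta", "text": text},
    }
    # An SSE frame is an event name, a data line, and a blank line.
    return f"event: content_block_delta\ndata: {json.dumps(payload)}\n\n"
```

A full adapter would also emit `message_start`, `content_block_start`, `content_block_stop`, `message_delta`, and `message_stop` around these frames so that the stream has the shape Claude Code expects.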
417
+ ## Development
418
+
419
+ ### Project Structure
420
+
421
+ ```text
422
+ free-claude-code/
423
+ ├── server.py     # ASGI entry point
424
+ ├── api/          # FastAPI routes, service layer, routing, optimizations
425
+ ├── core/         # Shared Anthropic protocol helpers and SSE utilities
426
+ ├── providers/    # Provider transports, registry, rate limiting
427
+ ├── messaging/    # Discord/Telegram adapters, sessions, voice
428
+ ├── cli/          # Package entry points and Claude process management
429
+ ├── config/       # Settings, provider catalog, logging
430
+ └── tests/        # Unit and contract tests
431
+ ```
432
+
433
+ ### Commands
434
+
435
+ ```bash
436
+ uv run ruff format
437
+ uv run ruff check
438
+ uv run ty check
439
+ uv run pytest
440
+ ```
441
+
442
+ Run them in that order before pushing. CI enforces the same checks.
443
+
444
+ ### Package Scripts
445
+
446
+ `pyproject.toml` installs:
447
+
448
+ - `free-claude-code`: starts the proxy with configured host and port.
449
+ - `fcc-init`: creates the user config template at `~/.config/free-claude-code/.env`.
450
+
451
+ ### Extending
452
+
453
+ - Add messaging platforms by implementing the `MessagingPlatform` interface in `messaging/`.
454
+ - Extend NVIDIA NIM provider functionality by modifying `providers/nvidia_nim/`.
455
+
456
+ ## Contributing
457
+
458
+ - Report bugs and feature requests in [Issues](https://github.com/Alishahryar1/free-claude-code/issues).
459
+ - Keep changes small and covered by focused tests.
460
+ - Do not open Docker integration PRs.
461
+ - Do not open PRs for README changes; open an issue instead.
462
+ - Run the full check sequence before opening a pull request.
463
+ - The unparenthesized `except X, Y` syntax is allowed again in the final Python 3.14 release (but not in the 3.14 alphas). Keep this in mind before opening PRs.
464
+
465
+ ## NVIDIA Qwen integration
466
+
467
+ You can run a simple NVIDIA Qwen streaming example with the OpenAI-compatible client script described below.
468
+
469
+ - Install the dependency:
470
+
471
+ ```bash
472
+ pip install -r requirements.txt
473
+ ```
474
+
475
+ - Set your NVIDIA API key (do NOT commit keys). PowerShell, current session only:
476
+
477
+ ```powershell
478
+ $env:NV_API_KEY = "nvapi-<YOUR_KEY>"
479
+ python nvidia_integration.py "Write a short Python script that prints Hello"
480
+ ```
481
+
482
+ Persisted (Windows):
483
+
484
+ ```powershell
485
+ setx NV_API_KEY "nvapi-<YOUR_KEY>"
486
+ # open a new shell to use the persisted variable
487
+ ```
488
+
489
+ Linux/macOS:
490
+
491
+ ```bash
492
+ export NV_API_KEY="nvapi-<YOUR_KEY>"
493
+ python nvidia_integration.py "Write a short Python script that prints Hello"
494
+ ```
495
+
496
+ The example `nvidia_integration.py` streams completions from `https://integrate.api.nvidia.com/v1` using the `qwen/qwen3-coder-480b-a35b-instruct` model. Replace `<YOUR_KEY>` with your actual NVIDIA API key. Never share or commit your API keys.
497
+
498
+ ## License
499
+
500
+ MIT License. See [LICENSE](LICENSE) for details.