raheem786 committed
Commit 25e7393 · verified · 1 Parent(s): d4b1d2b

Upload folder using huggingface_hub

Files changed (3):
  1. README.md +14 -32
  2. docker-compose.yaml +2 -1
  3. litellm-config-auto.yaml +24 -32
README.md CHANGED
@@ -10,18 +10,15 @@ pinned: false
 
 # LiteLLM Proxy (Render & Hugging Face Space)
 
- LiteLLM proxy for OpenRouter, Hugging Face, and other providers, deployable to Render, Hugging Face Spaces (Docker), and runnable locally with Docker.
+ LiteLLM proxy for OpenRouter, Google (Gemini), Hugging Face, and other providers, deployable to Render, Hugging Face Spaces (Docker), and runnable locally with Docker.
 
 ## Fix: "Authentication Error, No api key passed in" (401)
 
- This happens when the proxy is configured with a **master key** but either:
-
- 1. **Render:** Required environment variables are missing or wrong, or
- 2. **Client:** The request does not send the API key in the header.
+ This happens when the proxy is configured with a **master key** but either required environment variables are missing or wrong on the server, or the client does not send the API key in the request header.
 
 ### 1. Set environment variables on Render
 
- The config reads **from environment variables** (`os.environ/LITELLM_MASTER_KEY`, `os.environ/OPENROUTER_API_KEY`). No secret file needed.
+ The config reads from environment variables. No secret file is needed.
 
 1. In [Render Dashboard](https://dashboard.render.com) → your Web Service → **Environment**.
 2. Under **Environment Variables**, click **+ Add Environment Variable**.
@@ -31,6 +28,7 @@ The config reads **from environment variables** (`os.environ/LITELLM_MASTER_KEY`
 |-----|-------|--------|
 | `LITELLM_MASTER_KEY` | e.g. `sk-your-secret-key` | Mark as **Secret**. This is the key clients send. |
 | `OPENROUTER_API_KEY` | Your OpenRouter API key | Mark as **Secret**. |
+ | `GOOGLE_API_KEY` | Your Google AI API key | Mark as **Secret**. Required for Gemini models (`my-free-coders-new` with Gemini). |
 | `HF_TOKEN` | Your Hugging Face token (`hf_...`) | Optional. Mark as **Secret**. Required only for Hugging Face models (`my-hf-models`). Create at [huggingface.co/settings/tokens](https://huggingface.co/settings/tokens). |
 | `PORT` | (optional) | Render sets this automatically; no need to add. |
@@ -38,44 +36,28 @@ The config reads **from environment variables** (`os.environ/LITELLM_MASTER_KEY`
 
 ### 2. Send the API key from your client
 
- Every request to the proxy must include:
-
- ```http
- Authorization: Bearer <your-LITELLM_MASTER_KEY>
- ```
+ Every request to the proxy must include the header **Authorization** with value **Bearer** followed by your `LITELLM_MASTER_KEY` value.
 
- - **Cursor:** Settings → Models Add your Render URL as base URL and set **API Key** to the same value as `LITELLM_MASTER_KEY`.
- - **curl:**
-   ```bash
-   curl -X POST https://your-app.onrender.com/v1/chat/completions \
-     -H "Authorization: Bearer sk-your-master-key" \
-     -H "Content-Type: application/json" \
-     -d '{"model": "my-free-models", "messages": [{"role": "user", "content": "Hi"}]}'
-   ```
- - **OpenAI SDK / other clients:** Set the API key to your `LITELLM_MASTER_KEY` when using the proxy URL.
+ - **Cursor:** In Settings → Models, set your Render or Space URL as the base URL and set the API Key to the same value as `LITELLM_MASTER_KEY`.
+ - **OpenAI-compatible clients:** Configure the base URL to your proxy URL and set the API key to your `LITELLM_MASTER_KEY`.
 
- If the env vars are set on Render and you send the master key in `Authorization: Bearer ...`, the 401 should go away.
+ If the env vars are set on the server and you send the master key in the Authorization header, the 401 should go away.
 
 ## Local run (Docker Compose)
 
- ```bash
- cp .env.example .env  # if you have one, or create .env with OPENROUTER_API_KEY and LITELLM_MASTER_KEY
- docker compose up --build
- ```
-
- Port is controlled by `PORT` in `.env` (default 4000).
+ Create a `.env` file in the project root with `OPENROUTER_API_KEY`, `LITELLM_MASTER_KEY`, and optionally `GOOGLE_API_KEY` and `HF_TOKEN`. Then run the app with Docker Compose (build and start the service). Port is controlled by `PORT` in `.env`; default is 4000.
 
 ## Deploy to Render
 
 1. Connect this repo to Render and create a **Web Service**.
- 2. Render will use the **Dockerfile**; no build/start command needed.
- 3. Add **Environment Variables** (see above): `LITELLM_MASTER_KEY` and `OPENROUTER_API_KEY` (mark as Secret).
- 4. Deploy. Use the service URL and the same master key in `Authorization: Bearer ...` from your app or Cursor.
+ 2. Render will use the **Dockerfile**; no build or start command is required.
+ 3. Add **Environment Variables** (see above): at least `LITELLM_MASTER_KEY` and `OPENROUTER_API_KEY` (mark as Secret). Add `GOOGLE_API_KEY` if you use Gemini models.
+ 4. Deploy. Use the service URL with the same master key in the Authorization header from your app or Cursor.
 
 ## Deploy to Hugging Face Spaces (Docker)
 
 The README includes Spaces frontmatter (`sdk: docker`, `app_port: 7860`). To run this repo as a Space:
 
 1. Create a new Space at [huggingface.co/new-space](https://huggingface.co/new-space), choose **Docker**, and use this repo (or push this repo to the Space).
- 2. In the Space **Settings** → **Variables and secrets**, add **Secrets**: `LITELLM_MASTER_KEY`, `OPENROUTER_API_KEY` (and optionally `HF_TOKEN` for Hugging Face models). Add a **Variable** `PORT` = `7860` so the proxy listens on the Space’s expected port.
- 3. The Space will build from the **Dockerfile** and run the proxy. Use the Space URL (e.g. `https://your-username-litellm-proxy.hf.space`) with `Authorization: Bearer <LITELLM_MASTER_KEY>`.
+ 2. In the Space **Settings** → **Variables and secrets**, add **Secrets**: `LITELLM_MASTER_KEY`, `OPENROUTER_API_KEY`, and optionally `GOOGLE_API_KEY` and `HF_TOKEN`. If the Space does not set `PORT` for you, add a **Variable** `PORT` = `7860` so the proxy listens on the expected port.
+ 3. The Space will build from the **Dockerfile** and run the proxy. Use the Space URL with **Authorization: Bearer** and your `LITELLM_MASTER_KEY`.
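The rewritten README describes the required request only in prose. As a concrete sketch (the URL, key, and model name are placeholders taken from the curl example this commit removes), the authenticated call can be assembled with the Python standard library:

```python
import json
import urllib.request

# Placeholder values: substitute your deployment URL and LITELLM_MASTER_KEY.
PROXY_URL = "https://your-app.onrender.com/v1/chat/completions"
MASTER_KEY = "sk-your-master-key"

payload = {
    "model": "my-free-models",
    "messages": [{"role": "user", "content": "Hi"}],
}
req = urllib.request.Request(
    PROXY_URL,
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {MASTER_KEY}",  # the master key the proxy checks
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; without the Authorization
# header above, the proxy answers 401 "No api key passed in".
```

The same header works from curl or any OpenAI-compatible SDK; only the base URL and key need to match your deployment.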
docker-compose.yaml CHANGED
@@ -5,10 +5,11 @@ services:
     ports:
       - "${PORT:-4000}:${PORT:-4000}"
     volumes:
-       - ./litellm-config-auto.yaml:/app/config.yaml
+       - ./litellm-config-auto.yaml:/home/user/app/config.yaml
     environment:
       - OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
       - LITELLM_MASTER_KEY=${LITELLM_MASTER_KEY:-sk-1234}
+       - GOOGLE_API_KEY=${GOOGLE_API_KEY}
       - HF_TOKEN=${HF_TOKEN}
       - PORT=${PORT:-4000}
     restart: always
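Compose's `${PORT:-4000}` substitution used above falls back to the default only when `PORT` is unset or empty. A rough Python analogy of that lookup (not part of this repo, just to illustrate the `:-` semantics):

```python
import os

def compose_default(name: str, default: str) -> str:
    """Mimic docker compose's ${VAR:-default}: fall back to the
    default when the variable is unset OR set to an empty string."""
    value = os.environ.get(name)
    return value if value else default

os.environ.pop("PORT", None)            # unset -> default applies
print(compose_default("PORT", "4000"))  # -> 4000
os.environ["PORT"] = ""                 # empty -> default still applies
print(compose_default("PORT", "4000"))  # -> 4000
os.environ["PORT"] = "7860"             # set -> the value wins
print(compose_default("PORT", "4000"))  # -> 7860
```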
litellm-config-auto.yaml CHANGED
@@ -5,26 +5,7 @@ litellm_settings:
   drop_params: False
   modify_params: True
   set_verbose: False
-   num_retries: 5
-   # These belong here (global request defaults)
   request_timeout: 30
-   allowed_fails: 1
-   cooldown_time: 60
-   default_completion_params:
-     max_tokens: 4096
-   trim_ratio: 0.75
-   extra_body:
-     transforms: ["middle-out"]
-
- router_settings:
-   # Move these here for them to actually work!
-   routing_strategy: latency-based-routing
-   num_retries: 3 # Increase to 3 for better resilience
-   allowed_fails: 2
-   cooldown_time: 30
-   fallbacks: [{"my-free-coders-new": ["my-free-coders-new"]}]
-   context_window_fallbacks: [{"my-free-coders-new": ["my-free-coders-new"]}]
-
 
 model_list:
   - model_name: my-free-models
@@ -107,18 +88,6 @@ model_list:
     litellm_params:
       model: openrouter/arcee-ai/trinity-large-preview:free
       api_key: "os.environ/OPENROUTER_API_KEY"
-   # - model_name: my-free-coders-new
-   #   litellm_params:
-   #     model: openrouter/google/gemini-2.0-flash-exp:free # Most stable
-   #     api_key: "os.environ/OPENROUTER_API_KEY"
-   # - model_name: my-free-coders-new
-   #   litellm_params:
-   #     model: openrouter/meta-llama/llama-3.1-8b-instruct:free # Fallback 1
-   #     api_key: "os.environ/OPENROUTER_API_KEY"
-   # - model_name: my-free-coders-new
-   #   litellm_params:
-   #     model: openrouter/qwen/qwen-2.5-72b-instruct:free # Fallback 2
-   #     api_key: "os.environ/OPENROUTER_API_KEY"
   - model_name: my-paid-coders
     litellm_params:
       model: openrouter/openai/gpt-oss-20b
@@ -143,7 +112,6 @@ model_list:
     litellm_params:
       model: openrouter/meta-llama/llama-3-8b-instruct
       api_key: "os.environ/OPENROUTER_API_KEY"
-   # Hugging Face (set HF_TOKEN in env); format: huggingface/<provider>/<org>/<model>
   - model_name: my-hf-models
     litellm_params:
      model: huggingface/meta-llama/Llama-3.3-70B-Instruct
@@ -155,3 +123,27 @@ model_list:
 
 router_settings:
   routing_strategy: latency-based-routing
+   num_retries: 3
+   allowed_fails: 2
+   cooldown_time: 30
+   retry_policy:
+     AuthenticationErrorRetries: 3
+     TimeoutErrorRetries: 3
+     RateLimitErrorRetries: 3
+     ContentPolicyViolationErrorRetries: 4
+     InternalServerErrorRetries: 4
+   allowed_fails_policy:
+     BadRequestErrorAllowedFails: 1000
+     AuthenticationErrorAllowedFails: 10
+     TimeoutErrorAllowedFails: 12
+     RateLimitErrorAllowedFails: 10000
+     ContentPolicyViolationErrorAllowedFails: 15
+     InternalServerErrorAllowedFails: 20
+   fallbacks:
+     - my-free-coders-new:
+         - my-free-models
+   context_window_fallbacks:
+     - my-free-coders-new:
+         - my-free-models
+   default_litellm_params:
+     max_tokens: 4096
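The new `fallbacks` entry routes `my-free-coders-new` to `my-free-models` (the removed config pointed the alias at itself, so it could not help). A minimal sketch of how a list-of-mappings fallback config in this shape can be consulted; the helper name is hypothetical, not LiteLLM's API:

```python
# Fallback config in the same list-of-mappings shape as router_settings above.
fallbacks = [{"my-free-coders-new": ["my-free-models"]}]

def resolve_fallbacks(model: str, fallbacks: list) -> list:
    """Return the fallback model groups configured for a model,
    or an empty list when none are configured."""
    for entry in fallbacks:
        if model in entry:
            return entry[model]
    return []

print(resolve_fallbacks("my-free-coders-new", fallbacks))  # -> ['my-free-models']
print(resolve_fallbacks("my-paid-coders", fallbacks))      # -> []
```

A self-referential mapping would always resolve back to the group that just failed, which is why this commit changes the target.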