gaoqilan committed
Commit 42f26e3 · verified · 1 Parent(s): 73a56aa

Upload 3 files

Files changed (3):
  1. Dockerfile +11 -0
  2. README.md +615 -5
  3. README_CN.md +622 -0
Dockerfile ADDED
@@ -0,0 +1,11 @@
+ FROM python:3.10.13 AS builder
+ COPY ./requirements.txt /home
+ RUN pip install -r /home/requirements.txt
+
+ FROM python:3.10.13-slim-bullseye
+ EXPOSE 8000
+ WORKDIR /home
+ COPY --from=builder /usr/local/lib/python3.10/site-packages /usr/local/lib/python3.10/site-packages
+ COPY . /home
+ ENV WATCHFILES_FORCE_POLLING=true
+ ENTRYPOINT ["python", "-m", "uvicorn", "main:app", "--host", "0.0.0.0", "--reload", "--reload-include", "*.yaml"]
README.md CHANGED
@@ -1,11 +1,621 @@
  ---
- title: Uni Api
  emoji: 🌍
- colorFrom: red
- colorTo: pink
  sdk: docker
  pinned: false
- short_description: API forwarder
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

# uni-api

<p align="center">
  <a href="https://t.me/uni_api">
    <img src="https://img.shields.io/badge/Join Telegram Group-blue?&logo=telegram">
  </a>
  <a href="https://hub.docker.com/repository/docker/yym68686/uni-api">
    <img src="https://img.shields.io/docker/pulls/yym68686/uni-api?color=blue" alt="docker pull">
  </a>
</p>

[English](./README.md) | [Chinese](./README_CN.md)

## Introduction

For personal use, one/new-api is overly complex, with many commercial features that individuals don't need. If you don't want a complicated front-end interface and prefer support for more models, try uni-api. It unifies the management of large language model APIs, letting you call multiple backend services through a single unified API interface, converting them all to the OpenAI format and supporting load balancing. Currently supported backend services include: OpenAI, Anthropic, Gemini, Vertex, Azure, xai, Cohere, Groq, Cloudflare, OpenRouter, and more.

## ✨ Features

- No front end: API channels are configured through a pure configuration file. You can run your own API station just by writing one file, and the documentation includes a detailed, beginner-friendly configuration guide.
- Unified management of multiple backend services, supporting OpenAI, Deepseek, OpenRouter, and other providers with OpenAI-format APIs. Supports OpenAI DALL·E 3 image generation.
- Simultaneously supports Anthropic, Gemini, Vertex AI, Azure, xai, Cohere, Groq, and Cloudflare. Vertex supports both the Claude and Gemini APIs.
- Supports native tool-use function calls for OpenAI, Anthropic, Gemini, Vertex, Azure, and xai.
- Supports the native image-recognition APIs of OpenAI, Anthropic, Gemini, Vertex, Azure, and xai.
- Supports four types of load balancing:
  1. Channel-level weighted load balancing, which distributes requests according to channel weights. Not enabled by default; requires configuring channel weights.
  2. Vertex regional load balancing with high concurrency, which can increase Gemini and Claude concurrency by up to (number of APIs × number of regions) times. Enabled automatically, with no additional configuration.
  3. Apart from Vertex region-level load balancing, all APIs support channel-level sequential load balancing, enhancing the immersive-translation experience. Not enabled by default; requires setting `SCHEDULING_ALGORITHM` to `round_robin`.
  4. Automatic API-key-level round-robin load balancing across multiple API keys in a single channel.
- Automatic retry: when an API channel fails to respond, uni-api automatically retries the next channel.
- Channel cooldown: when an API channel fails to respond, the channel is automatically excluded and cooled down for a period of time, during which no requests are sent to it. After the cooldown ends, the channel is automatically restored until it fails again, at which point it cools down again.
- Fine-grained model timeout settings, allowing a different timeout duration for each model.
- Fine-grained permission control, with wildcard support for restricting which models an API key can use per channel.
- Rate limiting: you can set the maximum number of requests per period, such as 2/min (2 per minute), 5/hour, 10/day, 10/month, or 10/year. The default is 60/min.
- Multiple standard OpenAI-format endpoints: `/v1/chat/completions`, `/v1/images/generations`, `/v1/audio/transcriptions`, `/v1/moderations`, `/v1/models`.
- OpenAI moderation support, which reviews user messages and returns an error for inappropriate content, reducing the risk of the backend API being banned by providers.

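The scheduling and rate-limiting options above map onto `api.yaml` keys; a minimal sketch, assuming placeholder channel names (`channel-a`, `channel-b`) and keys that are not part of uni-api itself:

```yaml
api_keys:
  - api: sk-xxx                    # placeholder user key
    model:
      - channel-a/*: 2             # weight 2: used by weighted_round_robin / lottery
      - channel-b/*: 1             # weight 1
    preferences:
      SCHEDULING_ALGORITHM: weighted_round_robin # or round_robin / lottery / random / fixed_priority
      rate_limit: 60/min           # the default limit, shown explicitly
```

With `round_robin` instead of `weighted_round_robin`, the weights are ignored and the two channels are simply requested in turn.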
## Usage

To start uni-api, a configuration file must be used. There are two ways to start with a configuration file:

1. The first method is to fill in the `CONFIG_URL` environment variable with the configuration file's URL, which is downloaded automatically when uni-api starts.
2. The second method is to mount a configuration file named `api.yaml` into the container.

### Method 1: Mount the `api.yaml` configuration file to start uni-api

You must write the configuration file in advance, and it must be named `api.yaml` to start `uni-api`. You can configure multiple models, each model can use multiple backend services, and load balancing is supported. Below is a minimal runnable `api.yaml` example:

```yaml
providers:
  - provider: provider_name # Service provider name, e.g. openai, anthropic, gemini, openrouter; can be any name. Required
    base_url: https://api.your.com/v1/chat/completions # Backend service API address. Required
    api: sk-YgS6GTi0b4bEabc4C # The provider's API key. Required. uni-api automatically uses base_url and api to fetch all available models through the /v1/models endpoint.
  # Multiple providers can be configured here; each provider can have multiple API keys and multiple models.

api_keys:
  - api: sk-Pkj60Yf8JFWxfgRmXQFWyGtWUddGZnmi3KlvowmRWpWpQxx # API key users must supply when requesting uni-api. Required
  # This API key can use all models, i.e. all models in all channels configured under providers, without adding available channels one by one.
```

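One way the minimal file grows into load balancing is to list the same model under a second provider. A sketch with placeholder provider names, URLs, and keys:

```yaml
providers:
  - provider: openai-a            # placeholder channel name
    base_url: https://api.a.com/v1/chat/completions
    api: sk-xxx
    model:
      - gpt-4o
  - provider: openai-b            # a second channel serving the same model
    base_url: https://api.b.com/v1/chat/completions
    api: sk-yyy
    model:
      - gpt-4o

api_keys:
  - api: sk-user-key
    model:
      - gpt-4o                    # requests for gpt-4o are balanced across both channels
```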
Detailed advanced configuration of `api.yaml`:

```yaml
providers:
  - provider: provider_name # Service provider name, e.g. openai, anthropic, gemini, openrouter; can be any name. Required
    base_url: https://api.your.com/v1/chat/completions # Backend service API address. Required
    api: sk-YgS6GTi0b4bEabc4C # The provider's API key. Required
    model: # Optional. If model is not configured, all available models are fetched automatically through base_url and api via the /v1/models endpoint.
      - gpt-4o # Usable model name. Required
      - claude-3-5-sonnet-20240620: claude-3-5-sonnet # Rename a model: claude-3-5-sonnet-20240620 is the provider's model name, claude-3-5-sonnet is the new name; a simple name can replace the original complex one. Optional
      - dall-e-3

  - provider: anthropic
    base_url: https://api.anthropic.com/v1/messages
    api: # Supports multiple API keys; multiple keys automatically enable round-robin load balancing. At least one key. Required
      - sk-ant-api03-bNnAOJyA-xQw_twAA
      - sk-ant-api02-bNnxxxx
    model:
      - claude-3-7-sonnet-20240620: claude-3-7-sonnet # Rename a model: claude-3-7-sonnet-20240620 is the provider's model name, claude-3-7-sonnet is the new name. Optional
      - claude-3-7-sonnet-20250219: claude-3-7-sonnet-think # Rename a model; if "think" appears in the new name, it is automatically converted to a Claude thinking model, with a default think token limit of 4096. Optional
    tools: true # Whether tools are supported, e.g. generating code or documents. Default is true. Optional

  - provider: gemini
    base_url: https://generativelanguage.googleapis.com/v1beta # base_url supports v1beta/v1, for Gemini models only. Required
    api: # Supports multiple API keys; multiple keys automatically enable round-robin load balancing. At least one key. Required
      - AIzaSyAN2k6IRdgw123
      - AIzaSyAN2k6IRdgw456
      - AIzaSyAN2k6IRdgw789
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash-exp-0827: gemini-1.5-flash # After renaming, the original model name gemini-1.5-flash-exp-0827 cannot be used; to keep the original name available, also add it as in the line below.
      - gemini-1.5-flash-exp-0827 # With this line, both gemini-1.5-flash-exp-0827 and gemini-1.5-flash can be requested
      - gemini-1.5-pro: gemini-1.5-pro-search # Renaming with a -search suffix enables search: requesting the gemini-1.5-pro-search model makes gemini-1.5-pro automatically use Google's official search tool. Supported by all 1.5/2.0-series models.
    tools: true
    preferences:
      api_key_rate_limit: 15/min # Each API key can make up to 15 requests per minute. Optional; default is 999999/min. Supports multiple constraints: 15/min,10/day
      # api_key_rate_limit: # You can set a different limit for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # Models without an explicit limit use the default limit
      api_key_cooldown_period: 60 # Each API key is cooled down for 60 seconds after a 429 error. Optional; default is 0 seconds, which disables the cooldown mechanism. Takes effect only when there are multiple API keys.
      api_key_schedule_algorithm: round_robin # Request order for multiple API keys. Optional; default is round_robin. Values: round_robin, random, fixed_priority. Takes effect with multiple keys. round_robin is round-robin load balancing, random is random load balancing, and fixed_priority always uses the first available key.
      model_timeout: # Model timeout in seconds. Default 100 seconds. Optional
        gemini-1.5-pro: 10 # gemini-1.5-pro times out after 10 seconds
        gemini-1.5-flash: 10 # gemini-1.5-flash times out after 10 seconds
        default: 10 # Models not listed in model_timeout use this default; if default is not set, uni-api falls back to the TIMEOUT environment variable (default 100 seconds)
      keepalive_interval: # Heartbeat interval in seconds. Default 80 seconds. Optional. Useful when uni-api is hosted on Cloudflare and uses reasoning models. Takes priority over the global keepalive_interval.
        gemini-2.5-pro: 50 # gemini-2.5-pro heartbeat interval is 50 seconds; this value must be less than the model_timeout value, otherwise it is ignored.
      proxy: socks5://[username]:[password]@[ip]:[port] # Proxy address. Optional. Supports socks5 and http proxies; none by default.
      headers: # Custom HTTP request headers. Optional
        Custom-Header-1: Value-1
        Custom-Header-2: Value-2

  - provider: vertex
    project_id: gen-lang-client-xxxxxxxxxxxxxx # Your Google Cloud project ID: a string usually composed of lowercase letters, digits, and hyphens. Find it in the project selector of the Google Cloud Console.
    private_key: "-----BEGIN PRIVATE KEY-----\nxxxxx\n-----END PRIVATE" # Private key of the Google Cloud Vertex AI service account, from the JSON key file. Create a service account in the Google Cloud Console, generate a JSON key file, and set its private key as this value.
    client_email: xxxxxxxxxx@xxxxxxx.gserviceaccount.com # Email address of the service account, usually "service-account-name@project-id.iam.gserviceaccount.com". Generated when the service account is created, or visible under "IAM and Admin" in the Google Cloud Console.
    model:
      - gemini-1.5-pro
      - gemini-1.5-flash
      - gemini-1.5-pro: gemini-1.5-pro-search # With the Vertex Gemini API, only requesting the gemini-1.5-pro-search model automatically uses Google's official search tool.
      - claude-3-5-sonnet@20240620: claude-3-5-sonnet
      - claude-3-opus@20240229: claude-3-opus
      - claude-3-sonnet@20240229: claude-3-sonnet
      - claude-3-haiku@20240307: claude-3-haiku
    tools: true
    notes: https://xxxxx.com/ # The provider's website, notes, or official documentation. Optional

  - provider: cloudflare
    api: f42b3xxxxxxxxxxq4aoGAh # Cloudflare API key. Required
    cf_account_id: 8ec0xxxxxxxxxxxxe721 # Cloudflare account ID. Required
    model:
      - '@cf/meta/llama-3.1-8b-instruct': llama-3.1-8b # Rename: the provider's original name must be quoted, otherwise it is a YAML syntax error; llama-3.1-8b is the new, simpler name. Optional
      - '@cf/meta/llama-3.1-8b-instruct' # Must be quoted, otherwise it is a YAML syntax error

  - provider: azure
    base_url: https://your-endpoint.openai.azure.com
    api: your-api-key
    model:
      - gpt-4o

  - provider: other-provider
    base_url: https://api.xxx.com/v1/messages
    api: sk-bNnAOJyA-xQw_twAA
    model:
      - causallm-35b-beta2ep-q6k: causallm-35b
      - anthropic/claude-3-5-sonnet
    tools: false
    engine: openrouter # Force a specific message format; currently supports the native gpt, claude, gemini, and openrouter formats. Optional

api_keys:
  - api: sk-KjjI60Yf0JFWxfgRmXqFWyGtWUd9GZnmi3KlvowmRWpWpQRo # API key users must supply to use this service. Required
    model: # Models this API key can use. Required. Channel-level round-robin load balancing is enabled by default, and models are requested in the order configured here, independent of the original channel order under providers, so each API key can have its own request order.
      - gpt-4o # Can use every gpt-4o model offered by any provider
      - claude-3-5-sonnet # Can use every claude-3-5-sonnet model offered by any provider
      - gemini/* # Can use only models from the provider named gemini; * matches all of its models
    role: admin # Alias of the API key, shown in request logs. Optional. If role is admin, only this key can request the /v1/stats and /v1/generate-api-key endpoints. If no key has role admin, the first key becomes admin and gains that permission.

  - api: sk-pkhf60Yf0JGyJxgRmXqFQyTgWUd9GZnmi3KlvowmRWpWqrhy
    model:
      - anthropic/claude-3-5-sonnet # Only the claude-3-5-sonnet model from the provider named anthropic; same-named models from other providers cannot be used. This syntax does not match a model literally named anthropic/claude-3-5-sonnet offered by other-provider.
      - <anthropic/claude-3-5-sonnet> # Angle brackets make the whole string the model name instead of searching for claude-3-5-sonnet under the anthropic channel: this matches a model literally named anthropic/claude-3-5-sonnet from other-provider, but not claude-3-5-sonnet under anthropic.
      - openai-test/text-moderation-latest # With message moderation enabled, the text-moderation-latest model under the openai-test channel is used for moderation.
      - sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo/* # Other API keys can be used as channels
    preferences:
      SCHEDULING_ALGORITHM: fixed_priority # fixed_priority always uses the channel of the first configured model that matches the request. Enabled by default; the default value is fixed_priority. Possible values: fixed_priority, round_robin, weighted_round_robin, lottery, random.
      # random: randomly pick among the channels serving the requested model.
      # round_robin: request the channels serving the requested model in order.
      AUTO_RETRY: true # Whether to automatically retry the next provider on failure: true/false, default true. A number can also be set, indicating the number of retries.
      rate_limit: 15/min # This API key can make up to 15 requests per minute. Optional; default is 999999/min. Supports multiple constraints: 15/min,10/day
      # rate_limit: # You can set a different limit for each model
      #   gemini-1.5-flash: 15/min,1500/day
      #   gemini-1.5-pro: 2/min,50/day
      #   default: 4/min # Models without an explicit limit use the default limit
      ENABLE_MODERATION: true # Enable message moderation: true/false, default false. When enabled, user messages are moderated and an error is returned for inappropriate content.

  # Channel-level weighted load balancing configuration example
  - api: sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo
    model:
      - gcp1/*: 5 # The number after the colon is the weight; only positive integers are supported.
      - gcp2/*: 3 # Larger numbers mean a higher probability of being requested.
      - gcp3/*: 2 # In this example the weights total 10, so out of 10 requests, 5 go to gcp1/*, 3 to gcp2/*, and 2 to gcp3/*.
    preferences:
      SCHEDULING_ALGORITHM: weighted_round_robin # Requests follow the weighted order only when SCHEDULING_ALGORITHM is weighted_round_robin and the channels above have weights. With lottery, channels are chosen randomly in proportion to their weights. Channels without weights automatically fall back to round_robin load balancing.
      AUTO_RETRY: true

preferences: # Global configuration
  model_timeout: # Model timeout in seconds. Default 100 seconds. Optional
    gpt-4o: 10 # gpt-4o times out after 10 seconds; requests for models such as gpt-4o-2024-08-06 also time out after 10 seconds
    claude-3-5-sonnet: 10 # claude-3-5-sonnet times out after 10 seconds; requests for models such as claude-3-5-sonnet-20240620 also time out after 10 seconds
    default: 10 # Models not listed in model_timeout use this default; if default is not set, uni-api falls back to the TIMEOUT environment variable (default 100 seconds)
    o1-mini: 30 # Models starting with o1-mini time out after 30 seconds
    o1-preview: 100 # Models starting with o1-preview time out after 100 seconds
  cooldown_period: 300 # Channel cooldown in seconds. Default 300 seconds. Optional. When a model request fails, the channel is automatically excluded and cooled down, and is not requested again until the cooldown ends; it is then restored until the next failure, when it cools down again. 0 disables the cooldown mechanism.
  rate_limit: 999999/min # Global uni-api rate limit, in requests per period; supports multiple constraints such as 15/min,10/day. Default 999999/min. Optional
  keepalive_interval: # Heartbeat interval in seconds. Default 80 seconds. Optional. Useful when uni-api is hosted on Cloudflare and uses reasoning models.
    gemini-2.5-pro: 50 # gemini-2.5-pro heartbeat interval is 50 seconds; must be less than the model_timeout value, otherwise it is ignored.
  error_triggers: # When a model's response contains any of these strings, the channel returns an error. Optional
    - The bot's usage is covered by the developer
    - process this request due to overload or policy
  proxy: socks5://[username]:[password]@[ip]:[port] # Proxy address. Optional
```

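The model-matching syntaxes used under `api_keys` above can be condensed into one sketch; the key and channel names are placeholders:

```yaml
api_keys:
  - api: sk-xxx
    model:
      - gpt-4o                        # match gpt-4o in every channel
      - gemini/*                      # every model in the channel named gemini
      - anthropic/claude-3-5-sonnet   # only claude-3-5-sonnet in the channel named anthropic
      - <anthropic/claude-3-5-sonnet> # the literal model name anthropic/claude-3-5-sonnet, in any channel
```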
Mount the configuration file and start the uni-api Docker container:

```bash
docker run --user root -p 8001:8000 --name uni-api -dit \
  -v ./api.yaml:/home/api.yaml \
  yym68686/uni-api:latest
```

### Method 2: Start uni-api using the `CONFIG_URL` environment variable

After writing the configuration file as in Method 1, upload it to cloud storage, get a direct link to the file, and then use the `CONFIG_URL` environment variable to start the uni-api Docker container:

```bash
docker run --user root -p 8001:8000 --name uni-api -dit \
  -e CONFIG_URL=http://file_url/api.yaml \
  yym68686/uni-api:latest
```

## Environment variables

- CONFIG_URL: Download address of the configuration file, which can be a local or remote file. Optional
- TIMEOUT: Request timeout, default 100 seconds. The timeout controls how long to wait before switching to the next channel when a channel does not respond. Optional
- DISABLE_DATABASE: Whether to disable the database, default false. Optional

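These variables can be set however the container is launched; a minimal docker-compose sketch with illustrative values:

```yaml
services:
  uni-api:
    image: yym68686/uni-api:latest
    environment:
      - CONFIG_URL=http://file_url/api.yaml # direct link to your api.yaml
      - TIMEOUT=100                         # seconds before trying the next channel
      - DISABLE_DATABASE=false              # set true to skip the statistics database
```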
## Koyeb remote deployment

Click the button below to deploy the prebuilt uni-api Docker image automatically:

[![Deploy to Koyeb](https://www.koyeb.com/static/images/deploy/button.svg)](https://app.koyeb.com/deploy?name=uni-api&type=docker&image=docker.io%2Fyym68686%2Funi-api%3Alatest&instance_type=free&regions=was&instances_min=0&env%5BCONFIG_URL%5D=)

There are two ways to let Koyeb read the configuration file; choose one:

1. Fill in the `CONFIG_URL` environment variable with a direct link to the configuration file.

2. Paste the api.yaml content directly into Koyeb's environment-variable file setting: after pasting the text into the text box, enter `/home/api.yaml` in the path field.

Then click the Deploy button.

## Ubuntu deployment

In the repository's Releases, find the latest binary, for example a file named uni-api-linux-x86_64-0.0.99.pex. Download it on the server and run it:

```bash
wget https://github.com/yym68686/uni-api/releases/download/v0.0.99/uni-api-linux-x86_64-0.0.99.pex
chmod +x uni-api-linux-x86_64-0.0.99.pex
./uni-api-linux-x86_64-0.0.99.pex
```

## Serv00 Remote Deployment (FreeBSD 14.0)

First, log in to the panel. Under Additional services, click the Run your own applications tab to enable running your own programs, then go to Port reservation in the panel and open a random port.

If you don't have your own domain name, go to WWW websites in the panel and delete the default domain provided. Then create a new domain whose Domain is the one you just deleted. After clicking Advanced settings, set the Website type to Proxy domain and point the Proxy port at the port you just opened. Do not select Use HTTPS.

SSH into the serv00 server and run the following commands:

```bash
git clone --depth 1 -b main --quiet https://github.com/yym68686/uni-api.git
cd uni-api
python -m venv uni-api
tmux new -A -s uni-api
source uni-api/bin/activate
export CFLAGS="-I/usr/local/include"
export CXXFLAGS="-I/usr/local/include"
export CC=gcc
export CXX=g++
export MAX_CONCURRENCY=1
export CPUCOUNT=1
export MAKEFLAGS="-j1"
CMAKE_BUILD_PARALLEL_LEVEL=1 cpuset -l 0 pip install -vv -r requirements.txt
```

Press Ctrl+B then D to detach from tmux. Wait a few hours for the installation to complete, then run:

```bash
tmux new -A -s uni-api
source uni-api/bin/activate
export CONFIG_URL=http://file_url/api.yaml
export DISABLE_DATABASE=true
# Change the port: xxx is the port you reserved in the panel's Port reservation
sed -i '' 's/port=8000/port=xxx/' main.py
sed -i '' 's/reload=True/reload=False/' main.py
python main.py
```

Press Ctrl+B then D to detach from tmux and let the program run in the background. You can now use uni-api from other chat clients. curl test script:

```bash
curl -X POST https://xxx.serv00.net/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer sk-xxx' \
  -d '{"model": "gpt-4o","messages": [{"role": "user","content": "Hello"}]}'
```

Reference documents:

https://docs.serv00.com/Python/

https://linux.do/t/topic/201181

https://linux.do/t/topic/218738

## Docker local deployment

Start the container:

```bash
# If a local configuration file is mounted, there is no need to set CONFIG_URL.
# If CONFIG_URL is set, there is no need to mount the configuration file.
# If you do not want to save statistics, there is no need to mount the uniapi_db folder.
docker run --user root -p 8001:8000 --name uni-api -dit \
  -e CONFIG_URL=http://file_url/api.yaml \
  -v ./api.yaml:/home/api.yaml \
  -v ./uniapi_db:/home/data \
  yym68686/uni-api:latest
```

Or, to use Docker Compose, here is a docker-compose.yml example:

```yaml
services:
  uni-api:
    container_name: uni-api
    image: yym68686/uni-api:latest
    environment:
      - CONFIG_URL=http://file_url/api.yaml # If a local configuration file is mounted, there is no need to set CONFIG_URL
    ports:
      - 8001:8000
    volumes:
      - ./api.yaml:/home/api.yaml # If CONFIG_URL is set, there is no need to mount the configuration file
      - ./uniapi_db:/home/data # If you do not want to save statistics, there is no need to mount this folder
```

CONFIG_URL is the URL of a remote configuration file that is downloaded automatically. For example, if it is inconvenient to modify the configuration file on some platform, you can upload it to a hosting service that provides a direct link for uni-api to download; that direct link is CONFIG_URL. If you use a locally mounted configuration file, there is no need to set CONFIG_URL; it is intended for cases where mounting the file is inconvenient.

Run the Docker Compose container in the background:

```bash
docker-compose pull
docker-compose up -d
```

Docker build:

```bash
docker build --no-cache -t uni-api:latest -f Dockerfile --platform linux/amd64 .
docker tag uni-api:latest yym68686/uni-api:latest
docker push yym68686/uni-api:latest
```

One-click restart of the Docker image:

```bash
set -eu
docker pull yym68686/uni-api:latest
docker rm -f uni-api
docker run --user root -p 8001:8000 -dit --name uni-api \
  -e CONFIG_URL=http://file_url/api.yaml \
  -v ./api.yaml:/home/api.yaml \
  -v ./uniapi_db:/home/data \
  yym68686/uni-api:latest
docker logs -f uni-api
```

RESTful curl test:

```bash
curl -X POST http://127.0.0.1:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer ${API}" \
  -d '{"model": "gpt-4o","messages": [{"role": "user", "content": "Hello"}],"stream": true}'
```

pex Linux packaging:

```bash
VERSION=$(cat VERSION)
pex -D . -r requirements.txt \
  -c uvicorn \
  --inject-args 'main:app --host 0.0.0.0 --port 8000' \
  --platform linux_x86_64-cp-3.10.12-cp310 \
  --interpreter-constraint '==3.10.*' \
  --no-strip-pex-env \
  -o uni-api-linux-x86_64-${VERSION}.pex
```

macOS packaging:

```bash
VERSION=$(cat VERSION)
pex -r requirements.txt \
  -c uvicorn \
  --inject-args 'main:app --host 0.0.0.0 --port 8000' \
  -o uni-api-macos-arm64-${VERSION}.pex
```

398
+ ## HuggingFace Space Remote Deployment
399
+
400
+ WARNING: Please be aware of the risk of key leakage in remote deployments. Do not abuse the service to avoid account suspension.
401
+
402
+ The Space repository requires three files: `Dockerfile`, `README.md`, and `entrypoint.sh`.
403
+ To run the program, you also need api.yaml (this guide stores it entirely in a Secret, but you could also fetch it via HTTP download). Access control, model, and channel configuration all live in this configuration file.
404
+
405
+ Operation Steps:
406
+
407
+ 1. Visit https://huggingface.co/new-space to create a new space. It should be a public repository; the open source license/name/description can be filled as desired.
408
+
409
+ 2. Visit your space's files page at https://huggingface.co/spaces/your-name/your-space-name/tree/main and upload the three files (`Dockerfile`, `README.md`, `entrypoint.sh`).
410
+
411
+ 3. Visit your space's settings page at https://huggingface.co/spaces/your-name/your-space-name/settings, find the Secrets section and create a new secret called `API_YAML_CONTENT` (note the uppercase). Write your api.yaml locally, then copy it directly into the secret field using UTF-8 encoding.
412
+
413
+ 4. Still in settings, find Factory rebuild and let it rebuild. If you modify secrets or files, or manually restart the Space, it may get stuck with no logs. Use this method to resolve such issues.
414
+
415
+ 5. In the upper right corner of the settings page, find the three-dot button and select "Embed this Space" to get the public link for your Space. The format is https://(your-name)-(your-space-name).hf.space (remove the parentheses).
416
+
417
+ Related File Codes:
418
+
419
+ ```Dockerfile
420
+ # Dockerfile, delete this line before committing
421
+ # Use the uni-api official image
422
+ FROM yym68686/uni-api:latest
423
+ # Create data directory and set permissions
424
+ RUN mkdir -p /data && chown -R 1000:1000 /data
425
+ # Set up user and working directory
426
+ RUN useradd -m -u 1000 user
427
+ USER user
428
+ ENV HOME=/home/user \
429
+ PATH=/home/user/.local/bin:$PATH \
430
+ DISABLE_DATABASE=true
431
+ # Copy entrypoint script
432
+ COPY --chown=user entrypoint.sh /home/user/entrypoint.sh
433
+ RUN chmod +x /home/user/entrypoint.sh
434
+ # Ensure /home directory is writable (this is important!)
435
+ USER root
436
+ RUN chmod 777 /home
437
+ USER user
438
+ # Set working directory
439
+ WORKDIR /home/user
440
+ # Entry point
441
+ ENTRYPOINT ["/home/user/entrypoint.sh"]
442
+ ```
443
+
444
+ ```markdown
445
  ---
446
+ title: Uni API
447
  emoji: 🌍
448
+ colorFrom: gray
449
+ colorTo: yellow
450
  sdk: docker
451
+ app_port: 8000
452
  pinned: false
453
+ license: gpl-3.0
454
  ---
455
+ ```
456
+
457
+ ```shell
458
+ # entrypoint.sh, delete this line before committing
459
+ #!/bin/sh
460
+ set -e
461
+ CONFIG_FILE_PATH="/home/api.yaml" # Note this is changed to /home/api.yaml
462
+ echo "DEBUG: Entrypoint script started."
463
+ # Check if Secret exists
464
+ if [ -z "$API_YAML_CONTENT" ]; then
465
+ echo "ERROR: Secret 'API_YAML_CONTENT' does not exist or is empty. Exiting."
466
+ exit 1
467
+ else
468
+ echo "DEBUG: API_YAML_CONTENT secret found. Preparing to write..."
469
+ printf '%s\n' "$API_YAML_CONTENT" > "$CONFIG_FILE_PATH"
470
+ echo "DEBUG: Attempted to write to $CONFIG_FILE_PATH."
471
+
472
+ if [ -f "$CONFIG_FILE_PATH" ]; then
473
+ echo "DEBUG: File $CONFIG_FILE_PATH created successfully. Size: $(wc -c < "$CONFIG_FILE_PATH") bytes."
474
+ # Display the first few lines for debugging (be careful not to display sensitive information)
475
+ echo "DEBUG: First few lines (without sensitive info):"
476
+ head -n 3 "$CONFIG_FILE_PATH" | grep -v "api:" | grep -v "password"
477
+ else
478
+ echo "ERROR: File $CONFIG_FILE_PATH was NOT created."
479
+ exit 1
480
+ fi
481
+ fi
482
+ echo "DEBUG: About to execute python main.py..."
483
+ # No need to use the --config parameter as the program has a default path
484
+ cd /home
485
+ exec python main.py "$@"
486
+ ```
487
+
488
+ ## uni-api frontend deployment
489
+
490
+ The frontend of uni-api can be deployed by yourself, address: https://github.com/yym68686/uni-api-web
491
+
492
+ You can also use the frontend I deployed, address: https://uni-api-web.pages.dev/
493
+
494
+ ## Sponsors
495
+
496
+ We thank the following sponsors for their support:
497
+ <!-- ¥2050 -->
498
+ - @PowerHunter: ¥2000
499
+ - @IM4O4: ¥100
500
+ - @ioi: ¥50
501
+
502
+ ## How to sponsor us
503
+
504
+ If you would like to support our project, you can sponsor us in the following ways:
505
+
506
+ 1. [PayPal](https://www.paypal.me/yym68686)
507
+
508
+ 2. [USDT-TRC20](https://pb.yym68686.top/~USDT-TRC20), USDT-TRC20 wallet address: `TLFbqSv5pDu5he43mVmK1dNx7yBMFeN7d8`
509
+
510
+ 3. [WeChat](https://pb.yym68686.top/~wechat)
511
+
512
+ 4. [Alipay](https://pb.yym68686.top/~alipay)
513
+
514
+ Thank you for your support!
515
+
516
+ ## FAQ
517
+
518
+ - Why does the error `Error processing request or performing moral check: 404: No matching model found` always appear?
519
+
520
+ Setting ENABLE_MODERATION to false will fix this issue. When ENABLE_MODERATION is true, the API must be able to use the text-moderation-latest model, and if you have not provided text-moderation-latest in the provider model settings, an error will occur indicating that the model cannot be found.
521
+
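Based on the advanced configuration shown earlier, moderation can be switched off per API key; the key value below is a placeholder:

```yaml
api_keys:
  - api: sk-xxx
    model:
      - gpt-4o
    preferences:
      ENABLE_MODERATION: false   # disable moral checks for this key
```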
522
+ - How to prioritize requests for a specific channel, how to set the priority of a channel?
523
+
524
+ Simply list the channels in the desired order under api_keys; no other settings are required. Sample configuration file:
525
+
526
+ ```yaml
527
+ providers:
528
+ - provider: ai1
529
+ base_url: https://xxx/v1/chat/completions
530
+ api: sk-xxx
531
+
532
+ - provider: ai2
533
+ base_url: https://xxx/v1/chat/completions
534
+ api: sk-xxx
535
+
536
+ api_keys:
537
+ - api: sk-1234
538
+ model:
539
+ - ai2/*
540
+ - ai1/*
541
+ ```
542
+
543
+ In this way, request ai2 first, and if it fails, request ai1.
544
+
545
+ - What is the behavior behind various scheduling algorithms? For example, fixed_priority, weighted_round_robin, lottery, random, round_robin?
546
+
547
+ All scheduling algorithms need to be enabled by setting api_keys.(api).preferences.SCHEDULING_ALGORITHM in the configuration file to any of the values: fixed_priority, weighted_round_robin, lottery, random, round_robin.
548
+
549
+ 1. fixed_priority: Fixed-priority scheduling. Requests always go to the first channel (in the configured order) that provides the requested model; on error, uni-api switches to the next channel. This is the default scheduling algorithm.
550
+
551
+ 2. weighted_round_robin: Weighted round-robin load balancing. Requests are distributed across the channels that provide the requested model in proportion to the weights set in api_keys.(api).model.
552
+
553
+ 3. lottery: Lottery load balancing. A channel that provides the requested model is picked at random, with probability proportional to the weights set in api_keys.(api).model.
554
+
555
+ 4. random: Random load balancing. A channel that provides the requested model is picked at random, with equal probability.
+
+ 5. round_robin: Round-robin load balancing. Channels that provide the requested model are requested in the order configured in api_keys.(api).model. See the previous question on how to set channel priority.
556
+
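For instance, round-robin balancing across two channels can be enabled like this (provider names are placeholders):

```yaml
api_keys:
  - api: sk-1234
    model:
      - ai1/*
      - ai2/*
    preferences:
      SCHEDULING_ALGORITHM: round_robin   # alternate between ai1 and ai2
```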
557
+ - How should the base_url be filled in correctly?
558
+
559
+ Except for some special channels shown in the advanced configuration, all OpenAI format providers need to fill in the base_url completely, which means the base_url must end with /v1/chat/completions. If you are using GitHub models, the base_url should be filled in as https://models.inference.ai.azure.com/chat/completions, not Azure's URL.
560
+
561
+ For Azure channels, the base_url accepts the following formats: https://your-endpoint.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview, https://your-endpoint.services.ai.azure.com/models/chat/completions, and https://your-endpoint.openai.azure.com. The first format is recommended. If api-version is not explicitly specified, 2024-10-21 is used by default.
562
+
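As a concrete sketch (endpoint and keys are placeholders), a GitHub Models channel and an Azure channel could look like:

```yaml
providers:
  - provider: github
    base_url: https://models.inference.ai.azure.com/chat/completions
    api: ghp-xxx
    model:
      - gpt-4o

  - provider: azure
    base_url: https://your-endpoint.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview
    api: your-api-key
    model:
      - gpt-4o
```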
563
+ - How does the model timeout time work? What is the priority of the channel-level timeout setting and the global model timeout setting?
564
+
565
+ The channel-level timeout setting has higher priority than the global model timeout setting. The priority order is: channel-level model timeout setting > channel-level default timeout setting > global model timeout setting > global default timeout setting > environment variable TIMEOUT.
566
+
567
+ By adjusting the model timeout time, you can avoid the error of some channels timing out. If you encounter the error `{'error': '500', 'details': 'fetch_response_stream Read Response Timeout'}`, please try to increase the model timeout time.
568
+
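For example, a channel-level model_timeout takes precedence over the global one (all values illustrative):

```yaml
providers:
  - provider: gemini
    base_url: https://generativelanguage.googleapis.com/v1beta
    api: AIzaSy-xxx
    model:
      - gemini-1.5-pro
    preferences:
      model_timeout:
        gemini-1.5-pro: 10   # channel-level setting: wins for this channel

preferences:
  model_timeout:
    gemini-1.5-pro: 30       # global setting: used by channels without their own
    default: 100             # fallback when a model has no entry
```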
569
+ - How does api_key_rate_limit work? How do I set the same rate limit for multiple models?
570
+
571
+ If you want to set the same frequency limit for the four models gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, gemini-1.5-pro-002 simultaneously, you can set it like this:
572
+
573
+ ```yaml
574
+ api_key_rate_limit:
575
+ gemini-1.5-pro: 1000/min
576
+ ```
577
+
578
+ This will match all models containing the gemini-1.5-pro string. The frequency limit for these four models, gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, gemini-1.5-pro-002, will all be set to 1000/min. The logic for configuring the api_key_rate_limit field is as follows, here is a sample configuration file:
579
+
580
+ ```yaml
581
+ api_key_rate_limit:
582
+ gemini-1.5-pro: 1000/min
583
+ gemini-1.5-pro-002: 500/min
584
+ ```
585
+
586
+ For example, suppose a request uses the model gemini-1.5-pro-002:
587
+
588
+ uni-api first attempts an exact match in api_key_rate_limit. Since gemini-1.5-pro-002 has its own entry, its rate limit is 500/min. If the requested model is instead gemini-1.5-pro-latest, which has no entry of its own, uni-api looks for a configured model name that is a prefix of the requested name, so gemini-1.5-pro-latest inherits the 1000/min limit of gemini-1.5-pro.
589
+
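The lookup order described above can be sketched in Python (the function name and data shapes are illustrative, not uni-api's internal API):

```python
def resolve_rate_limit(model, limits):
    """Resolve a model's rate limit: exact match, then prefix match, then default."""
    # 1. Exact match on the model name
    if model in limits:
        return limits[model]
    # 2. Prefix match: a configured name that the requested model starts with
    for name, limit in limits.items():
        if name != "default" and model.startswith(name):
            return limit
    # 3. Fall back to the default entry, if any
    return limits.get("default")

limits = {"gemini-1.5-pro": "1000/min", "gemini-1.5-pro-002": "500/min"}
print(resolve_rate_limit("gemini-1.5-pro-002", limits))     # exact match: 500/min
print(resolve_rate_limit("gemini-1.5-pro-latest", limits))  # prefix match: 1000/min
```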
590
+ - I want channels 1 and 2 to use random round-robin, and uni-api to request channel 3 only after channels 1 and 2 fail. How do I set this up?
591
+
592
+ uni-api supports using an API key as a channel, and this feature can be used to group channels and manage them together.
593
+
594
+ ```yaml
595
+ api_keys:
596
+ - api: sk-xxx1
597
+ model:
598
+ - sk-xxx2/* # channel 1 2 use random round-robin, request channel 3 after failure
599
+ - aws/* # channel 3
600
+ preferences:
601
+ SCHEDULING_ALGORITHM: fixed_priority # always request api key: sk-xxx2 first, then request channel 3 after failure
602
+
603
+ - api: sk-xxx2
604
+ model:
605
+ - anthropic/claude-3-7-sonnet # channel 1
606
+ - openrouter/claude-3-7-sonnet # channel 2
607
+ preferences:
608
+ SCHEDULING_ALGORITHM: random # channel 1 2 use random round-robin
609
+ ```
610
+
611
+ - I want to use Cloudflare AI Gateway, how should I fill in the base_url?
612
+
613
+ For gemini channels, the base_url for Cloudflare AI Gateway should be filled in as https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_name}/google-ai-studio/v1beta/openai/chat/completions , where {account_id} and {gateway_name} need to be replaced with your Cloudflare account ID and Gateway name.
614
+
615
+ For Vertex channels, the base_url for Cloudflare AI Gateway should be filled in as https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_name}/google-vertex-ai , where {account_id} and {gateway_name} need to be replaced with your Cloudflare account ID and Gateway name.
616
+
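Put together, a Gemini channel routed through Cloudflare AI Gateway might look like this (account ID, gateway name, and key are placeholders):

```yaml
providers:
  - provider: gemini
    base_url: https://gateway.ai.cloudflare.com/v1/your-account-id/your-gateway/google-ai-studio/v1beta/openai/chat/completions
    api: AIzaSy-xxx
    model:
      - gemini-1.5-pro
```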
617
+ ## ⭐ Star History
618
 
619
+ <a href="https://github.com/yym68686/uni-api/stargazers">
620
+ <img width="500" alt="Star History Chart" src="https://api.star-history.com/svg?repos=yym68686/uni-api&type=Date">
621
+ </a>
README_CN.md ADDED
@@ -0,0 +1,622 @@
1
+ # uni-api
2
+
3
+ <p align="center">
4
+ <a href="https://t.me/uni_api">
5
+ <img src="https://img.shields.io/badge/Join Telegram Group-blue?&logo=telegram">
6
+ </a>
7
+ <a href="https://hub.docker.com/repository/docker/yym68686/uni-api">
8
+ <img src="https://img.shields.io/docker/pulls/yym68686/uni-api?color=blue" alt="docker pull">
9
+ </a>
10
+ </p>
11
+
12
+ [英文](./README.md) | [中文](./README_CN.md)
13
+
14
+ ## 介绍
15
+
16
+ 如果个人使用的话,one/new-api 过于复杂,有很多个人不需要使用的商用功能,如果你不想要复杂的前端界面,又想要支持的模型多一点,可以试试 uni-api。这是一个统一管理大模型 API 的项目,可以通过一个统一的API 接口调用多种不同提供商的服务,统一转换为 OpenAI 格式,支持负载均衡。目前支持的后端服务有:OpenAI、Anthropic、Gemini、Vertex、Azure、xai、Cohere、Groq、Cloudflare、OpenRouter 等。
17
+
18
+ ## ✨ 特性
19
+
20
+ - 无前端,纯配置文件配置 API 渠道。只要写一个文件就能运行起一个属于自己的 API 站,文档有详细的配置指南,小白友好。
21
+ - 统一管理多个后端服务,支持 OpenAI、Deepseek、OpenRouter 等其他 API 是 OpenAI 格式的提供商。支持 OpenAI Dalle-3 图像生成。
22
+ - 同时支持 Anthropic、Gemini、Vertex AI、Azure、xai、Cohere、Groq、Cloudflare。Vertex 同时支持 Claude 和 Gemini API。
23
+ - 支持 OpenAI、 Anthropic、Gemini、Vertex、Azure、xai 原生 tool use 函数调用。
24
+ - 支持 OpenAI、Anthropic、Gemini、Vertex、Azure、xai 原生识图 API。
25
+ - 支持四种负载均衡。
26
+ 1. 支持渠道级加权负载均衡,可以根据不同的渠道权重分配请求。默认不开启,需要配置渠道权重。
27
+ 2. 支持 Vertex 区域级负载均衡,支持 Vertex 高并发,最高可将 Gemini,Claude 并发提高 (API数量 * 区域数量) 倍。自动开启不需要额外配置。
28
+ 3. 除了 Vertex 区域级负载均衡,所有 API 均支持渠道级顺序负载均衡,提高沉浸式翻译体验。默认不开启,需要配置 `SCHEDULING_ALGORITHM` 为 `round_robin`。
29
+ 4. 支持单个渠道多个 API Key 自动开启 API key 级别的轮训负载均衡。
30
+ - 支持自动重试,当一个 API 渠道响应失败时,自动重试下一个 API 渠道。
31
+ - 支持渠道冷却,当一个 API 渠道响应失败时,会自动将该渠道排除冷却一段时间,不再请求该渠道,冷却时间结束后,会自动将该模型恢复,直到再次请求失败,会重新冷却。
32
+ - 支持细粒度的模型超时时间设置,可以为每个模型设置不同的超时时间。
33
+ - 支持细粒度的权限控制。支持使用通配符设置 API key 可用渠道的特定模型。
34
+ - 支持限流,可以设置每分钟最多请求次数,可以设置为整数,如 2/min,2 次每分钟、5/hour,5 次每小时、10/day,10 次每天,10/month,10 次每月,10/year,10 次每年。默认60/min。
35
+ - 支持多个标准 OpenAI 格式的接口:`/v1/chat/completions`,`/v1/images/generations`,`/v1/audio/transcriptions`,`/v1/moderations`,`/v1/models`。
36
+ - 支持 OpenAI moderation 道德审查,可以对用户的消息进行道德审查,如果发现不当的消息,会返回错误信息。降低后台 API 被提供商封禁的风险。
37
+
38
+ ## 使用方法
39
+
40
+ 启动 uni-api 必须使用配置文件,有两种方式可以启动配置文件:
41
+
42
+ 1. 第一种是使用 `CONFIG_URL` 环境变量填写配置文件 URL,uni-api启动时会自动下载。
43
+ 2. 第二种就是挂载名为 `api.yaml` 的配置文件到容器内。
44
+
45
+ ### 方法一:挂载 `api.yaml` 配置文件启动 uni-api
46
+
47
+ 必须事先填写完成配置文件才能启动 `uni-api`,必须使用名为 `api.yaml` 的配置文件才能启动 `uni-api`,可以配置多个模型,每个模型可以配置多个后端服务,支持负载均衡。下面是最小可运行的 `api.yaml` 配置文件的示例:
48
+
49
+ ```yaml
50
+ providers:
51
+ - provider: provider_name # 服务提供商名称, 如 openai、anthropic、gemini、openrouter,随便取名字,必填
52
+ base_url: https://api.your.com/v1/chat/completions # 后端服务的API地址,必填
53
+ api: sk-YgS6GTi0b4bEabc4C # 提供商的API Key,必填,自动使用 base_url 和 api 通过 /v1/models 端点获取可用的所有模型。
54
+ # 这里可以配置多个提供商,每个提供商可以配置多个 API Key,每个提供商可以配置多个模型。
55
+ api_keys:
56
+ - api: sk-Pkj60Yf8JFWxfgRmXQFWyGtWUddGZnmi3KlvowmRWpWpQxx # API Key,用户请求 uni-api 需要 API key,必填
57
+ # 该 API Key 可以使用所有模型,即可以使用 providers 下面设置的所有渠道里面的所有模型,不需要一个个添加可用渠道。
58
+ ```
59
+
60
+ `api.yaml` 详细的高级配置:
61
+
62
+ ```yaml
63
+ providers:
64
+ - provider: provider_name # 服务提供商名称, 如 openai、anthropic、gemini、openrouter,随便取名字,必填
65
+ base_url: https://api.your.com/v1/chat/completions # 后端服务的API地址,必填
66
+ api: sk-YgS6GTi0b4bEabc4C # 提供商的API Key,必填
67
+ model: # 选填,如果不配置 model,会自动通过 base_url 和 api 通过 /v1/models 端点获取可用的所有模型。
68
+ - gpt-4o # 可以使用的模型名称,必填
69
+ - claude-3-5-sonnet-20240620: claude-3-5-sonnet # 重命名模型,claude-3-5-sonnet-20240620 是服务商的模型名称,claude-3-5-sonnet 是重命名后的名字,可以使用简洁的名字代替原来复杂的名称,选填
70
+ - dall-e-3
71
+
72
+ - provider: anthropic
73
+ base_url: https://api.anthropic.com/v1/messages
74
+ api: # 支持多个 API Key,多个 key 自动开启轮训负载均衡,至少一个 key,必填
75
+ - sk-ant-api03-bNnAOJyA-xQw_twAA
76
+ - sk-ant-api02-bNnxxxx
77
+ model:
78
+ - claude-3-7-sonnet-20240620: claude-3-7-sonnet # 重命名模型,claude-3-7-sonnet-20240620 是服务商的模型名称,claude-3-7-sonnet 是重命名后的名字,可以使用简洁的名字代替原来复杂的名称,选填
79
+ - claude-3-7-sonnet-20250219: claude-3-7-sonnet-think # 重命名模型,claude-3-7-sonnet-20250219 是服务商的模型名称,claude-3-7-sonnet-think 是重命名后的名字,可以使用简洁的名字代替原来复杂的名称,如果重命名后的名字里面有think,则自动转换为 claude 思考模型,默认思考 token 限制为 4096。选填
80
+ tools: true # 是否支持工具,如生成代码、生成文档等,默认是 true,选填
81
+
82
+ - provider: gemini
83
+ base_url: https://generativelanguage.googleapis.com/v1beta # base_url 支持 v1beta/v1, 仅供 Gemini 模型使用,必填
84
+ api: # 支持多个 API Key,多个 key 自动开启轮训负载均衡,至少一个 key,必填
85
+ - AIzaSyAN2k6IRdgw123
86
+ - AIzaSyAN2k6IRdgw456
87
+ - AIzaSyAN2k6IRdgw789
88
+ model:
89
+ - gemini-1.5-pro
90
+ - gemini-1.5-flash-exp-0827: gemini-1.5-flash # 重命名后,原来的模型名字 gemini-1.5-flash-exp-0827 无法使用,如果要使用原来的名字,可以在 model 中添加原来的名字,只要加上下面一行就可以使用原来的名字了
91
+ - gemini-1.5-flash-exp-0827 # 加上这一行,gemini-1.5-flash-exp-0827 和 gemini-1.5-flash 都可以被请求
92
+ - gemini-1.5-pro: gemini-1.5-pro-search # 支持以 -search 后缀重命名模型启用搜索,使用 gemini-1.5-pro-search 模型请求 uni-api 时,表示 gemini-1.5-pro 模型自动使用 Google 官方搜索工具,支持全部 1.5/2.0 系列模型。
93
+ tools: true
94
+ preferences:
95
+ api_key_rate_limit: 15/min # 每个 API Key 每分钟最多请求次数,选填。默认为 999999/min。支持多个频率约束条件:15/min,10/day
96
+ # api_key_rate_limit: # 可以为每个模型设置不同的频率限制
97
+ # gemini-1.5-flash: 15/min,1500/day
98
+ # gemini-1.5-pro: 2/min,50/day
99
+ # default: 4/min # 如果模型没有设置频率限制,使用 default 的频率限制
100
+ api_key_cooldown_period: 60 # 每个 API Key 遭遇 429 错误后的冷却时间,单位为秒,选填。默认为 0 秒, 当设置为 0 秒时,不启用冷却机制。当存在多个 API key 时才会生效。
101
+ api_key_schedule_algorithm: round_robin # 设置多个 API Key 的请求顺序,选填。默认为 round_robin,可选值有:round_robin,random,fixed_priority。当存在多个 API key 时才会生效。round_robin 是轮询负载均衡,random 是随机负载均衡,fixed_priority 是固定优先级调度,永远使用第一个可用的 API key。
102
+ model_timeout: # 模型超时时间,单位为秒,默认 100 秒,选填
103
+ gemini-1.5-pro: 10 # 模型 gemini-1.5-pro 的超时时间为 10 秒
104
+ gemini-1.5-flash: 10 # 模型 gemini-1.5-flash 的超时时间为 10 秒
105
+ default: 10 # 模型没有设置超时时间,使用默认的超时时间 10 秒,当请求的不在 model_timeout 里面的模型时,超时时间默认是 10 秒,不设置 default,uni-api 会使用全局配置的模型超时时间。
106
+ keepalive_interval: # 心跳间隔,单位为秒,默认 80 秒,选填。适合当 uni-api 域名托管在 cloudflare 并使用推理模型时使用。优先级高于全局配置的 keepalive_interval。
107
+ gemini-2.5-pro: 50 # 模型 gemini-2.5-pro 的心跳间隔为 50 秒,此数值必须小于 model_timeout 设置的超时时间,否则忽略此设置。
108
+ proxy: socks5://[用户名]:[密码]@[IP地址]:[端口] # 代理地址,选填。支持 socks5 和 http 代理,默认不使用代理。
109
+ headers: # 额外附加自定义HTTP请求头,选填。
110
+ Custom-Header-1: Value-1
111
+ Custom-Header-2: Value-2
112
+
113
+ - provider: vertex
114
+ project_id: gen-lang-client-xxxxxxxxxxxxxx # 描述: 您的Google Cloud项目ID。格式: 字符串,通常由小写字母、数字和连字符组成。获取方式: 在Google Cloud Console的项目选择器中可以找到您的项目ID。
115
+ private_key: "-----BEGIN PRIVATE KEY-----\nxxxxx\n-----END PRIVATE" # 描述: Google Cloud Vertex AI服务账号的私钥。格式: 一个 JSON 格式的字符串,包含服务账号的私钥信息。获取方式: 在 Google Cloud Console 中创建服务账号,生成JSON格式的密钥文件,然后将其内容设置为此环境变量的值。
116
+ client_email: xxxxxxxxxx@xxxxxxx.gserviceaccount.com # 描述: Google Cloud Vertex AI 服务账号的电子邮件地址。格式: 通常是形如 "service-account-name@project-id.iam.gserviceaccount.com" 的字符串。获取方式: 在创建服务账号时生成,也可以在 Google Cloud Console 的"IAM与管理"部分查看服务账号详情获得。
117
+ model:
118
+ - gemini-1.5-pro
119
+ - gemini-1.5-flash
120
+ - gemini-1.5-pro: gemini-1.5-pro-search # 仅支持在 vertex Gemini API 中,以 -search 后缀重命名模型后,使用 gemini-1.5-pro-search 模型请求 uni-api 时,表示 gemini-1.5-pro 模型自动使用 Google 官方搜索工具。
121
+ - claude-3-5-sonnet@20240620: claude-3-5-sonnet
122
+ - claude-3-opus@20240229: claude-3-opus
123
+ - claude-3-sonnet@20240229: claude-3-sonnet
124
+ - claude-3-haiku@20240307: claude-3-haiku
125
+ tools: true
126
+ notes: https://xxxxx.com/ # 可以放服务商的网址,备注信息,官方文档,选填
127
+
128
+ - provider: cloudflare
129
+ api: f42b3xxxxxxxxxxq4aoGAh # Cloudflare API Key,必填
130
+ cf_account_id: 8ec0xxxxxxxxxxxxe721 # Cloudflare Account ID,必填
131
+ model:
132
+ - '@cf/meta/llama-3.1-8b-instruct': llama-3.1-8b # 重命名模型,@cf/meta/llama-3.1-8b-instruct 是服务商的原始的模型名称,必须使用引号包裹模型名,否则yaml语法错误,llama-3.1-8b 是重命名后的名字,可以使用简洁的名字代替原来复杂的名称,选填
133
+ - '@cf/meta/llama-3.1-8b-instruct' # 必须使用引号包裹模型名,否则yaml语法错误
134
+
135
+ - provider: azure
136
+ base_url: https://your-endpoint.openai.azure.com
137
+ api: your-api-key
138
+ model:
139
+ - gpt-4o
140
+
141
+ - provider: other-provider
142
+ base_url: https://api.xxx.com/v1/messages
143
+ api: sk-bNnAOJyA-xQw_twAA
144
+ model:
145
+ - causallm-35b-beta2ep-q6k: causallm-35b
146
+ - anthropic/claude-3-5-sonnet
147
+ tools: false
148
+ engine: openrouter # 强制使用某个消息格式,目前支持 gpt,claude,gemini,openrouter 原生格式,选填
149
+
150
+ api_keys:
151
+ - api: sk-KjjI60Yf0JFWxfgRmXqFWyGtWUd9GZnmi3KlvowmRWpWpQRo # API Key,用户使用本服务需要 API key,必填
152
+ model: # 该 API Key 可以使用的模型,必填。默认开启渠道级轮询负载均衡,每次请求模型按照 model 配置的顺序依次请求。与 providers 里面原始的渠道顺序无关。因此你可以设置每个 API key 请求顺序不一样。
153
+ - gpt-4o # 可以使用的模型名称,可以使用所有提供商提供的 gpt-4o 模型
154
+ - claude-3-5-sonnet # 可以使用的模型名称,可以使用所有提供商提供的 claude-3-5-sonnet 模型
155
+ - gemini/* # 可以使用的模型名称,仅可以使用名为 gemini 提供商提供的所有模型,其中 gemini 是 provider 名称,* 代表所有模型
156
+ role: admin # 设置 API key 的别名,选填。请求日志会显示该 API key 的别名。如果 role 为 admin,则仅有此 API key 可以请求 v1/stats,/v1/generate-api-key 端点。如果所有 API key 都没有设置 role 为 admin,则默认第一个 API key 为 admin 拥有请求 v1/stats,/v1/generate-api-key 端点的权限。
157
+
158
+ - api: sk-pkhf60Yf0JGyJxgRmXqFQyTgWUd9GZnmi3KlvowmRWpWqrhy
159
+ model:
160
+ - anthropic/claude-3-5-sonnet # 可以使用的模型名称,仅可以使用名为 anthropic 提供商提供的 claude-3-5-sonnet 模型。其他提供商的 claude-3-5-sonnet 模型不可以使用。这种写法不会匹配到other-provider提供的名为anthropic/claude-3-5-sonnet的模型。
161
+ - <anthropic/claude-3-5-sonnet> # 通过在模型名两侧加上尖括号,这样就不会去名为anthropic的渠道下去寻找claude-3-5-sonnet模型,而是将整个 anthropic/claude-3-5-sonnet 作为模型名称。这种写法可以匹配到other-provider提供的名为 anthropic/claude-3-5-sonnet 的模型。但不会匹配到anthropic下面的claude-3-5-sonnet模型。
162
+ - openai-test/text-moderation-latest # 当开启消息道德审查后,可以使用名为 openai-test 渠道下的 text-moderation-latest 模型进行道德审查。
163
+ - sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo/* # 支持将其他 api key 当作渠道
164
+ preferences:
165
+ SCHEDULING_ALGORITHM: fixed_priority # 当 SCHEDULING_ALGORITHM 为 fixed_priority 时,使用固定优先级调度,永远执行第一个拥有请求的模型的渠道。默认开启,SCHEDULING_ALGORITHM 缺省值为 fixed_priority。SCHEDULING_ALGORITHM 可选值有:fixed_priority,round_robin,weighted_round_robin, lottery, random。
166
+ # 当 SCHEDULING_ALGORITHM 为 random 时,使用随机轮训负载均衡,随机请求拥有请求的模型的渠道。
167
+ # 当 SCHEDULING_ALGORITHM 为 round_robin 时,使用轮询负载均衡,按照顺序请求用户使用的模型的渠道。
168
+ AUTO_RETRY: true # 是否自动重试,自动重试下一个提供商,true 为自动重试,false 为不自动重试,默认为 true。也可以设置为数字,表示重试次数。
169
+ rate_limit: 15/min # 支持限流,每分钟最多请求次数,可以设置为整数,如 2/min,2 次每分钟、5/hour,5 次每小时、10/day,10 次每天,10/month,10 次每月,10/year,10 次每年。默认999999/min,选填。支持多个频率约束条件:15/min,10/day
170
+ # rate_limit: # 可以为每个模型设置不同的频率限制
171
+ # gemini-1.5-flash: 15/min,1500/day
172
+ # gemini-1.5-pro: 2/min,50/day
173
+ # default: 4/min # 如果模型没有设置频率限制,使用 default 的频率限制
174
+ ENABLE_MODERATION: true # 是否开启消息道德审查,true 为开启,false 为不开启,默认为 false,当开启后,会对用户的消息进行道德审查,如果发现不当的消息,会返回错误信息。
175
+
176
+ # 渠道级加权负载均衡配置示例
177
+ - api: sk-KjjI60Yd0JFWtxxxxxxxxxxxxxxwmRWpWpQRo
178
+ model:
179
+ - gcp1/*: 5 # 冒号后面就是权重,权重仅支持正整数。
180
+ - gcp2/*: 3 # 数字的大小代表权重,数字越大,请求的概率越大。
181
+ - gcp3/*: 2 # 在该示例中,所有渠道加起来一共有 10 个权重,及 10 个请求里面有 5 个请求会请求 gcp1/* 模型,2 个请求会请求 gcp2/* 模型,3 个请求会请求 gcp3/* 模型。
182
+
183
+ preferences:
184
+ SCHEDULING_ALGORITHM: weighted_round_robin # 仅当 SCHEDULING_ALGORITHM 为 weighted_round_robin 并且上面的渠道如果有权重,会按照加权后的顺序请求。使用加权轮训负载均衡,按照权重顺序请求拥有请求的模型的渠道。当 SCHEDULING_ALGORITHM 为 lottery 时,使用抽奖轮训负载均衡,按照权重随机请求拥有请求的模型的渠道。没设置权重的渠道自动回退到 round_robin 轮训负载均衡。
185
+ AUTO_RETRY: true
186
+
187
+ preferences: # 全局配置
188
+ model_timeout: # 模型超时时间,单位为秒,默认 100 秒,选填
189
+ gpt-4o: 10 # 模型 gpt-4o 的超时时间为 10 秒,gpt-4o 是模型名称,当请求 gpt-4o-2024-08-06 等模型时,超时时间也是 10 秒
190
+ claude-3-5-sonnet: 10 # 模型 claude-3-5-sonnet 的超时时间为 10 秒,当请求 claude-3-5-sonnet-20240620 等模型时,超时时间也是 10 秒
191
+ default: 10 # 模型没有设置超时时间,使用默认的超时时间 10 秒,当请求的不在 model_timeout 里面的模型时,超时时间默认是 10 秒,不设置 default,uni-api 会使用 环境变量 TIMEOUT 设置的默认超时时间,默认超时时间是 100 秒
192
+ o1-mini: 30 # 模型 o1-mini 的超时时间为 30 秒,当请求名字是 o1-mini 开头的模型时,超时时间是 30 秒
193
+ o1-preview: 100 # 模型 o1-preview 的超时时间为 100 秒,当请求名字是 o1-preview 开头的模型时,超时时间是 100 秒
194
+ cooldown_period: 300 # 渠道冷却时间,单位为秒,默认 300 秒,选填。当模型请求失败时,会自动将该渠道排除冷却一段时间,不再请求该渠道,冷却时间结束后,会自动将该模型恢复,直到再次请求失败,会重新冷却。当 cooldown_period 设置为 0 时,不启用冷却机制。
195
+ rate_limit: 999999/min # uni-api 全局速率限制,单位为次数/分钟,支持多个频率约束条件,例如:15/min,10/day。默认 999999/min,选填。
196
+ keepalive_interval: # 心跳间隔,单位为秒,默认 80 秒,选填。适合当 uni-api 域名托管在 cloudflare 并使用推理模型时使用。
197
+ gemini-2.5-pro: 50 # 模型 gemini-2.5-pro 的心跳间隔为 50 秒,此数值必须小于 model_timeout 设置的超时时间,否则忽略此设置。
198
+ error_triggers: # 错误触发器,当模型返回的消息包含错误触发器中的任意一个字符串时,该渠道会自动返回报错。选填
199
+ - The bot's usage is covered by the developer
200
+ - process this request due to overload or policy
201
+ proxy: socks5://[username]:[password]@[ip]:[port] # 全局代理地址,选填。
202
+ ```
203
+
204
+ 挂载配置文件并启动 uni-api docker 容器:
205
+
206
+ ```bash
207
+ docker run --user root -p 8001:8000 --name uni-api -dit \
208
+ -v ./api.yaml:/home/api.yaml \
209
+ yym68686/uni-api:latest
210
+ ```
211
+
212
+ ### 方法二:使用 `CONFIG_URL` 环境变量启动 uni-api
213
+
214
+ 按照方法一写完配置文件后,上传到云端硬盘,获取文件的直链,然后使用 `CONFIG_URL` 环境变量启动 uni-api docker 容器:
215
+
216
+ ```bash
217
+ docker run --user root -p 8001:8000 --name uni-api -dit \
218
+ -e CONFIG_URL=http://file_url/api.yaml \
219
+ yym68686/uni-api:latest
220
+ ```
221
+
222
+ ## 环境变量
223
+
224
+ - CONFIG_URL: 配置文件的下载地址,可以是本地文件,也可以是远程文件,选填
225
+ - TIMEOUT: 请求超时时间,默认为 100 秒,超时时间可以控制当一个渠道没有响应时,切换下一个渠道需要的时间。选填
226
+ - DISABLE_DATABASE: 是否禁用数据库,默认为 false,选填
227
+
228
+ ## Koyeb 远程部署
229
+
230
+ 点击下面的按钮可以自动使用构建好的 uni-api docker 镜像一键部署:
231
+
232
+ [![Deploy to Koyeb](https://www.koyeb.com/static/images/deploy/button.svg)](https://app.koyeb.com/deploy?name=uni-api&type=docker&image=docker.io%2Fyym68686%2Funi-api%3Alatest&instance_type=free&regions=was&instances_min=0&env%5BCONFIG_URL%5D=)
233
+
234
+ 让 Koyeb 读取配置文件有两种方法,选一种即可:
235
+
236
+ 1. 填写环境变量 `CONFIG_URL` 为配置文件的直链
237
+
238
+ 2. 直接粘贴 api.yaml 文件内容:在 Koyeb 环境变量设置中选择 file 类型,把 api.yaml 的内容粘贴到文本框,并在下方 path 填写 `/home/api.yaml`。
239
+
240
+ 最后点击 Deploy 部署按钮。
241
+
242
+ ## Ubuntu 部署
243
+
244
+ 在仓库 Releases 找到对应的二进制文件最新版本,例如名为 uni-api-linux-x86_64-0.0.99.pex 的文件。在服务器下载二进制文件并运行:
245
+
246
+ ```bash
247
+ wget https://github.com/yym68686/uni-api/releases/download/v0.0.99/uni-api-linux-x86_64-0.0.99.pex
248
+ chmod +x uni-api-linux-x86_64-0.0.99.pex
249
+ ./uni-api-linux-x86_64-0.0.99.pex
250
+ ```
251
+
252
+ ## serv00 远程部署(FreeBSD 14.0)
253
+
254
+ 首先登录面板,Additional services 里面点击选项卡 Run your own applications 开启允许运行自己的程序,然后到面板 Port reservation 去随便开一个端口。
255
+
256
+ 如果没有自己的域名,去面板 WWW websites 删掉默认给的域名,再新建一个域名 Domain 为刚才删掉的域名,点击 Advanced settings 后设置 Website type 为 Proxy 域名,Proxy port 指向你刚才开的端口,不要选中 Use HTTPS。
257
+
258
+ ssh 登陆到 serv00 服务器,执行下面的命令:
259
+
260
+ ```bash
261
+ git clone --depth 1 -b main --quiet https://github.com/yym68686/uni-api.git
262
+ cd uni-api
263
+ python -m venv uni-api
264
+ tmux new -A -s uni-api
265
+ source uni-api/bin/activate
266
+ export CFLAGS="-I/usr/local/include"
267
+ export CXXFLAGS="-I/usr/local/include"
268
+ export CC=gcc
269
+ export CXX=g++
270
+ export MAX_CONCURRENCY=1
271
+ export CPUCOUNT=1
272
+ export MAKEFLAGS="-j1"
273
+ CMAKE_BUILD_PARALLEL_LEVEL=1 cpuset -l 0 pip install -vv -r requirements.txt
274
275
+ ```
276
+
277
+ ctrl+b d 退出 tmux 等待几个小时安装完成,安装完成后执行下面的命令:
278
+
279
+ ```bash
280
+ tmux new -A -s uni-api
281
+ source uni-api/bin/activate
282
+ export CONFIG_URL=http://file_url/api.yaml
283
+ export DISABLE_DATABASE=true
284
+ # 修改端口,xxx 为端口,自行修改,对应刚刚在面板 Port reservation 开的端口
285
+ sed -i '' 's/port=8000/port=xxx/' main.py
286
+ sed -i '' 's/reload=True/reload=False/' main.py
287
+ python main.py
288
+ ```
289
+
290
+ 使用 ctrl+b d 退出 tmux,即可让程序后台运行。此时就可以在其他聊天客户端使用 uni-api 了。curl 测试脚本:
291
+
292
+ ```bash
293
+ curl -X POST https://xxx.serv00.net/v1/chat/completions \
294
+ -H 'Content-Type: application/json' \
295
+ -H 'Authorization: Bearer sk-xxx' \
296
+ -d '{"model": "gpt-4o","messages": [{"role": "user","content": "你好"}]}'
297
+ ```
298
+
299
+ 参考文档:
300
+
301
+ https://docs.serv00.com/Python/
302
+
303
+ https://linux.do/t/topic/201181
304
+
305
+ https://linux.do/t/topic/218738
306
+
307
+ ## Docker 本地部署
308
+
309
+ Start the container
310
+
311
+ ```bash
312
+ docker run --user root -p 8001:8000 --name uni-api -dit \
313
+ -e CONFIG_URL=http://file_url/api.yaml \ # 如果已经挂载了本地配置文件,不需要设置 CONFIG_URL
314
+ -v ./api.yaml:/home/api.yaml \ # 如果已经设置 CONFIG_URL,不需要挂载配置文件
315
+ -v ./uniapi_db:/home/data \ # 如果不想保存统计数据,不需要挂载该文件夹
316
+ yym68686/uni-api:latest
317
+ ```
318
+
319
+ Or if you want to use Docker Compose, here is a docker-compose.yml example:
320
+
321
+ ```yaml
322
+ services:
323
+ uni-api:
324
+ container_name: uni-api
325
+ image: yym68686/uni-api:latest
326
+ environment:
327
+ - CONFIG_URL=http://file_url/api.yaml # 如果已经挂载了本地配置文件,不需要设置 CONFIG_URL
328
+ ports:
329
+ - 8001:8000
330
+ volumes:
331
+ - ./api.yaml:/home/api.yaml # 如果已经设置 CONFIG_URL,不需要挂载配置文件
332
+ - ./uniapi_db:/home/data # 如果不想保存统计数据,不需要挂载该文件夹
333
+ ```
334
+
335
+ CONFIG_URL 就是可以自动下载远程的配置文件。比如你在某个平台不方便修改配置文件,可以把配置文件传到某个托管服务,可以提供直链给 uni-api 下载,CONFIG_URL 就是这个直链。如果使用本地挂载的配置文件,不需要设置 CONFIG_URL。CONFIG_URL 是在不方便挂载配置文件的情况下使用。
336
+
337
+ Run Docker Compose container in the background
338
+
339
+ ```bash
340
+ docker-compose pull
341
+ docker-compose up -d
342
+ ```
343
+
344
+ Docker build
345
+
346
+ ```bash
347
+ docker build --no-cache -t uni-api:latest -f Dockerfile --platform linux/amd64 .
348
+ docker tag uni-api:latest yym68686/uni-api:latest
349
+ docker push yym68686/uni-api:latest
350
+ ```
351
+
352
+ One-Click Restart Docker Image
353
+
354
+ ```bash
355
+ set -eu
356
+ docker pull yym68686/uni-api:latest
357
+ docker rm -f uni-api
358
+ docker run --user root -p 8001:8000 -dit --name uni-api \
359
+ -e CONFIG_URL=http://file_url/api.yaml \
360
+ -v ./api.yaml:/home/api.yaml \
361
+ -v ./uniapi_db:/home/data \
362
+ yym68686/uni-api:latest
363
+ docker logs -f uni-api
364
+ ```
365
+
366
+ RESTful curl test
367
+
368
+ ```bash
369
+ curl -X POST http://127.0.0.1:8000/v1/chat/completions \
370
+ -H "Content-Type: application/json" \
371
+ -H "Authorization: Bearer ${API}" \
372
+ -d '{"model": "gpt-4o","messages": [{"role": "user", "content": "Hello"}],"stream": true}'
373
+ ```
374
+
375
+ pex linux 打包:
376
+
377
+ ```bash
378
+ VERSION=$(cat VERSION)
379
+ pex -D . -r requirements.txt \
380
+ -c uvicorn \
381
+ --inject-args 'main:app --host 0.0.0.0 --port 8000' \
382
+ --platform linux_x86_64-cp-3.10.12-cp310 \
383
+ --interpreter-constraint '==3.10.*' \
384
+ --no-strip-pex-env \
385
+ -o uni-api-linux-x86_64-${VERSION}.pex
386
+ ```
387
+
388
+ macos 打包:
389
+
390
+ ```bash
391
+ VERSION=$(cat VERSION)
392
+ pex -r requirements.txt \
393
+ -c uvicorn \
394
+ --inject-args 'main:app --host 0.0.0.0 --port 8000' \
395
+ -o uni-api-macos-arm64-${VERSION}.pex
396
+ ```
397
+
398
+ ## HuggingFace Space 远程部署
399
+
400
+ WARN: 请注意远程部署的密钥泄露风险,请勿滥用服务以避免封号
401
+ Space 仓库需要提供三个文件 `Dockerfile`、`README.md`、`entrypoint.sh`
402
+ 运行程序还需要 api.yaml(我以全量放在机密中为例,也可以HTTP下载的方式实现),访问匹配、模型和渠道配置等均在配置文件中
403
+ 操作步骤
404
+ 1. 访问 https://huggingface.co/new-space 新建一个 Space,需要是 public 库,开源协议/名字/描述等随便填
405
+ 2. 访问你的space的file,URL是 https://huggingface.co/spaces/your-name/your-space-name/tree/main,把下面三个文件上传(`Dockerfile`、`README.md`、`entrypoint.sh`)
406
+ 3. 访问你的space的setting,URL是 https://huggingface.co/spaces/your-name/your-space-name/settings 找到 Secrets 新建机密 `API_YAML_CONTENT`(注意大写),把你的api.yaml在本地写好后直接复制进去,UTF-8编码
407
+ 4. 继续在设置中,找到 Factory rebuild 让它重新构建,如果你修改机密或者文件或者手动重启Sapce等情况均有可能导致卡住无log,此时就用这个方法解决
408
+ 5. 在设置最右上角有三个点的按钮,找到 Embed this Space 获取Space的公网链接,格式 https://(your-name)-(your-space-name).hf.space 去掉括号
409
+
410
+ 相关的文件代码如下
411
```Dockerfile
# Dockerfile — remember to delete this line
# Use the official uni-api image
FROM yym68686/uni-api:latest

# Create the data directory and set its permissions
RUN mkdir -p /data && chown -R 1000:1000 /data

# Set up the user and working directory
RUN useradd -m -u 1000 user
USER user
ENV HOME=/home/user \
    PATH=/home/user/.local/bin:$PATH \
    DISABLE_DATABASE=true

# Copy the entrypoint script
COPY --chown=user entrypoint.sh /home/user/entrypoint.sh
RUN chmod +x /home/user/entrypoint.sh

# Make sure /home is writable (this is important!)
USER root
RUN chmod 777 /home
USER user

# Set the working directory
WORKDIR /home/user

# Entrypoint
ENTRYPOINT ["/home/user/entrypoint.sh"]
```

```markdown
# README.md — overwrite the default one; remember to delete this line
---
title: Uni API
emoji: 🌍
colorFrom: gray
colorTo: yellow
sdk: docker
app_port: 8000
pinned: false
license: gpl-3.0
---
```
```shell
# entrypoint.sh — remember to delete this line
#!/bin/sh
set -e
CONFIG_FILE_PATH="/home/api.yaml" # note: this must be /home/api.yaml

echo "DEBUG: Entrypoint script started."

# Check whether the secret exists
if [ -z "$API_YAML_CONTENT" ]; then
  echo "ERROR: Secret 'API_YAML_CONTENT' is missing or empty. Exiting."
  exit 1
else
  echo "DEBUG: API_YAML_CONTENT secret found. Preparing to write..."
  printf '%s\n' "$API_YAML_CONTENT" > "$CONFIG_FILE_PATH"
  echo "DEBUG: Attempted to write to $CONFIG_FILE_PATH."

  if [ -f "$CONFIG_FILE_PATH" ]; then
    echo "DEBUG: File $CONFIG_FILE_PATH created successfully. Size: $(wc -c < "$CONFIG_FILE_PATH") bytes."
    # Show the first few lines for debugging (careful not to reveal sensitive info)
    echo "DEBUG: First few lines (without sensitive info):"
    head -n 3 "$CONFIG_FILE_PATH" | grep -v "api:" | grep -v "password"
  else
    echo "ERROR: File $CONFIG_FILE_PATH was NOT created."
    exit 1
  fi
fi

echo "DEBUG: About to execute python main.py..."
# No --config flag is needed; the program uses the default path
cd /home
exec python main.py "$@"
```

## uni-api Frontend Deployment

The uni-api web frontend can be self-hosted: https://github.com/yym68686/uni-api-web

You can also use the instance I have already deployed: https://uni-api-web.pages.dev/

## Sponsors

We thank the following sponsors for their support:
<!-- ¥2050 -->
- @PowerHunter: ¥2000
- @IM4O4: ¥100
- @ioi: ¥50

## How to Sponsor Us

If you would like to support our project, you can sponsor us in the following ways:

1. [PayPal](https://www.paypal.me/yym68686)

2. [USDT-TRC20](https://pb.yym68686.top/~USDT-TRC20), USDT-TRC20 wallet address: `TLFbqSv5pDu5he43mVmK1dNx7yBMFeN7d8`

3. [WeChat](https://pb.yym68686.top/~wechat)

4. [Alipay](https://pb.yym68686.top/~alipay)

Thank you for your support!

## FAQ

- Why does the error `Error processing request or performing moral check: 404: No matching model found` keep appearing?

Setting ENABLE_MODERATION to false fixes this. When ENABLE_MODERATION is true, the API must be able to use the text-moderation-latest model; if text-moderation-latest is not provided in any provider's model settings, a model-not-found error is raised.

- How do I make requests go to a specific channel first? How do I set channel priority?

Simply order the channels inside api_keys; no other settings are needed. Example configuration file:

```yaml
providers:
  - provider: ai1
    base_url: https://xxx/v1/chat/completions
    api: sk-xxx

  - provider: ai2
    base_url: https://xxx/v1/chat/completions
    api: sk-xxx

api_keys:
  - api: sk-1234
    model:
      - ai2/*
      - ai1/*
```

With this setup, ai2 is requested first, and ai1 only after ai2 fails.

- What is the behavior behind the various scheduling algorithms, e.g. fixed_priority, weighted_round_robin, lottery, random, round_robin?

All scheduling algorithms are enabled by setting api_keys.(api).preferences.SCHEDULING_ALGORITHM in the configuration file to one of fixed_priority, weighted_round_robin, lottery, random, or round_robin.

1. fixed_priority: fixed-priority scheduling. Every request goes to the first channel that has the requested model. On error, it switches to the next channel. This is the default scheduling algorithm.

2. weighted_round_robin: weighted round-robin load balancing; channels that have the requested model are tried in the weight order set in api_keys.(api).model.

3. lottery: lottery load balancing; a channel that has the requested model is drawn at random according to the weights set in api_keys.(api).model.

4. round_robin: round-robin load balancing; channels that have the requested model are tried in the order configured in api_keys.(api).model. See the previous question for how to set channel priority.

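
These behaviors can be sketched in Python. The sketch below is illustrative only, not uni-api's actual implementation; the function name `pick_order` and the `request_index` counter are our own.

```python
import random

def pick_order(channels, weights=None, algorithm="fixed_priority", request_index=0):
    """Return the order in which channels would be tried for one request.

    channels: the channels that serve the requested model, in config order.
    weights: per-channel weights (defaults to 1 each).
    """
    weights = weights or [1] * len(channels)
    if algorithm == "fixed_priority":
        # Always start from the first channel; failover walks down the list.
        return list(channels)
    if algorithm == "round_robin":
        # Rotate the starting channel across successive requests.
        i = request_index % len(channels)
        return channels[i:] + channels[:i]
    if algorithm == "weighted_round_robin":
        # Visit each channel a number of times proportional to its weight.
        order = []
        for ch, w in zip(channels, weights):
            order.extend([ch] * w)
        return order
    if algorithm == "lottery":
        # One weighted random draw decides the channel to try.
        return random.choices(channels, weights=weights, k=1)
    if algorithm == "random":
        # Uniform random shuffle per request.
        return random.sample(channels, len(channels))
    raise ValueError(f"unknown algorithm: {algorithm}")
```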
- How should base_url be filled in correctly?

Apart from a few special channels shown in the advanced configuration, all OpenAI-format providers need the complete base_url, i.e. base_url must end with /v1/chat/completions. If you are using GitHub Models, base_url should be https://models.inference.ai.azure.com/chat/completions, not the Azure URL.

For Azure channels, base_url accepts any of the following forms: https://your-endpoint.services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview, https://your-endpoint.services.ai.azure.com/models/chat/completions, and https://your-endpoint.openai.azure.com. The first form is recommended. If api-version is not specified explicitly, version 2024-10-21 is used by default.

- How is the model timeout determined? What is the precedence between channel-level and global model timeout settings?

Channel-level timeout settings take precedence over the global model timeout. The precedence order is: channel-level model timeout > channel-level default timeout > global model timeout > global default timeout > the TIMEOUT environment variable.

Adjusting the model timeout lets you avoid timeout errors on certain channels. If you hit the error `{'error': '500', 'details': 'fetch_response_stream Read Response Timeout'}`, try increasing the model timeout.

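
The precedence chain amounts to a first-non-empty lookup. A minimal Python sketch under that assumption (the function and argument names are ours, not uni-api's API):

```python
def resolve_timeout(channel_model, channel_default, global_model, global_default, env_timeout):
    """Return the first timeout configured along the precedence chain,
    from most specific (channel-level model) to least specific (env TIMEOUT)."""
    for value in (channel_model, channel_default, global_model, global_default, env_timeout):
        if value is not None:
            return value
    return None  # nothing configured anywhere
```

For example, a channel-level default of 30s overrides a global model timeout of 60s, because it sits earlier in the chain.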
- How does api_key_rate_limit work? How do I set the same rate limit for multiple models?

To set the same rate limit for the four models gemini-1.5-pro-latest, gemini-1.5-pro, gemini-1.5-pro-001, and gemini-1.5-pro-002 at once, configure:

```yaml
api_key_rate_limit:
  gemini-1.5-pro: 1000/min
```

This matches every model whose name contains the string gemini-1.5-pro, so all four models above get a limit of 1000/min. The logic of the api_key_rate_limit field works as follows; here is an example configuration:

```yaml
api_key_rate_limit:
  gemini-1.5-pro: 1000/min
  gemini-1.5-pro-002: 500/min
```

Now suppose a request uses the model gemini-1.5-pro-002.

First, uni-api tries an exact match against the models in api_key_rate_limit. Since a limit for gemini-1.5-pro-002 is configured, its limit is 500/min. If the requested model were instead gemini-1.5-pro-latest, which has no entry in api_key_rate_limit, uni-api looks for a configured model name that is a prefix of gemini-1.5-pro-latest; it finds gemini-1.5-pro, so the limit for gemini-1.5-pro-latest becomes 1000/min.

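
This exact-match-then-prefix-match lookup can be sketched in Python. The sketch is illustrative only; in particular, choosing the longest matching prefix when several entries match is our assumption, not confirmed uni-api behavior.

```python
def resolve_rate_limit(model, limits, default=None):
    """Resolve a model's rate limit from an api_key_rate_limit-style mapping."""
    # 1) An exact model-name match wins.
    if model in limits:
        return limits[model]
    # 2) Otherwise fall back to configured names that are a prefix of the model name.
    prefixes = [name for name in limits if model.startswith(name)]
    if prefixes:
        return limits[max(prefixes, key=len)]  # most specific prefix (assumption)
    return default

limits = {"gemini-1.5-pro": "1000/min", "gemini-1.5-pro-002": "500/min"}
```

With this mapping, `resolve_rate_limit("gemini-1.5-pro-002", limits)` hits the exact entry, while `resolve_rate_limit("gemini-1.5-pro-latest", limits)` falls back to the gemini-1.5-pro prefix.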
- I want channels 1 and 2 in random rotation, and uni-api to retry channel 3 only after channels 1 and 2 both fail. How do I set that up?

uni-api supports using an api key itself as a channel, which lets you manage channels in groups:

```yaml
api_keys:
  - api: sk-xxx1
    model:
      - sk-xxx2/* # channels 1 and 2 in random rotation; channel 3 is requested on failure
      - aws/* # channel 3
    preferences:
      SCHEDULING_ALGORITHM: fixed_priority # always try channels 1 and 2 inside api key sk-xxx2 first, then automatically fall back to channel 3

  - api: sk-xxx2
    model:
      - anthropic/claude-3-7-sonnet # channel 1
      - openrouter/claude-3-7-sonnet # channel 2
    preferences:
      SCHEDULING_ALGORITHM: random # channels 1 and 2 in random rotation
```

- I want to use Cloudflare AI Gateway; how should base_url be filled in?

For Gemini channels, the Cloudflare AI Gateway base_url is https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_name}/google-ai-studio/v1beta/openai/chat/completions, where {account_id} and {gateway_name} must be replaced with your Cloudflare account ID and gateway name.

For Vertex channels, the Cloudflare AI Gateway base_url is https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_name}/google-vertex-ai, with the same substitutions.

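
For reference, both URL shapes can be produced with a small helper. The function is ours, purely illustrative; the URL formats are exactly the two documented above.

```python
def cf_gateway_base_url(account_id, gateway_name, backend="google-ai-studio"):
    """Build a Cloudflare AI Gateway base_url for a Gemini or Vertex channel."""
    base = f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_name}"
    if backend == "google-ai-studio":  # Gemini channel
        return f"{base}/google-ai-studio/v1beta/openai/chat/completions"
    if backend == "google-vertex-ai":  # Vertex channel
        return f"{base}/google-vertex-ai"
    raise ValueError(f"unsupported backend: {backend}")
```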
## ⭐ Star History

<a href="https://github.com/yym68686/uni-api/stargazers">
  <img width="500" alt="Star History Chart" src="https://api.star-history.com/svg?repos=yym68686/uni-api&type=Date">
</a>