Commit ebcd387 by caidaohz · 1 parent: b6bb5db

feat: Initialize OnDemand2API with Go, including API key management and chat completion functionality

- Removed requirements.txt as the project is now Go-based
- Added .gitignore to exclude .env files
- Created config.example.env for environment variable configuration
- Initialized go.mod and go.sum for dependency management
- Implemented main.go with chat completion API, session management, and error handling
- Added health check and model listing endpoints
- Integrated logging and middleware for API key validation

Files changed (11)
  1. .gitignore +1 -0
  2. Dockerfile +40 -10
  3. README.md +229 -2
  4. config.example.env +32 -0
  5. docker-compose.yml +17 -11
  6. go.mod +36 -0
  7. go.sum +90 -0
  8. gunicorn.conf.py +0 -81
  9. main.go +812 -0
  10. openai_ondemand_adapter.py +0 -325
  11. requirements.txt +0 -3
.gitignore ADDED
@@ -0,0 +1 @@
+ .env
Dockerfile CHANGED
@@ -1,11 +1,41 @@
- FROM python:3.10-slim
- # Install pip dependencies
- WORKDIR /workspace
- COPY requirements.txt .
- RUN pip install --no-cache-dir -r requirements.txt
- # Copy the source code
- COPY . .
- # A Space must listen on 0.0.0.0:7860 or 3000; 7860 is recommended.
- ENV PORT=7860
+ # Multi-stage build
+ FROM golang:1.21-alpine AS builder
+
+ # Set the working directory
+ WORKDIR /app
+
+ # Install required tools
+ RUN apk add --no-cache git
+
+ # Copy the Go module files
+ COPY go.mod go.sum ./
+
+ # Download dependencies
+ RUN go mod download
+
+ # Copy the source code
+ COPY main.go ./
+
+ # Build the application
+ RUN CGO_ENABLED=0 GOOS=linux go build -a -installsuffix cgo -o main .
+
+ # Runtime stage
+ FROM alpine:latest
+
+ # Install ca-certificates for HTTPS requests
+ RUN apk --no-cache add ca-certificates curl
+
+ WORKDIR /root/
+
+ # Copy the binary from the build stage
+ COPY --from=builder /app/main .
+
+ # Expose the port
  EXPOSE 7860
- CMD ["gunicorn", "--config", "gunicorn.conf.py", "openai_ondemand_adapter:app"]
+
+ # Health check
+ HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
+     CMD curl -f http://localhost:7860/v1/models || exit 1
+
+ # Run the application
+ CMD ["./main"]
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- title: openai ondemand adapter
+ title: OnDemand2Api
  emoji: 😻
  colorFrom: red
  colorTo: red
@@ -7,4 +7,231 @@ sdk: docker
  pinned: false
  ---
  
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
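+ # OpenAI OnDemand Adapter - Go Version
+
+ A high-performance Go implementation that translates OpenAI API requests into OnDemand API calls, with support for asynchronous, concurrent request handling.
+
+ ## Key Features
+
+ ### 🚀 Performance
+ - **Asynchronous concurrency**: uses goroutines and channels to handle many requests concurrently
+ - **Connection reuse**: the HTTP client reuses connections to reduce connection overhead
+ - **Memory efficiency**: efficient memory management that avoids leaks
+ - **Multi-stage Docker build**: minimizes the final image size
+
+ ### 🔧 Core Functionality
+ - **API key management**: automatic rotation and failover across multiple API keys
+ - **Session management**: maintains OnDemand API session state and resets the session automatically after a timeout
+ - **Streaming responses**: supports Server-Sent Events (SSE) streaming
+ - **Model mapping**: flexible mapping from OpenAI model names to OnDemand endpoints
+ - **Error handling**: thorough error handling with automatic retries
+ - **Health check**: built-in health check endpoint
+
+ ### 🛡️ Security
+ - **API authentication**: accepts an `Authorization: Bearer` token or an `X-API-KEY` header
+ - **Read-only filesystem**: the Docker container runs with a read-only root filesystem
+ - **Resource limits**: resource limits and security options for the Docker container
+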
+ ## Quick Start
+
+ ### Requirements
+ - Go 1.21+
+ - Docker & Docker Compose (optional)
+
+ ### Run Locally
+
+ 1. **Clone the project and install dependencies**
+ ```bash
+ git clone <repository>
+ cd ondemand2api
+ go mod download
+ ```
+
+ 2. **Set the environment variables**
+ ```bash
+ export PRIVATE_KEY="your_private_key_here"
+ export ONDEMAND_APIKEYS="key1,key2,key3"
+ export PORT=7860 # optional, defaults to 7860
+ export GIN_MODE=release # optional: debug, release, test
+ ```
+
+ 3. **Run the application**
+ ```bash
+ go run main.go
+ ```
+
+ ### Run with Docker
+
+ 1. **Build and run**
+ ```bash
+ # Build the image
+ docker build -t ondemand2api .
+
+ # Run the container
+ docker run -p 7860:7860 \
+   -e PRIVATE_KEY="your_private_key_here" \
+   -e ONDEMAND_APIKEYS="key1,key2,key3" \
+   ondemand2api
+ ```
+
+ 2. **Use Docker Compose**
+ ```bash
+ # Edit the environment variables in docker-compose.yml,
+ # then run:
+ docker-compose up -d
+ ```
+
+ ## API Endpoints
+
+ ### Chat Completions
+ ```http
+ POST /v1/chat/completions
+ Authorization: Bearer your_private_key_here
+ Content-Type: application/json
+
+ {
+   "model": "gpt-4o",
+   "messages": [
+     {"role": "user", "content": "Hello!"}
+   ],
+   "stream": false
+ }
+ ```
+
+ ### List Models
+ ```http
+ GET /v1/models
+ Authorization: Bearer your_private_key_here
+ ```
+
+ ### Health Check
+ ```http
+ GET /
+ ```
+
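As an illustration of the chat completions call above, the following sketch builds the same request from Go using only the standard library. The helper `newChatRequest` and the types `chatRequest` and `chatMessage` are hypothetical names for this example, and the base URL assumes the service is running locally on the default port.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// chatMessage and chatRequest mirror the JSON body shown in the README.
type chatMessage struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string        `json:"model"`
	Messages []chatMessage `json:"messages"`
	Stream   bool          `json:"stream"`
}

// newChatRequest builds a non-streaming POST /v1/chat/completions request
// carrying the private key as a Bearer token.
func newChatRequest(baseURL, privateKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []chatMessage{{Role: "user", Content: prompt}},
		Stream:   false,
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/v1/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Bearer "+privateKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newChatRequest("http://localhost:7860", "your_private_key_here", "gpt-4o", "Hello!")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Only the request is built here; pass it to http.DefaultClient.Do to send it.
	fmt.Println(req.Method, req.URL.String())
}
```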
+ ## Configuration
+
+ ### Environment Variables
+
+ | Variable | Required | Default | Description |
+ |----------|----------|---------|-------------|
+ | `PRIVATE_KEY` | Yes | testofli | Key that clients use to access this API |
+ | `ONDEMAND_APIKEYS` | Yes | - | Comma-separated list of OnDemand API keys |
+ | `PORT` | No | 7860 | Service port |
+ | `GIN_MODE` | No | release | Gin run mode |
+
+ ### Supported Model Mappings
+
+ | OpenAI model | OnDemand endpoint |
+ |--------------|-------------------|
+ | o3 | predefined-openai-gpto3 |
+ | o3-mini | predefined-openai-gpto3-mini |
+ | gpt-4o | predefined-openai-gpt4o |
+ | gpt-4.1 | predefined-openai-gpt4.1 |
+ | deepseek-v3 | predefined-deepseek-v3 |
+ | deepseek-r1 | predefined-deepseek-r1 |
+ | claude-4-sonnet | predefined-claude-4-sonnet |
+ | gemini-2.5-pro | predefined-gemini-2.5-pro-preview |
+
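+ ## Performance Characteristics
+
+ ### Concurrency
+ - **Goroutines**: each request is handled in its own goroutine
+ - **Channel communication**: buffered channels carry streaming responses
+ - **Connection reuse**: the HTTP client reuses connections automatically
+ - **Timeout control**: context-based timeouts throughout
+
+ ### Memory Management
+ - **GC-friendly design**: sensible object lifetimes
+ - **Buffer reuse**: efficient use of memory buffers
+ - **Automatic cleanup**: `defer` statements ensure resources are released promptly
+
+ ### Error Handling
+ - **Tiered retries**: retry behavior depends on the error type
+ - **Circuit breaking**: failing API keys are detected and recovered automatically
+ - **Logging**: detailed operation logs and error traces
+
+ ## Monitoring and Logging
+
+ ### Log Output
+ The application emits structured logs covering:
+ - Request handling
+ - API key usage status
+ - Session management state
+ - Errors and exceptions
+
+ ### Health Check
+ - HTTP health check endpoint: `GET /`
+ - Docker health check: automatically checks service availability
+ - Returns the status of the API key pool
+
+ ## Comparison with the Python Version
+
+ | Feature | Python version | Go version |
+ |---------|----------------|------------|
+ | **Performance** | Moderate | High |
+ | **Concurrency model** | Thread pool | Goroutines |
+ | **Memory usage** | Higher | Lower |
+ | **Startup time** | Slower | Fast |
+ | **Resource footprint** | High | Low |
+ | **Concurrency capability** | Limited by the GIL | Native concurrency |
+ | **Deployment size** | Large | Small |
+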
+ ## Development Notes
+
+ ### Project Structure
+ ```
+ .
+ ├── main.go # Main application file
+ ├── go.mod # Go module definition
+ ├── go.sum # Dependency lock file
+ ├── Dockerfile # Docker build file
+ ├── docker-compose.yml # Docker Compose configuration
+ └── README.md # Project documentation
+ ```
+
+ ### Key Components
+
+ 1. **KeyManager**: API key manager
+    - Automatic key rotation
+    - Failure detection and recovery
+    - Session state management
+
+ 2. **HTTP handlers**:
+    - Gin routing
+    - Authentication middleware
+    - Streaming response handling
+
+ 3. **Concurrency control**:
+    - Context-based timeouts
+    - Goroutine pool management
+    - Channel communication
+
+ ## Troubleshooting
+
+ ### Common Issues
+
+ 1. **Port already in use**
+ ```bash
+ # Check what is using the port
+ lsof -i :7860
+ # Or use a different port
+ export PORT=8080
+ ```
+
+ 2. **API key problems**
+ ```bash
+ # Check the environment variable
+ echo $ONDEMAND_APIKEYS
+ # Check the key status reported in the logs
+ ```
+
+ 3. **Memory usage**
+ ```bash
+ # Monitor container resource usage
+ docker stats ondemand2api
+ ```
+
+ ## License
+
+ This project is a Go rewrite of the original Python project. It keeps the same functionality while improving performance and concurrency.
config.example.env ADDED
@@ -0,0 +1,32 @@
+ # OnDemand2API (Go version) - example environment configuration
+
+ # ====== Required ======
+ # Private key for API access (used to authenticate clients)
+ PRIVATE_KEY=your_private_key_here
+
+ # OnDemand API keys (comma-separated; multiple keys are rotated)
+ ONDEMAND_APIKEYS=key1,key2,key3
+
+ # ====== Optional ======
+ # Service port (default 7860)
+ PORT=7860
+
+ # Gin run mode (debug, release, test)
+ GIN_MODE=release
+
+ # ====== Advanced ======
+ # Custom OnDemand API base URL (normally no need to change)
+ # ONDEMAND_API_BASE=https://api.on-demand.io/chat/v1
+
+ # ====== Usage ======
+ # 1. Copy this file to .env, or set the environment variables directly
+ # 2. Replace the values above with real ones
+ # 3. Run the application:
+ # - Locally: make run
+ # - Docker: make docker-run
+ # - Docker Compose: make docker-compose-up
+
+ # ====== Security Notes ======
+ # - Keep PRIVATE_KEY and ONDEMAND_APIKEYS safe
+ # - Do not commit configuration files containing real keys to the repository
+ # - In production, prefer environment variables or a secrets manager
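Because `ONDEMAND_APIKEYS` is a comma-separated list, stray spaces or a trailing comma in the `.env` file can end up inside a key. A minimal sketch of a more tolerant parser follows; `parseAPIKeys` is a hypothetical helper that is not part of this commit, which splits on commas without trimming.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAPIKeys splits a comma-separated key list, trims surrounding
// whitespace from each entry, and drops empty entries.
func parseAPIKeys(raw string) []string {
	var keys []string
	for _, part := range strings.Split(raw, ",") {
		if key := strings.TrimSpace(part); key != "" {
			keys = append(keys, key)
		}
	}
	return keys
}

func main() {
	fmt.Println(parseAPIKeys(" key1, key2 ,,key3 ")) // prints [key1 key2 key3]
}
```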
docker-compose.yml CHANGED
@@ -2,7 +2,9 @@ version: '3.8'
  
  services:
    ondemand2api:
-     build: .
+     build:
+       context: .
+       dockerfile: Dockerfile
      ports:
        - "7860:7860"
      environment:
@@ -10,24 +12,19 @@ services:
        - PRIVATE_KEY=your_private_key_here
        - ONDEMAND_APIKEYS=key1,key2,key3
  
-       # Gunicorn configuration (optional; defaults apply)
-       - GUNICORN_WORKERS=4 # Number of worker processes; defaults to CPU cores * 2 + 1
-       - GUNICORN_THREADS=4 # Threads per worker; defaults to 4
-       - GUNICORN_TIMEOUT=120 # Timeout in seconds; defaults to 120
-
        # Server configuration (optional)
        - PORT=7860 # Port; defaults to 7860
-       - HOST=0.0.0.0 # Bind address; defaults to 0.0.0.0
+       - GIN_MODE=release # Gin run mode: debug, release, test
  
      # Resource limits (optional)
      deploy:
        resources:
          limits:
            cpus: '2.0'
-           memory: 1G
-         reservations:
-           cpus: '0.5'
            memory: 512M
+         reservations:
+           cpus: '0.25'
+           memory: 128M
  
      # Health check
      healthcheck:
@@ -35,7 +32,7 @@ services:
        interval: 30s
        timeout: 10s
        retries: 3
-       start_period: 40s
+       start_period: 10s
  
      # Restart policy
      restart: unless-stopped
@@ -46,3 +43,12 @@ services:
        options:
          max-size: "10m"
          max-file: "3"
+
+     # Security options
+     security_opt:
+       - no-new-privileges:true
+
+     # Read-only root filesystem (improves security)
+     read_only: true
+     tmpfs:
+       - /tmp:noexec,nosuid,size=100m
go.mod ADDED
@@ -0,0 +1,36 @@
1
+ module ondemand2api
2
+
3
+ go 1.21
4
+
5
+ require (
6
+ github.com/gin-gonic/gin v1.9.1
7
+ github.com/google/uuid v1.5.0
8
+ github.com/joho/godotenv v1.5.1
9
+ )
10
+
11
+ require (
12
+ github.com/bytedance/sonic v1.9.1 // indirect
13
+ github.com/chenzhuoyu/base64x v0.0.0-20221115062448-fe3a3abad311 // indirect
14
+ github.com/gabriel-vasile/mimetype v1.4.2 // indirect
15
+ github.com/gin-contrib/sse v0.1.0 // indirect
16
+ github.com/go-playground/locales v0.14.1 // indirect
17
+ github.com/go-playground/universal-translator v0.18.1 // indirect
18
+ github.com/go-playground/validator/v10 v10.14.0 // indirect
19
+ github.com/goccy/go-json v0.10.2 // indirect
20
+ github.com/json-iterator/go v1.1.12 // indirect
21
+ github.com/klauspost/cpuid/v2 v2.2.4 // indirect
22
+ github.com/leodido/go-urn v1.2.4 // indirect
23
+ github.com/mattn/go-isatty v0.0.19 // indirect
24
+ github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
25
+ github.com/modern-go/reflect2 v1.0.2 // indirect
26
+ github.com/pelletier/go-toml/v2 v2.0.8 // indirect
27
+ github.com/twitchyliquid64/golang-asm v0.15.1 // indirect
28
+ github.com/ugorji/go/codec v1.2.11 // indirect
29
+ golang.org/x/arch v0.3.0 // indirect
30
+ golang.org/x/crypto v0.9.0 // indirect
31
+ golang.org/x/net v0.10.0 // indirect
32
+ golang.org/x/sys v0.8.0 // indirect
33
+ golang.org/x/text v0.9.0 // indirect
34
+ google.golang.org/protobuf v1.30.0 // indirect
35
+ gopkg.in/yaml.v3 v3.0.1 // indirect
36
+ )
go.sum ADDED
@@ -0,0 +1,90 @@
1
+ github.com/bytedance/sonic v1.5.0/go.mod h1:ED5hyg4y6t3/9Ku1R6dU/4KyJ48DZ4jPhfY1O2AihPM=
2
+ github.com/bytedance/sonic v1.9.1 h1:6iJ6NqdoxCDr6mbY8h18oSO+cShGSMRGCEo7F2h0x8s=
3
+ github.com/bytedance/sonic v1.9.1/go.mod h1:i736AoUSYt75HyZLoJW9ERYxcy6eaN6h4BZXU064P/U=
4
+ github.com/chenzhuoyu/base64x v0.0.0-20211019084208-fb5309c8db06/go.mod h1:DH46F32mSOjUmXrMHnKwZdA8wcEefY7UVqBKYGjpdQY=
5
+ github.com/chenzhuoyu/base64x v0.0.0-20221115062448-fe3a3abad311 h1:qSGYFH7+jGhDF8vLC+iwCD4WpbV1EBDSzWkJODFLams=
6
+ github.com/chenzhuoyu/base64x v0.0.0-20221115062448-fe3a3abad311/go.mod h1:b583jCggY9gE99b6G5LEC39OIiVsWj+R97kbl5odCEk=
7
+ github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
8
+ github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
9
+ github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
10
+ github.com/gabriel-vasile/mimetype v1.4.2 h1:w5qFW6JKBz9Y393Y4q372O9A7cUSequkh1Q7OhCmWKU=
11
+ github.com/gabriel-vasile/mimetype v1.4.2/go.mod h1:zApsH/mKG4w07erKIaJPFiX0Tsq9BFQgN3qGY5GnNgA=
12
+ github.com/gin-contrib/sse v0.1.0 h1:Y/yl/+YNO8GZSjAhjMsSuLt29uWRFHdHYUb5lYOV9qE=
13
+ github.com/gin-contrib/sse v0.1.0/go.mod h1:RHrZQHXnP2xjPF+u1gW/2HnVO7nvIa9PG3Gm+fLHvGI=
14
+ github.com/gin-gonic/gin v1.9.1 h1:4idEAncQnU5cB7BeOkPtxjfCSye0AAm1R0RVIqJ+Jmg=
15
+ github.com/gin-gonic/gin v1.9.1/go.mod h1:hPrL7YrpYKXt5YId3A/Tnip5kqbEAP+KLuI3SUcPTeU=
16
+ github.com/go-playground/assert/v2 v2.2.0 h1:JvknZsQTYeFEAhQwI4qEt9cyV5ONwRHC+lYKSsYSR8s=
17
+ github.com/go-playground/assert/v2 v2.2.0/go.mod h1:VDjEfimB/XKnb+ZQfWdccd7VUvScMdVu0Titje2rxJ4=
18
+ github.com/go-playground/locales v0.14.1 h1:EWaQ/wswjilfKLTECiXz7Rh+3BjFhfDFKv/oXslEjJA=
19
+ github.com/go-playground/locales v0.14.1/go.mod h1:hxrqLVvrK65+Rwrd5Fc6F2O76J/NuW9t0sjnWqG1slY=
20
+ github.com/go-playground/universal-translator v0.18.1 h1:Bcnm0ZwsGyWbCzImXv+pAJnYK9S473LQFuzCbDbfSFY=
21
+ github.com/go-playground/universal-translator v0.18.1/go.mod h1:xekY+UJKNuX9WP91TpwSH2VMlDf28Uj24BCp08ZFTUY=
22
+ github.com/go-playground/validator/v10 v10.14.0 h1:vgvQWe3XCz3gIeFDm/HnTIbj6UGmg/+t63MyGU2n5js=
23
+ github.com/go-playground/validator/v10 v10.14.0/go.mod h1:9iXMNT7sEkjXb0I+enO7QXmzG6QCsPWY4zveKFVRSyU=
24
+ github.com/goccy/go-json v0.10.2 h1:CrxCmQqYDkv1z7lO7Wbh2HN93uovUHgrECaO5ZrCXAU=
25
+ github.com/goccy/go-json v0.10.2/go.mod h1:6MelG93GURQebXPDq3khkgXZkazVtN9CRI+MGFi0w8I=
26
+ github.com/golang/protobuf v1.5.0/go.mod h1:FsONVRAS9T7sI+LIUmWTfcYkHO4aIWwzhcaSAoJOfIk=
27
+ github.com/google/go-cmp v0.5.5 h1:Khx7svrCpmxxtHBq5j2mp/xVjsi8hQMfNLvJFAlrGgU=
28
+ github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/gNBxE=
29
+ github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
30
+ github.com/google/uuid v1.5.0 h1:1p67kYwdtXjb0gL0BPiP1Av9wiZPo5A8z2cWkTZ+eyU=
31
+ github.com/google/uuid v1.5.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
32
+ github.com/joho/godotenv v1.5.1 h1:7eLL/+HRGLY0ldzfGMeQkb7vMd0as4CfYvUVzLqw0N0=
33
+ github.com/joho/godotenv v1.5.1/go.mod h1:f4LDr5Voq0i2e/R5DDNOoa2zzDfwtkZa6DnEwAbqwq4=
34
+ github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM=
35
+ github.com/json-iterator/go v1.1.12/go.mod h1:e30LSqwooZae/UwlEbR2852Gd8hjQvJoHmT4TnhNGBo=
36
+ github.com/klauspost/cpuid/v2 v2.0.9/go.mod h1:FInQzS24/EEf25PyTYn52gqo7WaD8xa0213Md/qVLRg=
37
+ github.com/klauspost/cpuid/v2 v2.2.4 h1:acbojRNwl3o09bUq+yDCtZFc1aiwaAAxtcn8YkZXnvk=
38
+ github.com/klauspost/cpuid/v2 v2.2.4/go.mod h1:RVVoqg1df56z8g3pUjL/3lE5UfnlrJX8tyFgg4nqhuY=
39
+ github.com/leodido/go-urn v1.2.4 h1:XlAE/cm/ms7TE/VMVoduSpNBoyc2dOxHs5MZSwAN63Q=
40
+ github.com/leodido/go-urn v1.2.4/go.mod h1:7ZrI8mTSeBSHl/UaRyKQW1qZeMgak41ANeCNaVckg+4=
41
+ github.com/mattn/go-isatty v0.0.19 h1:JITubQf0MOLdlGRuRq+jtsDlekdYPia9ZFsB8h/APPA=
42
+ github.com/mattn/go-isatty v0.0.19/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
43
+ github.com/modern-go/concurrent v0.0.0-20180228061459-e0a39a4cb421/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
44
+ github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd h1:TRLaZ9cD/w8PVh93nsPXa1VrQ6jlwL5oN8l14QlcNfg=
45
+ github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJC0mAP4ikYIbvyc7fijjWJddQyLn8Ig3JB5CqoB9Q=
46
+ github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M=
47
+ github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
48
+ github.com/pelletier/go-toml/v2 v2.0.8 h1:0ctb6s9mE31h0/lhu+J6OPmVeDxJn+kYnJc2jZR9tGQ=
49
+ github.com/pelletier/go-toml/v2 v2.0.8/go.mod h1:vuYfssBdrU2XDZ9bYydBu6t+6a6PYNcZljzZR9VXg+4=
50
+ github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
51
+ github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
52
+ github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
53
+ github.com/stretchr/objx v0.4.0/go.mod h1:YvHI0jy2hoMjB+UWwv71VJQ9isScKT/TqJzVSSt89Yw=
54
+ github.com/stretchr/objx v0.5.0/go.mod h1:Yh+to48EsGEfYuaHDzXPcE3xhTkx73EhmCGUpEOglKo=
55
+ github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
56
+ github.com/stretchr/testify v1.7.0/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
57
+ github.com/stretchr/testify v1.7.1/go.mod h1:6Fq8oRcR53rry900zMqJjRRixrwX3KX962/h/Wwjteg=
58
+ github.com/stretchr/testify v1.8.0/go.mod h1:yNjHg4UonilssWZ8iaSj1OCr/vHnekPRkoO+kdMU+MU=
59
+ github.com/stretchr/testify v1.8.1/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
60
+ github.com/stretchr/testify v1.8.2/go.mod h1:w2LPCIKwWwSfY2zedu0+kehJoqGctiVI29o6fzry7u4=
61
+ github.com/stretchr/testify v1.8.3 h1:RP3t2pwF7cMEbC1dqtB6poj3niw/9gnV4Cjg5oW5gtY=
62
+ github.com/stretchr/testify v1.8.3/go.mod h1:sz/lmYIOXD/1dqDmKjjqLyZ2RngseejIcXlSw2iwfAo=
63
+ github.com/twitchyliquid64/golang-asm v0.15.1 h1:SU5vSMR7hnwNxj24w34ZyCi/FmDZTkS4MhqMhdFk5YI=
64
+ github.com/twitchyliquid64/golang-asm v0.15.1/go.mod h1:a1lVb/DtPvCB8fslRZhAngC2+aY1QWCk3Cedj/Gdt08=
65
+ github.com/ugorji/go/codec v1.2.11 h1:BMaWp1Bb6fHwEtbplGBGJ498wD+LKlNSl25MjdZY4dU=
66
+ github.com/ugorji/go/codec v1.2.11/go.mod h1:UNopzCgEMSXjBc6AOMqYvWC1ktqTAfzJZUZgYf6w6lg=
67
+ golang.org/x/arch v0.0.0-20210923205945-b76863e36670/go.mod h1:5om86z9Hs0C8fWVUuoMHwpExlXzs5Tkyp9hOrfG7pp8=
68
+ golang.org/x/arch v0.3.0 h1:02VY4/ZcO/gBOH6PUaoiptASxtXU10jazRCP865E97k=
69
+ golang.org/x/arch v0.3.0/go.mod h1:5om86z9Hs0C8fWVUuoMHwpExlXzs5Tkyp9hOrfG7pp8=
70
+ golang.org/x/crypto v0.9.0 h1:LF6fAI+IutBocDJ2OT0Q1g8plpYljMZ4+lty+dsqw3g=
71
+ golang.org/x/crypto v0.9.0/go.mod h1:yrmDGqONDYtNj3tH8X9dzUun2m2lzPa9ngI6/RUPGR0=
72
+ golang.org/x/net v0.10.0 h1:X2//UzNDwYmtCLn7To6G58Wr6f5ahEAQgKNzv9Y951M=
73
+ golang.org/x/net v0.10.0/go.mod h1:0qNGK6F8kojg2nk9dLZ2mShWaEBan6FAoqfSigmmuDg=
74
+ golang.org/x/sys v0.0.0-20220704084225-05e143d24a9e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
75
+ golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
76
+ golang.org/x/sys v0.8.0 h1:EBmGv8NaZBZTWvrbjNoL6HVt+IVy3QDQpJs7VRIw3tU=
77
+ golang.org/x/sys v0.8.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
78
+ golang.org/x/text v0.9.0 h1:2sjJmO8cDvYveuX97RDLsxlyUxLl+GHoLxBiRdHllBE=
79
+ golang.org/x/text v0.9.0/go.mod h1:e1OnstbJyHTd6l/uOt8jFFHp6TRDWZR/bV3emEE/zU8=
80
+ golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543 h1:E7g+9GITq07hpfrRu66IVDexMakfv52eLZ2CXBWiKr4=
81
+ golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543/go.mod h1:I/5z698sn9Ka8TeJc9MKroUUfqBBauWjQqLJ2OPfmY0=
82
+ google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp09yW+WbY/TyQbw=
83
+ google.golang.org/protobuf v1.30.0 h1:kPPoIgf3TsEvrm0PFe15JQ+570QVxYzEvvHqChK+cng=
84
+ google.golang.org/protobuf v1.30.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I=
85
+ gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405 h1:yhCVgyC4o1eVCa2tZl7eS0r+SDo693bJlVdllGtEeKM=
86
+ gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
87
+ gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
88
+ gopkg.in/yaml.v3 v3.0.1 h1:fxVm/GzAzEWqLHuvctI91KS9hhNmmWOoWu0XTYJS7CA=
89
+ gopkg.in/yaml.v3 v3.0.1/go.mod h1:K4uyk7z7BCEPqu6E+C64Yfv1cQ7kz7rIZviUmN+EgEM=
90
+ rsc.io/pdf v0.1.1/go.mod h1:n8OzWcQ6Sp37PL01nO98y4iUCRdTGarVfzxY20ICaU4=
gunicorn.conf.py DELETED
@@ -1,81 +0,0 @@
- # Gunicorn configuration file
- import os
- # import multiprocessing
-
- # Server socket
- bind = "0.0.0.0:7860"
- backlog = 2048
-
- # Worker processes
- workers = 2 # recommended number of workers
- worker_class = "gthread" # use the threaded worker class
- threads = 4 # threads per worker
- worker_connections = 1000
- max_requests = 1000 # maximum requests per worker before a restart; guards against memory leaks
- max_requests_jitter = 50 # randomizes max_requests so workers do not all restart at once
-
- # Timeouts
- timeout = 120 # worker timeout (seconds)
- keepalive = 2 # keep-alive connection timeout (seconds)
-
- # Logging
- loglevel = "info"
- accesslog = "-" # write to stdout
- errorlog = "-" # write to stderr
- access_log_format = '%(h)s %(l)s %(u)s %(t)s "%(r)s" %(s)s %(b)s "%(f)s" "%(a)s" %(D)s'
-
- # Process name
- proc_name = "ondemand2api"
-
- # Preload the application
- preload_app = True
-
- # Temporary directory
- tmp_upload_dir = None
-
- # Security
- limit_request_line = 4094
- limit_request_fields = 100
- limit_request_field_size = 8190
-
- # Performance tuning
- worker_tmp_dir = "/dev/shm" # use an in-memory filesystem as the temp directory (if available)
-
- # Overrides from environment variables
- if os.environ.get("GUNICORN_WORKERS"):
-     workers = int(os.environ.get("GUNICORN_WORKERS"))
-
- if os.environ.get("GUNICORN_THREADS"):
-     threads = int(os.environ.get("GUNICORN_THREADS"))
-
- if os.environ.get("GUNICORN_TIMEOUT"):
-     timeout = int(os.environ.get("GUNICORN_TIMEOUT"))
-
- # Hook functions
- def on_starting(server):
-     server.log.info("Server starting...")
-
- def on_reload(server):
-     server.log.info("Server reloading...")
-
- def when_ready(server):
-     server.log.info(f"Server ready, listening on {bind}")
-     server.log.info(f"Workers: {workers}, Threads per worker: {threads}")
-
- def worker_int(worker):
-     worker.log.info("Worker received INT or QUIT signal")
-
- def pre_fork(server, worker):
-     server.log.info(f"Worker {worker.pid} is about to start")
-
- def post_fork(server, worker):
-     server.log.info(f"Worker {worker.pid} started")
-
- def post_worker_init(worker):
-     worker.log.info(f"Worker {worker.pid} initialized")
-
- def worker_abort(worker):
-     worker.log.info(f"Worker {worker.pid} aborted")
-
- def pre_exec(server):
-     server.log.info("Server is about to re-execute")
main.go ADDED
@@ -0,0 +1,812 @@
+ package main
+
+ import (
+ 	"bufio"
+ 	"bytes"
+ 	"context"
+ 	"encoding/json"
+ 	"fmt"
+ 	"log"
+ 	"net/http"
+ 	"os"
+ 	"strconv"
+ 	"strings"
+ 	"sync"
+ 	"time"
+
+ 	"github.com/gin-gonic/gin"
+ 	"github.com/google/uuid"
+ 	"github.com/joho/godotenv"
+ )
+
+ // Configuration constants
+ const (
+ 	BadKeyRetryInterval = 600 * time.Second // 10 minutes
+ 	SessionTimeout      = 600 * time.Second // 10 minutes
+ 	DefaultPort         = 7860
+ )
+
+ // Global variables
+ var (
+ 	privateKey      string
+ 	ondemandAPIKeys []string
+ 	safeHeaders     = []string{"Authorization", "X-API-KEY"}
+ 	ondemandAPIBase = "https://api.on-demand.io/chat/v1"
+ 	defaultModel    = "predefined-openai-gpt4o"
+ )
+
+ // Model mapping
+ var modelMap = map[string]string{
+ 	"o3-mini":         "predefined-openai-gpto3-mini",
+ 	"o4-mini":         "predefined-openai-gpto4-mini",
+ 	"gpt-4o":          "predefined-openai-gpt4o",
+ 	"gpt-4.1":         "predefined-openai-gpt4.1",
+ 	"gpt-4.1-mini":    "predefined-openai-gpt4.1-mini",
+ 	"gpt-4o-mini":     "predefined-openai-gpt4o-mini",
+ 	"deepseek-v3":     "predefined-deepseek-v3",
+ 	"deepseek-r1":     "predefined-deepseek-r1",
+ 	"claude-4-sonnet": "predefined-claude-4-sonnet",
+ 	"claude-4-opus":   "predefined-claude-4-opus",
+ }
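One plausible way to use `modelMap` together with `defaultModel` is a lookup that falls back to the default endpoint for unknown model names. The sketch below assumes that behavior; the lookup code itself is not visible in this part of the diff, and `resolveEndpoint` is a hypothetical name. Only two map entries are copied so the example stays short.

```go
package main

import "fmt"

// A subset of the mapping from the commit, repeated so the sketch is self-contained.
var modelMap = map[string]string{
	"gpt-4o":      "predefined-openai-gpt4o",
	"deepseek-r1": "predefined-deepseek-r1",
}

const defaultModel = "predefined-openai-gpt4o"

// resolveEndpoint returns the OnDemand endpoint ID for an OpenAI-style model
// name, falling back to defaultModel when the name is not in the map.
func resolveEndpoint(model string) string {
	if endpoint, ok := modelMap[model]; ok {
		return endpoint
	}
	return defaultModel
}

func main() {
	fmt.Println(resolveEndpoint("deepseek-r1"))   // prints predefined-deepseek-r1
	fmt.Println(resolveEndpoint("unknown-model")) // prints predefined-openai-gpt4o
}
```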
+
+ // KeyStatus describes the state of an API key
+ type KeyStatus struct {
+ 	Bad   bool      `json:"bad"`
+ 	BadTS time.Time `json:"bad_ts"`
+ }
+
+ // KeyManager handles API key rotation and status tracking
+ type KeyManager struct {
+ 	keyList        []string
+ 	mu             sync.RWMutex
+ 	keyStatus      map[string]*KeyStatus
+ 	idx            int
+ 	currentKey     string
+ 	currentSession string
+ 	lastUsedTime   time.Time
+ }
+
+ // NewKeyManager creates a new key manager
+ func NewKeyManager(keys []string) *KeyManager {
+ 	km := &KeyManager{
+ 		keyList:   make([]string, len(keys)),
+ 		keyStatus: make(map[string]*KeyStatus),
+ 	}
+ 	copy(km.keyList, keys)
+
+ 	for _, key := range keys {
+ 		km.keyStatus[key] = &KeyStatus{}
+ 	}
+
+ 	return km
+ }
+
+ // displayKey returns an abbreviated form of the key for logging
+ func (km *KeyManager) displayKey(key string) string {
+ 	if len(key) <= 10 {
+ 		return key
+ 	}
+ 	return fmt.Sprintf("%s...%s", key[:6], key[len(key)-4:])
+ }
+
+ // Get returns a usable API key
+ func (km *KeyManager) Get() string {
+ 	km.mu.Lock()
+ 	defer km.mu.Unlock()
+
+ 	now := time.Now()
+
+ 	// Check whether the session has timed out
+ 	if km.currentKey != "" && !km.lastUsedTime.IsZero() &&
+ 		now.Sub(km.lastUsedTime) > SessionTimeout {
+ 		log.Printf("[session timeout] last used: %s", km.lastUsedTime.Format("2006-01-02 15:04:05"))
+ 		log.Printf("[session timeout] current time: %s", now.Format("2006-01-02 15:04:05"))
+ 		log.Printf("[session timeout] idle for more than %d minutes, switching to a new session", int(SessionTimeout.Minutes()))
+ 		km.currentKey = ""
+ 		km.currentSession = ""
+ 	}
+
+ 	// If a key is already in use, keep using it
+ 	if km.currentKey != "" {
+ 		if !km.keyStatus[km.currentKey].Bad {
+ 			log.Printf("[chat request] [continuing with API key: %s] [status: ok]", km.displayKey(km.currentKey))
+ 			km.lastUsedTime = now
+ 			return km.currentKey
+ 		} else {
+ 			// The current key has been marked bad, so switch to another one
+ 			km.currentKey = ""
+ 			km.currentSession = ""
+ 		}
+ 	}
+
+ 	// Select a new key
+ 	total := len(km.keyList)
+ 	for i := 0; i < total; i++ {
+ 		key := km.keyList[km.idx]
+ 		km.idx = (km.idx + 1) % total
+ 		status := km.keyStatus[key]
+
+ 		if !status.Bad {
+ 			log.Printf("[chat request] [using new API key: %s] [status: ok]", km.displayKey(key))
+ 			km.currentKey = key
+ 			km.currentSession = ""
+ 			km.lastUsedTime = now
+ 			return key
+ 		}
+
+ 		if status.Bad && !status.BadTS.IsZero() {
+ 			if now.Sub(status.BadTS) >= BadKeyRetryInterval {
+ 				log.Printf("[key auto-recovery] API key: %s has passed the retry interval, marking it ok", km.displayKey(key))
+ 				status.Bad = false
+ 				status.BadTS = time.Time{}
+ 				km.currentKey = key
+ 				km.currentSession = ""
+ 				km.lastUsedTime = now
+ 				return key
+ 			}
+ 		}
+ 	}
+
+ 	// No key is usable: force a reset
+ 	log.Printf("[warning] all keys are disabled, forcing the first key to retry: %s", km.displayKey(km.keyList[0]))
+ 	for _, key := range km.keyList {
+ 		km.keyStatus[key].Bad = false
+ 		km.keyStatus[key].BadTS = time.Time{}
+ 	}
+ 	km.idx = 0
+ 	km.currentKey = km.keyList[0]
+ 	km.currentSession = ""
+ 	km.lastUsedTime = now
+ 	log.Printf("[chat request] [using API key: %s] [status: forced retry (all keys bad)]", km.displayKey(km.currentKey))
+ 	return km.currentKey
+ }
+
+ // MarkBad marks a key as unusable
+ func (km *KeyManager) MarkBad(key string) {
+ 	km.mu.Lock()
+ 	defer km.mu.Unlock()
+
+ 	if status, exists := km.keyStatus[key]; exists && !status.Bad {
+ 		log.Printf("[key disabled] API key: %s, the API reported it as invalid (it will be retried automatically after %d minutes)",
+ 			km.displayKey(key), int(BadKeyRetryInterval.Minutes()))
+ 		status.Bad = true
+ 		status.BadTS = time.Now()
+
+ 		if km.currentKey == key {
+ 			km.currentKey = ""
+ 			km.currentSession = ""
+ 		}
+ 	}
+ }
+
+ // GetSession returns the current session, creating one if needed
+ func (km *KeyManager) GetSession(ctx context.Context, apikey string) (string, error) {
+ 	km.mu.Lock()
+ 	defer km.mu.Unlock()
+
+ 	if km.currentSession == "" {
+ 		session, err := createSession(ctx, apikey, "", nil)
+ 		if err != nil {
+ 			log.Printf("[session creation failed] error: %v", err)
+ 			return "", err
+ 		}
+ 		km.currentSession = session
+ 		log.Printf("[new session created] SESSION ID: %s", km.currentSession)
+ 	}
+
+ 	km.lastUsedTime = time.Now()
+ 	return km.currentSession, nil
+ }
+
+ var keyManager *KeyManager
+
+ // HTTP request and response types
+ type ChatCompletionRequest struct {
+ 	Messages []Message `json:"messages"`
+ 	Model    string    `json:"model"`
+ 	Stream   bool      `json:"stream"`
+ }
+
+ type Message struct {
+ 	Role    string `json:"role"`
+ 	Content string `json:"content"`
+ }
+
+ type ChatCompletionResponse struct {
+ 	ID      string   `json:"id"`
+ 	Object  string   `json:"object"`
+ 	Created int64    `json:"created"`
+ 	Model   string   `json:"model"`
+ 	Choices []Choice `json:"choices"`
+ 	Usage   Usage    `json:"usage"`
+ }
+
+ type Choice struct {
+ 	Index        int      `json:"index"`
+ 	Message      *Message `json:"message,omitempty"`
+ 	Delta        *Message `json:"delta,omitempty"`
+ 	FinishReason *string  `json:"finish_reason"`
+ }
+
+ type Usage struct{}
+
+ type ModelsResponse struct {
+ 	Object string  `json:"object"`
+ 	Data   []Model `json:"data"`
+ }
+
+ type Model struct {
+ 	ID      string `json:"id"`
+ 	Object  string `json:"object"`
+ 	OwnedBy string `json:"owned_by"`
+ }
+
+ // OnDemand API types
+ type CreateSessionRequest struct {
+ 	ExternalUserID string   `json:"externalUserId"`
+ 	PluginIds      []string `json:"pluginIds,omitempty"`
+ }
+
+ type CreateSessionResponse struct {
+ 	Data struct {
+ 		ID string `json:"id"`
+ 	} `json:"data"`
+ }
+
+ type QueryRequest struct {
+ 	Query        string   `json:"query"`
+ 	EndpointID   string   `json:"endpointId"`
+ 	PluginIds    []string `json:"pluginIds"`
+ 	ResponseMode string   `json:"responseMode"`
+ }
+
+ type QueryResponse struct {
+ 	Data struct {
+ 		Answer string `json:"answer"`
+ 	} `json:"data"`
+ }
269
+ // Initialize configuration
270
+ func init() {
271
+ // Load the .env file
272
+ err := godotenv.Load()
273
+ if err != nil {
274
+ log.Println("warning: no .env file found; using system environment variables only")
275
+ }
276
+ initConfig()
277
+ }
278
+
279
+ func initConfig() {
280
+ privateKey = getEnv("PRIVATE_KEY", "testofli")
281
+
282
+ apiKeysStr := os.Getenv("ONDEMAND_APIKEYS")
283
+ if apiKeysStr != "" {
284
+ ondemandAPIKeys = strings.Split(apiKeysStr, ",")
285
+ }
286
+
287
+ if len(ondemandAPIKeys) == 0 && !isTestMode() {
288
+ log.Fatal("ONDEMAND_APIKEYS environment variable is empty; please set the API keys")
289
+ }
290
+
291
+ if len(ondemandAPIKeys) > 0 {
292
+ keyManager = NewKeyManager(ondemandAPIKeys)
293
+ }
294
+ }
295
+
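`strings.Split` keeps surrounding whitespace and empty entries, so a value such as `ONDEMAND_APIKEYS="a, b,"` would put `" b"` and `""` into the key pool. A minimal sketch of a stricter parser; `parseKeys` is a hypothetical helper, not a function in this commit:

```go
package main

import (
	"fmt"
	"strings"
)

// parseKeys splits a comma-separated key list, trims whitespace around
// each entry, and drops empty entries (for example from a trailing comma).
func parseKeys(raw string) []string {
	var keys []string
	for _, part := range strings.Split(raw, ",") {
		if k := strings.TrimSpace(part); k != "" {
			keys = append(keys, k)
		}
	}
	return keys
}

func main() {
	fmt.Printf("%q\n", parseKeys(" key-a, key-b ,,key-c,"))
}
```

Because an all-whitespace value now yields an empty slice, the existing `len(ondemandAPIKeys) == 0` check would also catch that case.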
296
+ func isTestMode() bool {
297
+ for _, arg := range os.Args {
298
+ if strings.Contains(arg, "test") {
299
+ return true
300
+ }
301
+ }
302
+ return os.Getenv("GIN_MODE") == "test"
303
+ }
304
+
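`isTestMode` treats any argument containing the substring `test` as a test run, which also matches a binary path like `/home/tester/app` or a flag like `--latest`. A narrower check is sketched below, assuming the usual `go test` conventions (a binary named `*.test` and flags prefixed `-test.`); `looksLikeGoTest` is a hypothetical name:

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// looksLikeGoTest reports whether args resemble a `go test` invocation:
// the binary name ends in ".test" (or ".test.exe"), or a "-test." flag
// is present among the remaining arguments.
func looksLikeGoTest(args []string) bool {
	if len(args) == 0 {
		return false
	}
	bin := args[0]
	if strings.HasSuffix(bin, ".test") || strings.HasSuffix(bin, ".test.exe") {
		return true
	}
	for _, arg := range args[1:] {
		if strings.HasPrefix(arg, "-test.") {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(looksLikeGoTest(os.Args))
}
```

A test binary built with a custom output name and run by hand without `-test.` flags would not be detected, so the `GIN_MODE` fallback in the original is still useful.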
305
+ func getEnv(key, defaultValue string) string {
306
+ if value := os.Getenv(key); value != "" {
307
+ return value
308
+ }
309
+ return defaultValue
310
+ }
311
+
312
+ // Authorization check middleware
313
+ func checkPrivateKey() gin.HandlerFunc {
314
+ return func(c *gin.Context) {
315
+ // Exempt a few paths from the check
316
+ if c.Request.URL.Path == "/" || c.Request.URL.Path == "/favicon.ico" {
317
+ c.Next()
318
+ return
319
+ }
320
+
321
+ var key string
322
+ for _, header := range safeHeaders {
323
+ if value := c.GetHeader(header); value != "" {
324
+ key = value
325
+ if header == "Authorization" && strings.HasPrefix(value, "Bearer ") {
326
+ key = strings.TrimSpace(value[7:])
327
+ }
328
+ break
329
+ }
330
+ }
331
+
332
+ if key == "" || key != privateKey {
333
+ c.JSON(http.StatusUnauthorized, gin.H{
334
+ "error": "Unauthorized, must provide correct Authorization or X-API-KEY",
335
+ "headers": c.Request.Header,
336
+ })
337
+ c.Abort()
338
+ return
339
+ }
340
+
341
+ c.Next()
342
+ }
343
+ }
344
+
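The middleware compares the presented key with `!=`, which is not a constant-time comparison, and its 401 body echoes every request header back to the caller. A minimal sketch of the extraction and comparison steps using `crypto/subtle`, assuming `safeHeaders` lists `Authorization` before `X-API-KEY` as the deleted Python `SAFE_HEADERS` did; `extractKey` and `keyMatches` are hypothetical helpers:

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"strings"
)

// extractKey returns the client key from the Authorization header (with an
// optional "Bearer " prefix removed) or, failing that, from X-API-KEY.
func extractKey(authorization, xAPIKey string) string {
	if authorization != "" {
		if strings.HasPrefix(authorization, "Bearer ") {
			return strings.TrimSpace(authorization[len("Bearer "):])
		}
		return authorization
	}
	return xAPIKey
}

// keyMatches rejects an empty candidate and otherwise compares the two
// values with subtle.ConstantTimeCompare, which returns 1 only when equal.
func keyMatches(candidate, expected string) bool {
	if candidate == "" {
		return false
	}
	return subtle.ConstantTimeCompare([]byte(candidate), []byte(expected)) == 1
}

func main() {
	key := extractKey("Bearer secret-123", "")
	fmt.Println(keyMatches(key, "secret-123"))
}
```

`ConstantTimeCompare` returns immediately when the lengths differ, so it hides the key's content but not its length.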
345
+ // Resolve the endpoint ID
346
+ func getEndpointID(openaiModel string) string {
347
+ model := strings.ToLower(strings.ReplaceAll(openaiModel, " ", ""))
348
+ if endpoint, exists := modelMap[model]; exists {
349
+ return endpoint
350
+ }
351
+ return defaultModel
352
+ }
353
+
354
+ // Create a session
355
+ func createSession(ctx context.Context, apikey, externalUserID string, pluginIds []string) (string, error) {
356
+ if externalUserID == "" {
357
+ externalUserID = uuid.New().String()
358
+ }
359
+
360
+ payload := CreateSessionRequest{
361
+ ExternalUserID: externalUserID,
362
+ PluginIds: pluginIds,
363
+ }
364
+
365
+ jsonData, err := json.Marshal(payload)
366
+ if err != nil {
367
+ return "", err
368
+ }
369
+
370
+ req, err := http.NewRequestWithContext(ctx, "POST", ondemandAPIBase+"/sessions", bytes.NewBuffer(jsonData))
371
+ if err != nil {
372
+ return "", err
373
+ }
374
+
375
+ req.Header.Set("apikey", apikey)
376
+ req.Header.Set("Content-Type", "application/json")
377
+
378
+ client := &http.Client{Timeout: 20 * time.Second}
379
+ resp, err := client.Do(req)
380
+ if err != nil {
381
+ return "", err
382
+ }
383
+ defer resp.Body.Close()
384
+
385
+ if resp.StatusCode != http.StatusOK && resp.StatusCode != http.StatusCreated {
386
+ return "", fmt.Errorf("create session failed with status: %d", resp.StatusCode)
387
+ }
388
+
389
+ var sessionResp CreateSessionResponse
390
+ if err := json.NewDecoder(resp.Body).Decode(&sessionResp); err != nil {
391
+ return "", err
392
+ }
393
+
394
+ return sessionResp.Data.ID, nil
395
+ }
396
+
397
+ // Run an operation, rotating keys and retrying on failure
398
+ func withValidKey(ctx context.Context, fn func(ctx context.Context, key string) (interface{}, error)) (interface{}, error) {
399
+ badCount := 0
400
+ maxRetry := len(keyManager.keyList) * 2
401
+
402
+ for badCount < maxRetry {
403
+ key := keyManager.Get()
404
+ result, err := fn(ctx, key)
405
+
406
+ if err != nil {
407
+ // Check whether this error means the key should be marked bad
408
+ if isAuthError(err) {
409
+ keyManager.MarkBad(key)
410
+ badCount++
411
+ continue
412
+ }
413
+ return nil, err
414
+ }
415
+
416
+ return result, nil
417
+ }
418
+
419
+ return nil, fmt.Errorf("no usable API KEY; please add new keys or contact technical support")
420
+ }
421
+
422
+ // Check whether the error is authentication-related
423
+ func isAuthError(err error) bool {
424
+ errStr := err.Error()
425
+ return strings.Contains(errStr, "401") ||
426
+ strings.Contains(errStr, "403") ||
427
+ strings.Contains(errStr, "429") ||
428
+ strings.Contains(errStr, "500")
429
+ }
430
+
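`isAuthError` searches the error text for `401`, `403`, `429` or `500`, so an unrelated message such as a refused connection to port 5000 would also mark the key as bad. A minimal sketch of carrying the status code in the error value and reading it back with `errors.As`; `statusError` and the small helpers are hypothetical and not part of this commit:

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// statusError is a hypothetical error type carrying the upstream HTTP status.
type statusError struct {
	Op   string
	Code int
}

func (e *statusError) Error() string {
	return fmt.Sprintf("%s failed with status: %d", e.Op, e.Code)
}

// shouldRotateKey uses the same status list as isAuthError, but reads the
// code from the error value instead of searching its text.
func shouldRotateKey(err error) bool {
	var se *statusError
	if !errors.As(err, &se) {
		return false
	}
	switch se.Code {
	case http.StatusUnauthorized, http.StatusForbidden,
		http.StatusTooManyRequests, http.StatusInternalServerError:
		return true
	}
	return false
}

// Small constructors used by the demo below.
func newStatusErr(op string, code int) error { return &statusError{Op: op, Code: code} }
func wrap(err error) error                   { return fmt.Errorf("query: %w", err) }
func plainErr(msg string) error              { return errors.New(msg) }

func main() {
	fmt.Println(shouldRotateKey(wrap(newStatusErr("stream query", 429))))
	fmt.Println(shouldRotateKey(plainErr("dial tcp 10.0.0.1:5000: connection refused")))
}
```

With this shape, `createSession`, `streamQuery` and `nonStreamQuery` would return a `*statusError` where they currently call `fmt.Errorf` with the status code.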
431
+ // Chat completions endpoint
432
+ func chatCompletions(c *gin.Context) {
433
+ var req ChatCompletionRequest
434
+ if err := c.ShouldBindJSON(&req); err != nil {
435
+ c.JSON(http.StatusBadRequest, gin.H{"error": "request is missing the messages field"})
436
+ return
437
+ }
438
+
439
+ if len(req.Messages) == 0 {
440
+ c.JSON(http.StatusBadRequest, gin.H{"error": "request is missing the messages field"})
441
+ return
442
+ }
443
+
444
+ // Find the most recent user message
445
+ var userMsg string
446
+ for i := len(req.Messages) - 1; i >= 0; i-- {
447
+ if req.Messages[i].Role == "user" {
448
+ userMsg = req.Messages[i].Content
449
+ break
450
+ }
451
+ }
452
+
453
+ if userMsg == "" {
454
+ c.JSON(http.StatusBadRequest, gin.H{"error": "no user message found"})
455
+ return
456
+ }
457
+
458
+ endpointID := getEndpointID(req.Model)
459
+
460
+ // Log the requested model and the resolved endpoint
461
+ log.Printf("[model request] model: %s, endpoint: %s, stream: %t", req.Model, endpointID, req.Stream)
462
+
463
+ if req.Stream {
464
+ handleStreamResponse(c, userMsg, endpointID, req.Model)
465
+ } else {
466
+ handleNonStreamResponse(c, userMsg, endpointID, req.Model)
467
+ }
468
+ }
469
+
470
+ // Handle a streaming response
471
+ func handleStreamResponse(c *gin.Context, userMsg, endpointID, model string) {
472
+ c.Header("Content-Type", "text/event-stream")
473
+ c.Header("Cache-Control", "no-cache")
474
+ c.Header("Connection", "keep-alive")
475
+
476
+ // Use channels for asynchronous processing
477
+ resultChan := make(chan string, 100)
478
+ errorChan := make(chan error, 1)
479
+
480
+ go func() {
481
+ defer close(resultChan)
482
+ defer close(errorChan)
483
+
484
+ ctx := context.Background()
485
+ result, err := withValidKey(ctx, func(ctx context.Context, apikey string) (interface{}, error) {
486
+ return streamQuery(ctx, apikey, userMsg, endpointID, model, resultChan)
487
+ })
488
+
489
+ if err != nil {
490
+ errorChan <- err
491
+ return
492
+ }
493
+
494
+ _ = result // streamed results are delivered through the channel
495
+ }()
496
+
497
+ // Relay the response stream
498
+ for {
499
+ select {
500
+ case chunk, ok := <-resultChan:
501
+ if !ok {
502
+ return
503
+ }
504
+ if chunk == "data: [DONE]" {
505
+ _, _ = fmt.Fprintf(c.Writer, "data: [DONE]\n\n")
506
+ c.Writer.Flush()
507
+ return
508
+ }
509
+ _, _ = fmt.Fprintf(c.Writer, "data: %s\n\n", chunk)
510
+ c.Writer.Flush()
511
+ case err := <-errorChan:
512
+ if err != nil {
513
+ errorData := map[string]any{"error": err.Error()}
514
+ errorJSON, _ := json.Marshal(errorData)
515
+ _, _ = fmt.Fprintf(c.Writer, "data: %s\n\n", string(errorJSON))
516
+ c.Writer.Flush()
517
+ }
518
+ return
519
+ case <-c.Request.Context().Done():
520
+ return
521
+ }
522
+ }
523
+ }
524
+
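As far as the code shows, the producer goroutine closes both `resultChan` and `errorChan` when it exits. A receive on the closed `errorChan` is always ready and yields `nil`, so the `select` may take that case and return while `resultChan` still holds buffered chunks. The goroutine also uses `context.Background()`, so the upstream call is not cancelled when the client disconnects. A minimal sketch of a single-channel alternative under those assumptions; `streamEvent`, `produce`, `consume` and `relay` are hypothetical names:

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// streamEvent carries either one chunk or a terminal error.
type streamEvent struct {
	Chunk string
	Err   error
}

var errUpstream = errors.New("upstream failed")

// produce emits chunks in order, then an optional terminal error, on one
// channel. It stops early if ctx is cancelled.
func produce(ctx context.Context, chunks []string, finalErr error) <-chan streamEvent {
	out := make(chan streamEvent)
	go func() {
		defer close(out)
		for _, c := range chunks {
			select {
			case out <- streamEvent{Chunk: c}:
			case <-ctx.Done():
				return
			}
		}
		if finalErr != nil {
			select {
			case out <- streamEvent{Err: finalErr}:
			case <-ctx.Done():
			}
		}
	}()
	return out
}

// consume drains events until the channel closes, an error event arrives,
// or ctx is cancelled.
func consume(ctx context.Context, in <-chan streamEvent) ([]string, error) {
	var got []string
	for {
		select {
		case ev, ok := <-in:
			if !ok {
				return got, nil
			}
			if ev.Err != nil {
				return got, ev.Err
			}
			got = append(got, ev.Chunk)
		case <-ctx.Done():
			return got, ctx.Err()
		}
	}
}

// relay wires producer and consumer together for the demo.
func relay(chunks []string, fail bool) ([]string, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	var finalErr error
	if fail {
		finalErr = errUpstream
	}
	return consume(ctx, produce(ctx, chunks, finalErr))
}

func main() {
	got, err := relay([]string{"a", "b", "c"}, false)
	fmt.Println(got, err)
}
```

Because chunks and the terminal error travel on the same channel, their order is preserved; in the handler, `ctx` would be `c.Request.Context()` so that a client disconnect also stops the producer.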
525
+ // Streaming query
526
+ func streamQuery(ctx context.Context, apikey, userMsg, endpointID, model string, resultChan chan<- string) (interface{}, error) {
527
+ sessionID, err := keyManager.GetSession(ctx, apikey)
528
+ if err != nil {
529
+ return nil, err
530
+ }
531
+
532
+ payload := QueryRequest{
533
+ Query: userMsg,
534
+ EndpointID: endpointID,
535
+ PluginIds: []string{},
536
+ ResponseMode: "stream",
537
+ }
538
+
539
+ jsonData, err := json.Marshal(payload)
540
+ if err != nil {
541
+ return nil, err
542
+ }
543
+
544
+ req, err := http.NewRequestWithContext(ctx, "POST",
545
+ fmt.Sprintf("%s/sessions/%s/query", ondemandAPIBase, sessionID),
546
+ bytes.NewBuffer(jsonData))
547
+ if err != nil {
548
+ return nil, err
549
+ }
550
+
551
+ req.Header.Set("apikey", apikey)
552
+ req.Header.Set("Content-Type", "application/json")
553
+ req.Header.Set("Accept", "text/event-stream")
554
+
555
+ client := &http.Client{Timeout: 300 * time.Second}
556
+ resp, err := client.Do(req)
557
+ if err != nil {
558
+ return nil, err
559
+ }
560
+ defer resp.Body.Close()
561
+
562
+ if resp.StatusCode != http.StatusOK {
563
+ return nil, fmt.Errorf("stream query failed with status: %d", resp.StatusCode)
564
+ }
565
+
566
+ scanner := bufio.NewScanner(resp.Body)
567
+ firstChunk := true
568
+
569
+ for scanner.Scan() {
570
+ line := scanner.Text()
571
+ if !strings.HasPrefix(line, "data:") {
572
+ continue
573
+ }
574
+
575
+ dataPart := strings.TrimSpace(line[5:])
576
+ if dataPart == "[DONE]" {
577
+ resultChan <- "data: [DONE]"
578
+ break
579
+ }
580
+
581
+ if strings.HasPrefix(dataPart, "[ERROR]:") {
582
+ errJSON := strings.TrimSpace(dataPart[8:])
583
+ resultChan <- fmt.Sprintf(`{"error": "%s"}`, errJSON)
584
+ break
585
+ }
586
+
587
+ var eventData map[string]any
588
+ if err := json.Unmarshal([]byte(dataPart), &eventData); err != nil {
589
+ continue
590
+ }
591
+
592
+ // Handle the different event types
593
+ if eventType, ok := eventData["eventType"].(string); ok {
594
+ var content string
595
+ var hasContent bool
596
+
597
+ switch eventType {
598
+ case "fulfillment":
599
+ if answer, ok := eventData["answer"].(string); ok {
600
+ content = answer
601
+ hasContent = true
602
+ }
603
+ case "stream", "thinking", "reasoning", "thoughts": // possible reasoning-process event types
604
+ if answer, ok := eventData["answer"].(string); ok {
605
+ content = answer
606
+ hasContent = true
607
+ } else if text, ok := eventData["text"].(string); ok {
608
+ content = text
609
+ hasContent = true
610
+ } else if data, ok := eventData["data"].(string); ok {
611
+ content = data
612
+ hasContent = true
613
+ } else if thoughts, ok := eventData["thoughts"].(string); ok {
614
+ content = thoughts
615
+ hasContent = true
616
+ }
617
+ default:
618
+ // For unknown event types, try to extract any text content
619
+ if answer, ok := eventData["answer"].(string); ok {
620
+ content = answer
621
+ hasContent = true
622
+ } else if text, ok := eventData["text"].(string); ok {
623
+ content = text
624
+ hasContent = true
625
+ } else if thoughts, ok := eventData["thoughts"].(string); ok {
626
+ content = thoughts
627
+ hasContent = true
628
+ }
629
+ }
630
+
631
+ if hasContent {
632
+ chunk := ChatCompletionResponse{
633
+ ID: "chatcmpl-" + uuid.New().String()[:8],
634
+ Object: "chat.completion.chunk",
635
+ Created: time.Now().Unix(),
636
+ Model: model,
637
+ Choices: []Choice{{
638
+ Index: 0,
639
+ Delta: &Message{
640
+ Role: func() string {
641
+ if firstChunk {
642
+ return "assistant"
643
+ } else {
644
+ return ""
645
+ }
646
+ }(),
647
+ Content: content,
648
+ },
649
+ FinishReason: nil,
650
+ }},
651
+ }
652
+
653
+ chunkJSON, _ := json.Marshal(chunk)
654
+ resultChan <- string(chunkJSON)
655
+ firstChunk = false
656
+ }
657
+ }
658
+ }
659
+
660
+ if err := scanner.Err(); err != nil {
661
+ return nil, err
662
+ }
663
+
664
+ return nil, nil
665
+ }
666
+
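Two details in `streamQuery` look fragile. The `[ERROR]:` branch builds JSON with `fmt.Sprintf`, which produces invalid JSON when the message contains a double quote. And `bufio.Scanner` stops with an error on any line longer than its 64 KiB default. A minimal sketch that addresses both, assuming a 1 MiB line limit is enough (the constant is a guess, not a documented OnDemand limit); `readDataLines` and `parseString` are hypothetical helpers:

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// maxLineBytes is an assumed upper bound for one SSE line.
const maxLineBytes = 1 << 20

// readDataLines returns the payload of each "data:" line, stopping at
// [DONE]. An "[ERROR]:" payload is re-encoded with json.Marshal so quotes
// and backslashes in the message are escaped, and it ends the stream.
func readDataLines(r io.Reader) ([]string, error) {
	scanner := bufio.NewScanner(r)
	// Raise the per-line limit above bufio's 64 KiB default.
	scanner.Buffer(make([]byte, 0, 64*1024), maxLineBytes)

	var out []string
	for scanner.Scan() {
		line := scanner.Text()
		if !strings.HasPrefix(line, "data:") {
			continue
		}
		payload := strings.TrimSpace(line[len("data:"):])
		if payload == "[DONE]" {
			break
		}
		if strings.HasPrefix(payload, "[ERROR]:") {
			msg := strings.TrimSpace(payload[len("[ERROR]:"):])
			b, err := json.Marshal(map[string]string{"error": msg})
			if err != nil {
				return out, err
			}
			out = append(out, string(b))
			break
		}
		out = append(out, payload)
	}
	return out, scanner.Err()
}

// parseString is a convenience wrapper for string input.
func parseString(s string) ([]string, error) {
	return readDataLines(strings.NewReader(s))
}

func main() {
	lines, err := parseString("data: {\"answer\":\"hi\"}\n\ndata: [ERROR]: bad \"key\"\n")
	fmt.Println(lines, err)
}
```

The sketch only covers line reading and error encoding; the event-type handling from `streamQuery` would still run on each returned payload.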
667
+ // Handle a non-streaming response
668
+ func handleNonStreamResponse(c *gin.Context, userMsg, endpointID, model string) {
669
+ ctx := c.Request.Context()
670
+
671
+ result, err := withValidKey(ctx, func(ctx context.Context, apikey string) (any, error) {
672
+ return nonStreamQuery(ctx, apikey, userMsg, endpointID, model)
673
+ })
674
+
675
+ if err != nil {
676
+ c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
677
+ return
678
+ }
679
+
680
+ c.JSON(http.StatusOK, result)
681
+ }
682
+
683
+ // Non-streaming query
684
+ func nonStreamQuery(ctx context.Context, apikey, userMsg, endpointID, model string) (any, error) {
685
+ sessionID, err := keyManager.GetSession(ctx, apikey)
686
+ if err != nil {
687
+ return nil, err
688
+ }
689
+
690
+ payload := QueryRequest{
691
+ Query: userMsg,
692
+ EndpointID: endpointID,
693
+ PluginIds: []string{},
694
+ ResponseMode: "sync",
695
+ }
696
+
697
+ jsonData, err := json.Marshal(payload)
698
+ if err != nil {
699
+ return nil, err
700
+ }
701
+
702
+ req, err := http.NewRequestWithContext(ctx, "POST",
703
+ fmt.Sprintf("%s/sessions/%s/query", ondemandAPIBase, sessionID),
704
+ bytes.NewBuffer(jsonData))
705
+ if err != nil {
706
+ return nil, err
707
+ }
708
+
709
+ req.Header.Set("apikey", apikey)
710
+ req.Header.Set("Content-Type", "application/json")
711
+
712
+ client := &http.Client{Timeout: 300 * time.Second}
713
+ resp, err := client.Do(req)
714
+ if err != nil {
715
+ return nil, err
716
+ }
717
+ defer resp.Body.Close()
718
+
719
+ if resp.StatusCode != http.StatusOK {
720
+ return nil, fmt.Errorf("non-stream query failed with status: %d", resp.StatusCode)
721
+ }
722
+
723
+ var queryResp QueryResponse
724
+ if err := json.NewDecoder(resp.Body).Decode(&queryResp); err != nil {
725
+ return nil, err
726
+ }
727
+
728
+ content := queryResp.Data.Answer
729
+
730
+ response := ChatCompletionResponse{
731
+ ID: "chatcmpl-" + uuid.New().String()[:8],
732
+ Object: "chat.completion",
733
+ Created: time.Now().Unix(),
734
+ Model: model,
735
+ Choices: []Choice{{
736
+ Index: 0,
737
+ Message: &Message{
738
+ Role: "assistant",
739
+ Content: content,
740
+ },
741
+ FinishReason: func() *string { s := "stop"; return &s }(),
742
+ }},
743
+ Usage: Usage{},
744
+ }
745
+
746
+ return response, nil
747
+ }
748
+
749
+ // Model list endpoint
750
+ func models(c *gin.Context) {
751
+ var modelList []Model
752
+ for modelID := range modelMap {
753
+ modelList = append(modelList, Model{
754
+ ID: modelID,
755
+ Object: "model",
756
+ OwnedBy: "ondemand-proxy",
757
+ })
758
+ }
759
+
760
+ response := ModelsResponse{
761
+ Object: "list",
762
+ Data: modelList,
763
+ }
764
+
765
+ c.JSON(http.StatusOK, response)
766
+ }
767
+
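Go randomizes map iteration order, so `/v1/models` can return the models in a different order on each call, and a nil `modelList` would encode as `null` if the map were empty. A minimal sketch of a stable listing; `sortedModelIDs` is a hypothetical helper, and the demo map is a small excerpt of the mapping in the deleted Python adapter:

```go
package main

import (
	"fmt"
	"sort"
)

// sortedModelIDs returns the map keys in ascending order. The result is
// never nil, so it encodes as [] rather than null when the map is empty.
func sortedModelIDs(m map[string]string) []string {
	ids := make([]string, 0, len(m))
	for id := range m {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	return ids
}

func main() {
	demo := map[string]string{
		"gpt-4o":      "predefined-openai-gpt4o",
		"deepseek-r1": "predefined-deepseek-r1",
		"o3":          "predefined-openai-gpto3",
	}
	fmt.Println(sortedModelIDs(demo))
}
```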
768
+ // Health check endpoint
769
+ func health(c *gin.Context) {
770
+ c.JSON(http.StatusOK, gin.H{
771
+ "status": "ok",
772
+ "keys": len(ondemandAPIKeys),
773
+ })
774
+ }
775
+
776
+ func main() {
777
+ // Set the log format
778
+ log.SetFlags(log.LstdFlags | log.Lshortfile)
779
+
780
+ // Set the Gin mode
781
+ if os.Getenv("GIN_MODE") == "" {
782
+ gin.SetMode(gin.ReleaseMode)
783
+ }
784
+
785
+ router := gin.New()
786
+
787
+ // Middleware
788
+ router.Use(gin.Logger())
789
+ router.Use(gin.Recovery())
790
+ router.Use(checkPrivateKey())
791
+
792
+ // Routes
793
+ router.GET("/", health)
794
+ router.POST("/v1/chat/completions", chatCompletions)
795
+ router.GET("/v1/models", models)
796
+
797
+ // Resolve the port
798
+ port := DefaultPort
799
+ if portStr := os.Getenv("PORT"); portStr != "" {
800
+ if p, err := strconv.Atoi(portStr); err == nil {
801
+ port = p
802
+ }
803
+ }
804
+
805
+ log.Printf("======== OnDemand KEY pool size: %d ========", len(ondemandAPIKeys))
806
+ log.Printf("server listening on port: %d", port)
807
+
808
+ // Start the server
809
+ if err := router.Run(fmt.Sprintf(":%d", port)); err != nil {
810
+ log.Fatal("failed to start server:", err)
811
+ }
812
+ }
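`main` silently falls back to `DefaultPort` when `PORT` is not an integer and accepts any integer, including values outside the TCP port range. A minimal sketch of a stricter resolver; `resolvePort` is a hypothetical helper, and the default of 7860 passed in the demo is assumed from the Dockerfile's `EXPOSE 7860`:

```go
package main

import (
	"fmt"
	"strconv"
)

// resolvePort returns def when raw is empty, and an error when raw is not
// an integer in the valid TCP port range.
func resolvePort(raw string, def int) (int, error) {
	if raw == "" {
		return def, nil
	}
	p, err := strconv.Atoi(raw)
	if err != nil {
		return 0, fmt.Errorf("invalid PORT %q: %w", raw, err)
	}
	if p < 1 || p > 65535 {
		return 0, fmt.Errorf("PORT %d out of range 1-65535", p)
	}
	return p, nil
}

func main() {
	port, err := resolvePort("7860", 7860)
	fmt.Println(port, err)
}
```

Failing fast on a bad `PORT` is a design choice; keeping the silent fallback may be preferable on platforms that always inject a valid value.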
openai_ondemand_adapter.py DELETED
@@ -1,325 +0,0 @@
1
- from flask import Flask, request, Response, jsonify
2
- import requests
3
- import uuid
4
- import time
5
- import json
6
- import threading
7
- import logging
8
- import os
9
-
10
- # ====== Read the private key configured as a Hugging Face Secret =======
11
- PRIVATE_KEY = os.environ.get("PRIVATE_KEY", "") or "testofli"
12
- SAFE_HEADERS = ["Authorization", "X-API-KEY"]
13
-
14
- # Global API access check
15
- def check_private_key():
16
-
17
-
18
- # Some paths, such as the home page, can be exempted here
19
- if request.path in ["/", "/favicon.ico"]:
20
- return
21
- key = None
22
- for header in SAFE_HEADERS:
23
- key = request.headers.get(header)
24
- if key:
25
- if header == "Authorization" and key.startswith("Bearer "):
26
- key = key[len("Bearer "):].strip()
27
- break
28
- if not key or key != PRIVATE_KEY:
29
- return jsonify({
30
- "error": "Unauthorized, must provide correct Authorization or X-API-KEY",
31
- "headers": dict(request.headers)
32
- }), 401
33
-
34
- # Apply authentication to all API routes
35
- app = Flask(__name__)
36
- app.before_request(check_private_key)
37
-
38
- # ========== KEY pool (one per line) ==========
39
- ONDEMAND_APIKEYS = os.environ.get("ONDEMAND_APIKEYS", "").split(",") if os.environ.get("ONDEMAND_APIKEYS") else []
40
- BAD_KEY_RETRY_INTERVAL = 600 # seconds
41
- SESSION_TIMEOUT = 600 # conversation timeout (10 minutes)
42
-
43
- # ========== OnDemand model mapping ==========
44
- MODEL_MAP = {
45
- "o3": "predefined-openai-gpto3",
46
- "o3-mini":"predefined-openai-gpto3-mini",
47
- "o4-mini":"predefined-openai-gpto4-mini",
48
- "gpt-4o": "predefined-openai-gpt4o",
49
- "gpt-4.1": "predefined-openai-gpt4.1",
50
- "gpt-4.1-mini": "predefined-openai-gpt4.1-mini",
51
- "gpt-4o-mini": "predefined-openai-gpt4o-mini",
52
- "deepseek-v3": "predefined-deepseek-v3",
53
- "deepseek-r1": "predefined-deepseek-r1",
54
-
55
- "gemini-2.5-pro":"predefined-gemini-2.5-pro-preview",
56
- "gemini-2.5-flash":"predefined-gemini-2.5-flash",
57
- "claude-4-sonnet": "predefined-claude-4-sonnet",
58
- "claude-4-opus": "predefined-claude-4-opus"
59
- }
60
- DEFAULT_ONDEMAND_MODEL = "predefined-openai-gpt4o"
61
- # ==========================================
62
-
63
- class KeyManager:
64
- def __init__(self, key_list):
65
- self.key_list = list(key_list)
66
- self.lock = threading.Lock()
67
- self.key_status = {k: {"bad": False, "bad_ts": None} for k in self.key_list}
68
- self.idx = 0
69
- # Added: the key and session currently in use
70
- self.current_key = None
71
- self.current_session = None
72
- self.last_used_time = None
73
-
74
- def display_key(self, key):
75
- return f"{key[:6]}...{key[-4:]}"
76
-
77
- def get(self):
78
- with self.lock:
79
- now = time.time()
80
- # Check whether the conversation has timed out
81
- if self.current_key and self.last_used_time and (now - self.last_used_time > SESSION_TIMEOUT):
82
- print(f"[conversation timeout] last used: {time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(self.last_used_time))}")
83
- print(f"[conversation timeout] current time: {time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(now))}")
84
- print(f"[conversation timeout] idle for over {SESSION_TIMEOUT//60} minutes, switching to a new session")
85
- self.current_key = None
86
- self.current_session = None
87
-
88
- # If a key is already in use, keep using it
89
- if self.current_key:
90
- if not self.key_status[self.current_key]["bad"]:
91
- print(f"[chat request] [continuing with API KEY: {self.display_key(self.current_key)}] [status: ok]")
92
- self.last_used_time = now
93
- return self.current_key
94
- else:
95
- # The current key is marked bad, so switch
96
- self.current_key = None
97
- self.current_session = None
98
-
99
- # If there is no current key or it is invalid, pick a new one
100
- total = len(self.key_list)
101
- for _ in range(total):
102
- key = self.key_list[self.idx]
103
- self.idx = (self.idx + 1) % total
104
- s = self.key_status[key]
105
- if not s["bad"]:
106
- print(f"[chat request] [using new API KEY: {self.display_key(key)}] [status: ok]")
107
- self.current_key = key
108
- self.current_session = None # force a new session
109
- self.last_used_time = now
110
- return key
111
- if s["bad"] and s["bad_ts"]:
112
- ago = now - s["bad_ts"]
113
- if ago >= BAD_KEY_RETRY_INTERVAL:
114
- print(f"[KEY auto-recovery] API KEY: {self.display_key(key)} has passed the retry interval, marking as ok")
115
- self.key_status[key]["bad"] = False
116
- self.key_status[key]["bad_ts"] = None
117
- self.current_key = key
118
- self.current_session = None # force a new session
119
- self.last_used_time = now
120
- return key
121
-
122
- print("[warning] all KEYs are disabled; forcing the first KEY:", self.display_key(self.key_list[0]))
123
- for k in self.key_list:
124
- self.key_status[k]["bad"] = False
125
- self.key_status[k]["bad_ts"] = None
126
- self.idx = 0
127
- self.current_key = self.key_list[0]
128
- self.current_session = None # force a new session
129
- self.last_used_time = now
130
- print(f"[chat request] [using API KEY: {self.display_key(self.current_key)}] [status: forced attempt (all keys bad)]")
131
- return self.current_key
132
-
133
- def mark_bad(self, key):
134
- with self.lock:
135
- if key in self.key_status and not self.key_status[key]["bad"]:
136
- print(f"[KEY disabled] API KEY: {self.display_key(key)}, the API reported it invalid (will retry automatically in {BAD_KEY_RETRY_INTERVAL//60} minutes)")
137
- self.key_status[key]["bad"] = True
138
- self.key_status[key]["bad_ts"] = time.time()
139
- if self.current_key == key:
140
- self.current_key = None
141
- self.current_session = None
142
-
143
- def get_session(self, apikey):
144
- with self.lock:
145
- if not self.current_session:
146
- try:
147
- self.current_session = create_session(apikey)
148
- print(f"[new session created] SESSION ID: {self.current_session}")
149
- except Exception as e:
150
- print(f"[session creation failed] error: {str(e)}")
151
- raise
152
- self.last_used_time = time.time()
153
- return self.current_session
154
-
155
- keymgr = KeyManager(ONDEMAND_APIKEYS)
156
-
157
- ONDEMAND_API_BASE = "https://api.on-demand.io/chat/v1"
158
-
159
- def get_endpoint_id(openai_model):
160
- m = str(openai_model or "").lower().replace(" ", "")
161
- return MODEL_MAP.get(m, DEFAULT_ONDEMAND_MODEL)
162
-
163
- def create_session(apikey, external_user_id=None, plugin_ids=None):
164
- url = f"{ONDEMAND_API_BASE}/sessions"
165
- payload = {"externalUserId": external_user_id or str(uuid.uuid4())}
166
- if plugin_ids is not None:
167
- payload["pluginIds"] = plugin_ids
168
- headers = {"apikey": apikey, "Content-Type": "application/json"}
169
- resp = requests.post(url, json=payload, headers=headers, timeout=20)
170
- resp.raise_for_status()
171
- return resp.json()["data"]["id"]
172
-
173
- def format_openai_sse_delta(chunk_str):
174
- return f"data: {json.dumps(chunk_str, ensure_ascii=False)}\n\n"
175
-
176
- @app.route("/v1/chat/completions", methods=["POST"])
177
- def chat_completions():
178
- data = request.json
179
- if not data or "messages" not in data:
180
- return jsonify({"error": "request is missing the messages field"}), 400
181
-
182
- messages = data["messages"]
183
- openai_model = data.get("model", "gpt-4o")
184
- endpoint_id = get_endpoint_id(openai_model)
185
- is_stream = bool(data.get("stream", False))
186
-
187
- user_msg = None
188
- for msg in reversed(messages):
189
- if msg.get("role") == "user":
190
- user_msg = msg.get("content")
191
- break
192
- if user_msg is None:
193
- return jsonify({"error": "no user message found"}), 400
194
-
195
- def with_valid_key(func):
196
- bad_cnt = 0
197
- max_retry = len(keymgr.key_list)*2
198
- while bad_cnt < max_retry:
199
- key = keymgr.get()
200
- try:
201
- return func(key)
202
- except Exception as e:
203
- if hasattr(e, 'response'):
204
- r = e.response
205
- if r.status_code in (401, 403, 429, 500):
206
- keymgr.mark_bad(key)
207
- bad_cnt += 1
208
- continue
209
- raise
210
- return jsonify({"error": "no usable API KEY; please add new keys or contact technical support"}), 500
211
-
212
- if is_stream:
213
- def generate():
214
- def do_once(apikey):
215
- # Use KeyManager to get or create the session
216
- sid = keymgr.get_session(apikey)
217
- url = f"{ONDEMAND_API_BASE}/sessions/{sid}/query"
218
- payload = {
219
- "query": user_msg,
220
- "endpointId": endpoint_id,
221
- "pluginIds": [],
222
- "responseMode": "stream"
223
- }
224
- headers = {"apikey": apikey, "Content-Type": "application/json", "Accept": "text/event-stream"}
225
- with requests.post(url, json=payload, headers=headers, stream=True, timeout=120) as resp:
226
- if resp.status_code != 200:
227
- raise requests.HTTPError(response=resp)
228
- answer_acc = ""
229
- first_chunk = True
230
- for line in resp.iter_lines():
231
- if not line:
232
- continue
233
- line = line.decode("utf-8")
234
- if line.startswith("data:"):
235
- datapart = line[5:].strip()
236
- if datapart == "[DONE]":
237
- yield "data: [DONE]\n\n"
238
- break
239
- elif datapart.startswith("[ERROR]:"):
240
- err_json = datapart[len("[ERROR]:"):].strip()
241
- yield format_openai_sse_delta({"error": err_json})
242
- break
243
- else:
244
- try:
245
- js = json.loads(datapart)
246
- except Exception:
247
- continue
248
- if js.get("eventType") == "fulfillment":
249
- delta = js.get("answer", "")
250
- answer_acc += delta
251
- chunk = {
252
- "id": "chatcmpl-" + str(uuid.uuid4())[:8],
253
- "object": "chat.completion.chunk",
254
- "created": int(time.time()),
255
- "model": openai_model,
256
- "choices": [{
257
- "delta": {
258
- "role": "assistant",
259
- "content": delta
260
- } if first_chunk else {
261
- "content": delta
262
- },
263
- "index": 0,
264
- "finish_reason": None
265
- }]
266
- }
267
- yield format_openai_sse_delta(chunk)
268
- first_chunk = False
269
- yield "data: [DONE]\n\n"
270
- yield from with_valid_key(do_once)
271
- return Response(generate(), content_type='text/event-stream')
272
-
273
- def nonstream(apikey):
274
- # Use KeyManager to get or create the session
275
- sid = keymgr.get_session(apikey)
276
- url = f"{ONDEMAND_API_BASE}/sessions/{sid}/query"
277
- payload = {
278
- "query": user_msg,
279
- "endpointId": endpoint_id,
280
- "pluginIds": [],
281
- "responseMode": "sync"
282
- }
283
- headers = {"apikey": apikey, "Content-Type": "application/json"}
284
- resp = requests.post(url, json=payload, headers=headers, timeout=120)
285
- if resp.status_code != 200:
286
- raise requests.HTTPError(response=resp)
287
- ai_response = resp.json()["data"]["answer"]
288
- resp_obj = {
289
- "id": "chatcmpl-" + str(uuid.uuid4())[:8],
290
- "object": "chat.completion",
291
- "created": int(time.time()),
292
- "model": openai_model,
293
- "choices": [
294
- {
295
- "index": 0,
296
- "message": {"role": "assistant", "content": ai_response},
297
- "finish_reason": "stop"
298
- }
299
- ],
300
- "usage": {}
301
- }
302
- return jsonify(resp_obj)
303
-
304
- return with_valid_key(nonstream)
305
-
306
- @app.route("/v1/models", methods=["GET"])
307
- def models():
308
- model_objs = []
309
- for mdl in MODEL_MAP.keys():
310
- model_objs.append({
311
- "id": mdl,
312
- "object": "model",
313
- "owned_by": "ondemand-proxy"
314
- })
315
- uniq = {m["id"]: m for m in model_objs}.values()
316
- return jsonify({
317
- "object": "list",
318
- "data": list(uniq)
319
- })
320
-
321
- if __name__ == "__main__":
322
- log_fmt = '[%(asctime)s] %(levelname)s: %(message)s'
323
- logging.basicConfig(level=logging.INFO, format=log_fmt)
324
- print("======== OnDemand KEY pool size:", len(ONDEMAND_APIKEYS), "========")
325
- app.run(host="0.0.0.0", port=7860, debug=False)
requirements.txt DELETED
@@ -1,3 +0,0 @@
1
- Flask
2
- requests
3
- gunicorn