ZyphrZero committed on
Commit
4e4ca52
·
1 Parent(s): b5594ae

feat: add Anthropic API support and refactor into modules


Main changes:
- Add an Anthropic API-compatible endpoint (/v1/messages)
- Split the monolithic main.py into a modular architecture (app/*)
- Follow FastAPI best practices to improve maintainability
- Update documentation and configuration notes

Technical details:
- app/api/anthropic.py: Anthropic API compatibility layer
- app/api/openai.py: OpenAI API endpoints
- app/core/: core modules for configuration and response handling
- app/models/: Pydantic data models
- app/utils/: utility functions and SSE parser
- .env.example: sample environment configuration file

BREAKING CHANGE: main.py shrinks from 1446 lines to 46 and now serves only as the application entry point

.env.example ADDED
@@ -0,0 +1,92 @@
+ # Z.AI API proxy service sample configuration file
+ # Copy this file to .env and adjust the values as needed
+
+ # =============================================================================
+ # API authentication
+ # =============================================================================
+
+ # Client authentication key (shared by OpenAI and Anthropic)
+ # Clients must use this key to authenticate
+ AUTH_TOKEN=sk-your-api-key
+
+ # Anthropic API client authentication key (optional)
+ # Falls back to the value of AUTH_TOKEN when unset
+ # ANTHROPIC_API_KEY=sk-your-api-key
+
+ # Backup authentication token (used when anonymous mode fails)
+ BACKUP_TOKEN=eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6IjMxNmJjYjQ4LWZmMmYtNGExNS04NTNkLWYyYTI5YjY3ZmYwZiIsImVtYWlsIjoiR3Vlc3QtMTc1NTg0ODU4ODc4OEBndWVzdC5jb20ifQ.PktllDySS3trlyuFpTeIZf-7hl8Qu1qYF3BxjgIul0BrNux2nX9hVzIjthLXKMWAf9V0qM8Vm_iyDqkjPGsaiQ
+
+ # =============================================================================
+ # Upstream service
+ # =============================================================================
+
+ # Z.AI upstream API address
+ API_ENDPOINT=https://chat.z.ai/api/chat/completions
+
+ # =============================================================================
+ # Models
+ # =============================================================================
+
+ # Default model name
+ PRIMARY_MODEL=GLM-4.5
+
+ # Thinking-mode model name
+ THINKING_MODEL=GLM-4.5-Thinking
+
+ # Search-mode model name
+ SEARCH_MODEL=GLM-4.5-Search
+
+ # =============================================================================
+ # Server
+ # =============================================================================
+
+ # Listen port
+ LISTEN_PORT=8080
+
+ # Debug logging switch (true/false)
+ DEBUG_LOGGING=true
+
+ # =============================================================================
+ # Features
+ # =============================================================================
+
+ # Thinking-content handling strategy
+ # think: convert to <span> tags
+ # strip: remove thinking content
+ # raw: keep the original format
+ THINKING_PROCESSING=think
+
+ # Anonymous mode switch (true/false)
+ # When enabled, a temporary token is used to avoid sharing conversation history
+ ANONYMOUS_MODE=true
+
+ # Function call support switch (true/false)
+ TOOL_SUPPORT=true
+
+ # Tool-call scan limit (characters)
+ SCAN_LIMIT=200000
+
+ # =============================================================================
+ # Usage
+ # =============================================================================
+ #
+ # 1. Copy this file:
+ #    cp .env.example .env
+ #
+ # 2. Adjust the values as needed
+ #
+ # 3. Start the service:
+ #    python main.py
+ #
+ # 4. OpenAI client example:
+ #    client = openai.OpenAI(
+ #        base_url="http://localhost:8080/v1",
+ #        api_key="your-auth-token"  # the value of AUTH_TOKEN
+ #    )
+ #
+ # 5. Anthropic client example:
+ #    client = anthropic.Anthropic(
+ #        base_url="http://localhost:8080/v1",
+ #        api_key="your-auth-token"  # the value of AUTH_TOKEN (or a separately configured ANTHROPIC_API_KEY)
+ #    )
+ #
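The file above is plain `.env` syntax: one `KEY=value` per line, with `#` comments and blank lines ignored. As a rough illustration of how such lines map to a config dict (the `parse_env` helper below is hypothetical, not part of this repo; libraries like python-dotenv do the real work):

```python
def parse_env(text: str) -> dict:
    """Minimal .env parsing: skip blanks and '#' comments, split on the first '='."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        config[key.strip()] = value.strip()
    return config
```

Real `.env` loaders also handle quoting and variable expansion; this sketch only covers the subset used in the sample file.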
README.md CHANGED
@@ -1,14 +1,15 @@
- # Z.AI OpenAI API proxy service

  ![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)
  ![Python: 3.8+](https://img.shields.io/badge/python-3.8+-green.svg)
  ![FastAPI](https://img.shields.io/badge/framework-FastAPI-009688.svg)

- A lightweight proxy service that provides an OpenAI API-compatible interface for Z.AI, with full support for the GLM-4.5 model family.

  ## ✨ Core features

  - 🔌 **Full OpenAI API compatibility** - seamless integration with existing applications
  - 🚀 **High-performance streaming** - Server-Sent Events (SSE) support
  - 🛠️ **Function call support** - complete tool-calling capability
  - 🧠 **Thinking-mode support** - intelligent handling of the model's reasoning process
@@ -45,6 +46,8 @@ python main.py

  ### Basic usage

  ```python
  import openai

@@ -64,6 +67,29 @@ response = client.chat.completions.create(
  print(response.choices[0].message.content)
  ```

  ### Docker deployment

  ```bash
@@ -134,7 +160,7 @@ for chunk in response:

  | Variable | Default | Description |
  |--------|--------|------|
- | `AUTH_TOKEN` | `sk-your-api-key` | Client authentication key |
  | `API_ENDPOINT` | `https://chat.z.ai/api/chat/completions` | Upstream API address |
  | `LISTEN_PORT` | `8080` | Listen port |
  | `DEBUG_LOGGING` | `true` | Debug logging switch |
@@ -205,7 +231,10 @@ if response.choices[0].message.tool_calls:
  ## ❓ FAQ

  **Q: How do I obtain AUTH_TOKEN?**
- A: `AUTH_TOKEN` is a self-defined API key, configured in the `ServerConfig` class in `main.py` or via an environment variable; client and server must use the same value.

  **Q: What is anonymous mode?**
  A: Anonymous mode uses a temporary token to avoid sharing conversation history, protecting privacy.
@@ -216,17 +245,23 @@ A: Implemented via smart prompt injection, converting tool definitions into system prompts.
  **Q: Which OpenAI features are supported?**
  A: Chat completions, model listing, streaming responses, tool calls, and other core features.

  **Q: How do I customize the configuration?**
- A: Via environment variables or by modifying the `ServerConfig` class in `main.py`.

  ## 🏗️ Technical architecture

  ```
- ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
- │   OpenAI    │     │    Proxy    │     │    Z.AI     │
- │   Client    │────▶│   Server    │────▶│     API     │
- │             │     │             │     │             │
- └─────────────┘     └─────────────┘     └─────────────┘
  ```

  - **FastAPI** - high-performance web framework
+ # Z.AI OpenAI & Anthropic API proxy service

  ![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)
  ![Python: 3.8+](https://img.shields.io/badge/python-3.8+-green.svg)
  ![FastAPI](https://img.shields.io/badge/framework-FastAPI-009688.svg)

+ A lightweight proxy service that provides OpenAI- and Anthropic-compatible API interfaces for Z.AI, with full support for the GLM-4.5 model family.

  ## ✨ Core features

  - 🔌 **Full OpenAI API compatibility** - seamless integration with existing applications
+ - 🎭 **Anthropic API compatibility** - Claude CLI clients can connect directly
  - 🚀 **High-performance streaming** - Server-Sent Events (SSE) support
  - 🛠️ **Function call support** - complete tool-calling capability
  - 🧠 **Thinking-mode support** - intelligent handling of the model's reasoning process

  ### Basic usage

+ #### OpenAI API client
+
  ```python
  import openai

  print(response.choices[0].message.content)
  ```

+ #### Anthropic API client
+
+ ```python
+ import anthropic
+
+ # Initialize the client
+ client = anthropic.Anthropic(
+     base_url="http://localhost:8080/v1",
+     api_key="your-anthropic-token"  # replace with your ANTHROPIC_API_KEY
+ )
+
+ # Basic conversation
+ message = client.messages.create(
+     model="GLM-4.5",
+     max_tokens=1024,
+     messages=[
+         {"role": "user", "content": "Hello, give me an introduction to Python"}
+     ]
+ )
+
+ print(message.content[0].text)
+ ```
+
  ### Docker deployment

  ```bash

  | Variable | Default | Description |
  |--------|--------|------|
+ | `AUTH_TOKEN` | `sk-your-api-key` | Client authentication key (shared by OpenAI and Anthropic) |
  | `API_ENDPOINT` | `https://chat.z.ai/api/chat/completions` | Upstream API address |
  | `LISTEN_PORT` | `8080` | Listen port |
  | `DEBUG_LOGGING` | `true` | Debug logging switch |

  ## ❓ FAQ

  **Q: How do I obtain AUTH_TOKEN?**
+ A: `AUTH_TOKEN` is a self-defined API key, configured via environment variables; client and server must use the same value.
+
+ **Q: How is ANTHROPIC_API_KEY configured?**
+ A: By default it takes the value of `AUTH_TOKEN`, so both APIs share one authentication key. To configure them separately, set the `ANTHROPIC_API_KEY` environment variable.

  **Q: What is anonymous mode?**
  A: Anonymous mode uses a temporary token to avoid sharing conversation history, protecting privacy.

  **Q: Which OpenAI features are supported?**
  A: Chat completions, model listing, streaming responses, tool calls, and other core features.

+ **Q: Which Anthropic API features are supported?**
+ A: Message creation, streaming responses, system prompts, and other core features.
+
  **Q: How do I customize the configuration?**
+ A: Via environment variables; a `.env` file is recommended.

  ## 🏗️ Technical architecture

  ```
+ ┌──────────────┐     ┌─────────────┐     ┌─────────────┐
+ │    OpenAI    │     │             │     │             │
+ │    Client    │────▶│    Proxy    │────▶│    Z.AI     │
+ └──────────────┘     │   Server    │     │     API     │
+ ┌──────────────┐     │             │     │             │
+ │  Anthropic   │────▶│             │     │             │
+ │    Client    │     │             │     │             │
+ └──────────────┘     └─────────────┘     └─────────────┘
  ```

  - **FastAPI** - high-performance web framework
app/__init__.py ADDED
@@ -0,0 +1,7 @@
+ """
+ Application package initialization
+ """
+
+ from app import api, core, models, utils
+
+ __all__ = ["api", "core", "models", "utils"]
app/api/__init__.py ADDED
@@ -0,0 +1,7 @@
+ """
+ API module initialization
+ """
+
+ from app.api import openai, anthropic
+
+ __all__ = ["openai", "anthropic"]
app/api/anthropic.py ADDED
@@ -0,0 +1,273 @@
+ """
+ Anthropic API compatibility endpoints
+ """
+
+ import json
+ import time
+ import uuid
+ from typing import Generator
+
+ import requests
+ from fastapi import APIRouter, Header, HTTPException
+ from fastapi.responses import StreamingResponse
+
+ from app.core.config import settings
+ from app.models.schemas import (
+     AnthropicRequest, Message, UpstreamRequest, ModelItem,
+     ContentBlock
+ )
+ from app.utils.helpers import debug_log, generate_request_ids, get_auth_token, get_browser_headers, transform_thinking_content
+
+ router = APIRouter()
+
+
+ def stream_anthropic_generator(upstream_response: requests.Response, request_id: str, requested_model: str) -> Generator[str, None, None]:
+     """Yield Anthropic-compatible streaming response events"""
+     usage = {"input_tokens": 0, "output_tokens": 0}
+
+     # Emit message_start (the data payload carries the event type, matching the Anthropic SSE format)
+     start_event = {
+         "type": "message_start",
+         "message": {
+             "id": request_id,
+             "type": "message",
+             "role": "assistant",
+             "content": [],
+             "model": requested_model,
+             "stop_reason": None,
+             "stop_sequence": None,
+             "usage": usage
+         }
+     }
+     yield f"event: message_start\ndata: {json.dumps(start_event)}\n\n"
+
+     # Emit content_block_start
+     content_start_data = {
+         "type": "content_block_start",
+         "index": 0,
+         "content_block": {
+             "type": "text",
+             "text": ""
+         }
+     }
+     yield f"event: content_block_start\ndata: {json.dumps(content_start_data)}\n\n"
+
+     # Process the upstream response
+     for line in upstream_response.iter_lines():
+         if not line.startswith(b"data:"):
+             continue
+         data_str = line[5:].strip()
+         if not data_str:
+             continue
+         try:
+             data = json.loads(data_str.decode('utf-8'))
+             delta_content = data.get("data", {}).get("delta_content", "")
+             phase = data.get("data", {}).get("phase", "")
+
+             # Handle content deltas
+             if delta_content:
+                 out_content = transform_thinking_content(delta_content) if phase == "thinking" else delta_content
+                 if out_content:
+                     usage["output_tokens"] += len(out_content) // 4  # rough estimate
+                     delta_data = {
+                         "type": "content_block_delta",
+                         "index": 0,
+                         "delta": {
+                             "type": "text_delta",
+                             "text": out_content
+                         }
+                     }
+                     yield f"event: content_block_delta\ndata: {json.dumps(delta_data)}\n\n"
+
+             # Handle end of stream
+             if data.get("data", {}).get("done", False) or phase == "done":
+                 # Emit content_block_stop
+                 content_stop_data = {
+                     "type": "content_block_stop",
+                     "index": 0
+                 }
+                 yield f"event: content_block_stop\ndata: {json.dumps(content_stop_data)}\n\n"
+
+                 # Emit message_delta (usage is a sibling of delta in the Anthropic event shape)
+                 message_delta_data = {
+                     "type": "message_delta",
+                     "delta": {
+                         "stop_reason": "end_turn",
+                         "stop_sequence": None
+                     },
+                     "usage": {
+                         "input_tokens": usage["input_tokens"],
+                         "output_tokens": usage["output_tokens"]
+                     }
+                 }
+                 yield f"event: message_delta\ndata: {json.dumps(message_delta_data)}\n\n"
+
+                 # Emit message_stop
+                 yield f"event: message_stop\ndata: {json.dumps({'type': 'message_stop'})}\n\n"
+                 break
+
+         except json.JSONDecodeError:
+             continue
+
+
+ @router.post("/v1/messages")
+ async def handle_anthropic_message(
+     req: AnthropicRequest,
+     x_api_key: str = Header(None, alias="x-api-key"),
+     authorization: str = Header(None, alias="authorization")
+ ):
+     """Handle Anthropic message requests"""
+     debug_log("Received Anthropic message request")
+
+     # Validate the API key
+     api_key = None
+     if x_api_key:
+         api_key = x_api_key
+     elif authorization and authorization.startswith("Bearer "):
+         api_key = authorization[7:]
+
+     if not api_key or api_key != settings.ANTHROPIC_API_KEY:
+         debug_log(f"Invalid API key: {api_key}")
+         raise HTTPException(status_code=401, detail="Invalid API key")
+
+     debug_log("API key validated")
+     debug_log(f"Request parsed - model: {req.model}, stream: {req.stream}, messages: {len(req.messages)}")
+
+     # Determine the upstream model and features
+     upstream_model = "GLM-4.5"
+     if req.model == settings.THINKING_MODEL:
+         upstream_model = "GLM-4.5-Thinking"
+     elif req.model == settings.SEARCH_MODEL:
+         upstream_model = "GLM-4.5-Search"
+
+     debug_log(f"Received request (model: {req.model}) -> proxying upstream (model: {upstream_model})")
+
+     # Generate IDs
+     chat_id, msg_id = generate_request_ids()
+
+     # Convert the message format
+     openai_messages = []
+     if req.system:
+         # system content may be a string or a list of content blocks
+         if isinstance(req.system, str):
+             system_content = req.system
+         else:
+             system_content = ""
+             for block in req.system:
+                 if block.type == "text":
+                     system_content += block.text
+
+         openai_messages.append({"role": "system", "content": system_content})
+
+     for msg in req.messages:
+         # message content may also be a string or a list of content blocks
+         if isinstance(msg.content, str):
+             text_content = msg.content
+         else:
+             text_content = ""
+             for block in msg.content:
+                 if block.type == "text":
+                     text_content += block.text
+
+         openai_messages.append({
+             "role": msg.role,
+             "content": text_content
+         })
+
+     # Build the upstream request
+     upstream_messages = []
+     for msg in openai_messages:
+         content = msg.get("content", "")
+         if content is None:
+             content = ""
+         upstream_messages.append(Message(
+             role=msg["role"],
+             content=content
+         ))
+
+     upstream_req = UpstreamRequest(
+         stream=True,  # always stream from upstream
+         chat_id=chat_id,
+         id=msg_id,
+         model="0727-360B-API",  # actual upstream model ID
+         messages=upstream_messages,
+         params={},
+         features={"enable_thinking": True},
+         background_tasks={
+             "title_generation": False,
+             "tags_generation": False,
+         },
+         mcp_servers=[],
+         model_item=ModelItem(
+             id="0727-360B-API",
+             name="GLM-4.5",
+             owned_by="openai"
+         ),
+         tool_servers=[],
+         variables={
+             "{{USER_NAME}}": "User",
+             "{{USER_LOCATION}}": "Unknown",
+             "{{CURRENT_DATETIME}}": time.strftime("%Y-%m-%d %H:%M:%S"),
+         }
+     )
+
+     # Obtain the authentication token
+     auth_token = get_auth_token()
+
+     try:
+         # Call the upstream API
+         headers = get_browser_headers(chat_id)
+         headers["Authorization"] = f"Bearer {auth_token}"
+
+         response = requests.post(
+             settings.API_ENDPOINT,
+             json=upstream_req.model_dump(exclude_none=True),
+             headers=headers,
+             timeout=60.0,
+             stream=True
+         )
+         response.raise_for_status()
+     except requests.HTTPError as e:
+         debug_log(f"Upstream API returned an error status: {e.response.status_code}, response: {e.response.text}")
+         raise HTTPException(status_code=502, detail="Upstream API error")
+     except requests.RequestException as e:
+         debug_log(f"Upstream API request failed: {e}")
+         raise HTTPException(status_code=502, detail=f"Failed to call upstream API: {e}")
+
+     request_id = f"msg_{uuid.uuid4().hex}"
+
+     if req.stream:
+         # Streaming response
+         return StreamingResponse(
+             stream_anthropic_generator(response, request_id, req.model),
+             media_type="text/event-stream",
+             headers={"Cache-Control": "no-cache", "Connection": "keep-alive"}
+         )
+     else:
+         # Non-streaming response
+         full_content = ""
+         for line in response.iter_lines():
+             if not line.startswith(b"data:"):
+                 continue
+             data_str = line[5:].strip()
+             if not data_str:
+                 continue
+             try:
+                 data = json.loads(data_str.decode('utf-8'))
+                 delta_content = data.get("data", {}).get("delta_content", "")
+                 phase = data.get("data", {}).get("phase", "")
+                 if delta_content:
+                     out_content = transform_thinking_content(delta_content) if phase == "thinking" else delta_content
+                     if out_content:
+                         full_content += out_content
+                 if data.get("data", {}).get("done", False) or phase == "done":
+                     break
+             except json.JSONDecodeError:
+                 continue
+
+         return {
+             "id": request_id,
+             "type": "message",
+             "role": "assistant",
+             "model": req.model,
+             "content": [{"type": "text", "text": full_content}],
+             "stop_reason": "end_turn",
+             "usage": {"input_tokens": 0, "output_tokens": len(full_content) // 4}
+         }
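Every event the generator above yields uses the same SSE framing: an `event:` line, a `data:` line holding a JSON payload, and a blank-line terminator. That framing can be sketched as a standalone helper (`format_sse_event` is a hypothetical name for illustration, not a function in this repo):

```python
import json

def format_sse_event(event_type: str, payload: dict) -> str:
    """One SSE frame: event line, data line with a JSON payload, blank separator."""
    return f"event: {event_type}\ndata: {json.dumps(payload)}\n\n"
```

Clients split the stream on the blank line between frames, so the trailing `\n\n` is load-bearing, not cosmetic.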
app/api/openai.py ADDED
@@ -0,0 +1,161 @@
+ """
+ OpenAI API endpoints
+ """
+
+ import time
+ import traceback
+ from datetime import datetime
+ from typing import List
+
+ from fastapi import APIRouter, Header, HTTPException
+ from fastapi.responses import StreamingResponse
+
+ from app.core.config import settings
+ from app.models.schemas import (
+     OpenAIRequest, Message, UpstreamRequest, ModelItem,
+     ModelsResponse, Model
+ )
+ from app.utils.helpers import debug_log, generate_request_ids, get_auth_token
+ from app.utils.tools import process_messages_with_tools
+ from app.core.response_handlers import StreamResponseHandler, NonStreamResponseHandler
+
+ router = APIRouter()
+
+
+ @router.get("/v1/models")
+ async def list_models():
+     """List available models"""
+     current_time = int(time.time())
+     response = ModelsResponse(
+         data=[
+             Model(
+                 id=settings.PRIMARY_MODEL,
+                 created=current_time,
+                 owned_by="z.ai"
+             ),
+             Model(
+                 id=settings.THINKING_MODEL,
+                 created=current_time,
+                 owned_by="z.ai"
+             ),
+             Model(
+                 id=settings.SEARCH_MODEL,
+                 created=current_time,
+                 owned_by="z.ai"
+             ),
+         ]
+     )
+     return response
+
+
+ @router.post("/v1/chat/completions")
+ async def chat_completions(
+     request: OpenAIRequest,
+     authorization: str = Header(...)
+ ):
+     """Handle chat completion requests"""
+     debug_log("Received chat completions request")
+
+     try:
+         # Validate the API key
+         if not authorization.startswith("Bearer "):
+             debug_log("Missing or invalid Authorization header")
+             raise HTTPException(status_code=401, detail="Missing or invalid Authorization header")
+
+         api_key = authorization[7:]
+         if api_key != settings.AUTH_TOKEN:
+             debug_log(f"Invalid API key: {api_key}")
+             raise HTTPException(status_code=401, detail="Invalid API key")
+
+         debug_log(f"API key validated, AUTH_TOKEN={api_key[:8]}......")
+         debug_log(f"Request parsed - model: {request.model}, stream: {request.stream}, messages: {len(request.messages)}")
+
+         # Generate IDs
+         chat_id, msg_id = generate_request_ids()
+
+         # Process messages with tools
+         processed_messages = process_messages_with_tools(
+             [m.model_dump() for m in request.messages],
+             request.tools,
+             request.tool_choice
+         )
+
+         # Convert back to Message objects
+         upstream_messages: List[Message] = []
+         for msg in processed_messages:
+             content = msg.get("content")
+             # Ensure content is not None for the Message model
+             if content is None:
+                 content = ""
+
+             upstream_messages.append(Message(
+                 role=msg["role"],
+                 content=content,
+                 reasoning_content=msg.get("reasoning_content")
+             ))
+
+         # Determine model features
+         is_thinking = request.model == settings.THINKING_MODEL
+         is_search = request.model == settings.SEARCH_MODEL
+         search_mcp = "deep-web-search" if is_search else ""
+
+         # Build the upstream request
+         upstream_req = UpstreamRequest(
+             stream=True,  # always stream from upstream
+             chat_id=chat_id,
+             id=msg_id,
+             model="0727-360B-API",  # actual upstream model ID
+             messages=upstream_messages,
+             params={},
+             features={
+                 "enable_thinking": is_thinking,
+                 "web_search": is_search,
+                 "auto_web_search": is_search,
+             },
+             background_tasks={
+                 "title_generation": False,
+                 "tags_generation": False,
+             },
+             mcp_servers=[search_mcp] if search_mcp else [],
+             model_item=ModelItem(
+                 id="0727-360B-API",
+                 name="GLM-4.5",
+                 owned_by="openai"
+             ),
+             tool_servers=[],
+             variables={
+                 "{{USER_NAME}}": "User",
+                 "{{USER_LOCATION}}": "Unknown",
+                 "{{CURRENT_DATETIME}}": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
+             }
+         )
+
+         # Obtain the authentication token
+         auth_token = get_auth_token()
+
+         # Check whether tools are enabled and present
+         has_tools = (settings.TOOL_SUPPORT and
+                      request.tools and
+                      len(request.tools) > 0 and
+                      request.tool_choice != "none")
+
+         # Handle the response based on the stream flag
+         if request.stream:
+             handler = StreamResponseHandler(upstream_req, chat_id, auth_token, has_tools)
+             return StreamingResponse(
+                 handler.handle(),
+                 media_type="text/event-stream",
+                 headers={
+                     "Cache-Control": "no-cache",
+                     "Connection": "keep-alive",
+                 }
+             )
+         else:
+             handler = NonStreamResponseHandler(upstream_req, chat_id, auth_token, has_tools)
+             return handler.handle()
+
+     except HTTPException:
+         raise
+     except Exception as e:
+         debug_log(f"Error while handling request: {str(e)}")
+         debug_log(f"Traceback: {traceback.format_exc()}")
+         raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
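The endpoint validates the `Authorization` header by requiring a `Bearer ` prefix and comparing the remainder against `AUTH_TOKEN`. That parsing step in isolation (the `extract_bearer` helper is a hypothetical name for illustration, not part of this repo):

```python
from typing import Optional

def extract_bearer(authorization: Optional[str]) -> Optional[str]:
    """Mirror the endpoint's check: accept only 'Bearer <key>' headers."""
    if not authorization or not authorization.startswith("Bearer "):
        return None
    return authorization[7:]  # drop the 7-character "Bearer " prefix
```

A `None` result corresponds to the 401 paths above (missing header, wrong scheme); a non-`None` result is then compared against the configured token.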
app/core/__init__.py ADDED
@@ -0,0 +1,7 @@
+ """
+ Core module initialization
+ """
+
+ from app.core import config, response_handlers
+
+ __all__ = ["config", "response_handlers"]
app/core/config.py ADDED
@@ -0,0 +1,51 @@
+ """
+ FastAPI application configuration module
+ """
+
+ import os
+ from typing import Dict, Optional
+ from pydantic_settings import BaseSettings
+
+
+ class Settings(BaseSettings):
+     """Application settings"""
+
+     # API configuration
+     API_ENDPOINT: str = os.getenv("API_ENDPOINT", "https://chat.z.ai/api/chat/completions")
+     AUTH_TOKEN: str = os.getenv("AUTH_TOKEN", "sk-your-api-key")
+     ANTHROPIC_API_KEY: str = os.getenv("ANTHROPIC_API_KEY", AUTH_TOKEN)
+     BACKUP_TOKEN: str = os.getenv("BACKUP_TOKEN", "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6IjMxNmJjYjQ4LWZmMmYtNGExNS04NTNkLWYyYTI5YjY3ZmYwZiIsImVtYWlsIjoiR3Vlc3QtMTc1NTg0ODU4ODc4OEBndWVzdC5jb20ifQ.PktllDySS3trlyuFpTeIZf-7hl8Qu1qYF3BxjgIul0BrNux2nX9hVzIjthLXKMWAf9V0qM8Vm_iyDqkjPGsaiQ")
+
+     # Model configuration
+     PRIMARY_MODEL: str = os.getenv("PRIMARY_MODEL", "GLM-4.5")
+     THINKING_MODEL: str = os.getenv("THINKING_MODEL", "GLM-4.5-Thinking")
+     SEARCH_MODEL: str = os.getenv("SEARCH_MODEL", "GLM-4.5-Search")
+
+     # Server configuration
+     LISTEN_PORT: int = int(os.getenv("LISTEN_PORT", "8080"))
+     DEBUG_LOGGING: bool = os.getenv("DEBUG_LOGGING", "true").lower() == "true"
+
+     # Feature configuration
+     THINKING_PROCESSING: str = os.getenv("THINKING_PROCESSING", "think")  # strip: remove <details> tags; think: convert to <span> tags; raw: keep as-is
+     ANONYMOUS_MODE: bool = os.getenv("ANONYMOUS_MODE", "true").lower() == "true"
+     TOOL_SUPPORT: bool = os.getenv("TOOL_SUPPORT", "true").lower() == "true"
+     SCAN_LIMIT: int = int(os.getenv("SCAN_LIMIT", "200000"))
+
+     # Browser headers
+     CLIENT_HEADERS: Dict[str, str] = {
+         "Content-Type": "application/json",
+         "Accept": "application/json, text/event-stream",
+         "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/139.0.0.0 Safari/537.36 Edg/139.0.0.0",
+         "Accept-Language": "zh-CN",
+         "sec-ch-ua": '"Not;A=Brand";v="99", "Microsoft Edge";v="139", "Chromium";v="139"',
+         "sec-ch-ua-mobile": "?0",
+         "sec-ch-ua-platform": '"Windows"',
+         "X-FE-Version": "prod-fe-1.0.70",
+         "Origin": "https://chat.z.ai",
+     }
+
+     class Config:
+         env_file = ".env"
+
+
+ settings = Settings()
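The boolean settings above all follow the same convention: the environment variable's value, lowercased, must equal the literal string `"true"`, so anything else (including `"1"` or `"yes"`) reads as `False`. A minimal sketch of that convention (`env_flag` is a hypothetical helper for illustration, not part of this repo):

```python
import os

def env_flag(name: str, default: str = "true") -> bool:
    """Same convention as Settings: only the string 'true' (case-insensitive) enables a flag."""
    return os.getenv(name, default).lower() == "true"
```

Worth keeping in mind when writing a `.env` file: `DEBUG_LOGGING=1` would silently disable debug logging under this rule.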
app/core/response_handlers.py ADDED
@@ -0,0 +1,331 @@
+ """
+ Response handlers for streaming and non-streaming responses
+ """
+
+ import json
+ import time
+ from typing import Generator, Optional
+
+ import requests
+ from fastapi import HTTPException
+ from fastapi.responses import JSONResponse, StreamingResponse
+
+ from app.core.config import settings
+ from app.models.schemas import (
+     Message, Delta, Choice, Usage, OpenAIResponse,
+     UpstreamRequest, UpstreamData, UpstreamError, ModelItem
+ )
+ from app.utils.helpers import debug_log, call_upstream_api, transform_thinking_content
+ from app.utils.sse_parser import SSEParser
+ from app.utils.tools import extract_tool_invocations, remove_tool_json_content
+
+
+ def create_openai_response_chunk(
+     model: str,
+     delta: Optional[Delta] = None,
+     finish_reason: Optional[str] = None
+ ) -> OpenAIResponse:
+     """Create an OpenAI response chunk for streaming"""
+     return OpenAIResponse(
+         id=f"chatcmpl-{int(time.time())}",
+         object="chat.completion.chunk",
+         created=int(time.time()),
+         model=model,
+         choices=[Choice(
+             index=0,
+             delta=delta or Delta(),
+             finish_reason=finish_reason
+         )]
+     )
+
+
+ def handle_upstream_error(error: UpstreamError) -> Generator[str, None, None]:
+     """Handle an upstream error response"""
+     debug_log(f"Upstream error: code={error.code}, detail={error.detail}")
+
+     # Send the end chunk
+     end_chunk = create_openai_response_chunk(
+         model=settings.PRIMARY_MODEL,
+         finish_reason="stop"
+     )
+     yield f"data: {end_chunk.model_dump_json()}\n\n"
+     yield "data: [DONE]\n\n"
+
+
+ class ResponseHandler:
+     """Base class for response handling"""
+
+     def __init__(self, upstream_req: UpstreamRequest, chat_id: str, auth_token: str):
+         self.upstream_req = upstream_req
+         self.chat_id = chat_id
+         self.auth_token = auth_token
+
+     def _call_upstream(self) -> requests.Response:
+         """Call the upstream API with error handling"""
+         try:
+             return call_upstream_api(self.upstream_req, self.chat_id, self.auth_token)
+         except Exception as e:
+             debug_log(f"Failed to call upstream: {e}")
+             raise
+
+     def _handle_upstream_error(self, response: requests.Response) -> None:
+         """Handle an upstream error response"""
+         debug_log(f"Upstream returned error status: {response.status_code}")
+         if settings.DEBUG_LOGGING:
+             debug_log(f"Upstream error response: {response.text}")
+
+
+ class StreamResponseHandler(ResponseHandler):
+     """Handler for streaming responses"""
+
+     def __init__(self, upstream_req: UpstreamRequest, chat_id: str, auth_token: str, has_tools: bool = False):
+         super().__init__(upstream_req, chat_id, auth_token)
+         self.has_tools = has_tools
+         self.buffered_content = ""
+         self.tool_calls = None
+         # Tracked on the instance so the flag survives across chunks
+         # (a plain parameter would be reset on every _process_content call)
+         self.sent_initial_answer = False
+
+     def handle(self) -> Generator[str, None, None]:
+         """Handle the streaming response"""
+         debug_log(f"Start handling streaming response (chat_id={self.chat_id})")
+
+         try:
+             response = self._call_upstream()
+         except Exception:
+             yield "data: {\"error\": \"Failed to call upstream\"}\n\n"
+             return
+
+         if response.status_code != 200:
+             self._handle_upstream_error(response)
+             yield "data: {\"error\": \"Upstream error\"}\n\n"
+             return
+
+         # Send the initial role chunk
+         first_chunk = create_openai_response_chunk(
+             model=settings.PRIMARY_MODEL,
+             delta=Delta(role="assistant")
+         )
+         yield f"data: {first_chunk.model_dump_json()}\n\n"
+
+         # Process the stream
+         debug_log("Start reading upstream SSE stream")
+
+         with SSEParser(response, debug_mode=settings.DEBUG_LOGGING) as parser:
+             for event in parser.iter_json_data(UpstreamData):
+                 upstream_data = event['data']
+
+                 # Check for errors
+                 if self._has_error(upstream_data):
+                     error = self._get_error(upstream_data)
+                     yield from handle_upstream_error(error)
+                     break
+
+                 debug_log(f"Parsed - type: {upstream_data.type}, phase: {upstream_data.data.phase}, "
+                           f"content length: {len(upstream_data.data.delta_content)}, done: {upstream_data.data.done}")
+
+                 # Process the content
+                 yield from self._process_content(upstream_data)
+
+                 # Check whether the stream is done
+                 if upstream_data.data.done or upstream_data.data.phase == "done":
+                     debug_log("Detected end-of-stream signal")
+                     yield from self._send_end_chunk()
+                     break
+
+     def _has_error(self, upstream_data: UpstreamData) -> bool:
+         """Check whether the upstream data contains an error"""
+         return bool(
+             upstream_data.error or
+             upstream_data.data.error or
+             (upstream_data.data.inner and upstream_data.data.inner.error)
+         )
+
+     def _get_error(self, upstream_data: UpstreamData) -> UpstreamError:
+         """Get the error from upstream data"""
+         return (
+             upstream_data.error or
+             upstream_data.data.error or
+             (upstream_data.data.inner.error if upstream_data.data.inner else None)
+         )
+
+     def _process_content(
+         self,
+         upstream_data: UpstreamData
+     ) -> Generator[str, None, None]:
+         """Process content from upstream data"""
+         content = upstream_data.data.delta_content or upstream_data.data.edit_content
+
+         if not content:
+             return
+
+         # Transform thinking content
+         if upstream_data.data.phase == "thinking":
+             content = transform_thinking_content(content)
+
+         # Buffer content when tools are enabled
+         if self.has_tools:
+             self.buffered_content += content
+         else:
+             # Handle the initial answer content
+             if (not self.sent_initial_answer and
+                     upstream_data.data.edit_content and
+                     upstream_data.data.phase == "answer"):
+
+                 content = self._extract_edit_content(upstream_data.data.edit_content)
+                 if content:
+                     debug_log(f"Sending regular content: {content}")
+                     chunk = create_openai_response_chunk(
+                         model=settings.PRIMARY_MODEL,
+                         delta=Delta(content=content)
+                     )
+                     yield f"data: {chunk.model_dump_json()}\n\n"
+                     self.sent_initial_answer = True
+
+             # Handle delta content
+             if upstream_data.data.delta_content:
+                 if content:
+                     if upstream_data.data.phase == "thinking":
+                         debug_log(f"Sending thinking content: {content}")
+                         chunk = create_openai_response_chunk(
+                             model=settings.PRIMARY_MODEL,
+                             delta=Delta(reasoning_content=content)
+                         )
+                     else:
+                         debug_log(f"Sending regular content: {content}")
+                         chunk = create_openai_response_chunk(
+                             model=settings.PRIMARY_MODEL,
+                             delta=Delta(content=content)
+                         )
+                     yield f"data: {chunk.model_dump_json()}\n\n"
+
+     def _extract_edit_content(self, edit_content: str) -> str:
+         """Extract content from the edit_content field"""
+         parts = edit_content.split("</details>")
+         return parts[1] if len(parts) > 1 else ""
+
+     def _send_end_chunk(self) -> Generator[str, None, None]:
+         """Send the end chunk and DONE signal"""
+         if self.has_tools:
+             # Try to extract tool calls from the buffered content
+             self.tool_calls = extract_tool_invocations(self.buffered_content)
+
+             if self.tool_calls:
+                 # Send tool calls
+                 tool_calls_list = []
+                 for i, tc in enumerate(self.tool_calls):
+                     tool_calls_list.append({
+                         "index": i,
+                         "id": tc.get("id"),
+                         "type": tc.get("type", "function"),
+                         "function": tc.get("function", {}),
+                     })
+
+                 out_chunk = create_openai_response_chunk(
+                     model=settings.PRIMARY_MODEL,
+                     delta=Delta(tool_calls=tool_calls_list)
+                 )
+                 yield f"data: {out_chunk.model_dump_json()}\n\n"
+                 finish_reason = "tool_calls"
+             else:
+                 # Send regular content
+                 trimmed_content = remove_tool_json_content(self.buffered_content)
+                 if trimmed_content:
+                     content_chunk = create_openai_response_chunk(
+                         model=settings.PRIMARY_MODEL,
+                         delta=Delta(content=trimmed_content)
+                     )
+                     yield f"data: {content_chunk.model_dump_json()}\n\n"
+                 finish_reason = "stop"
+         else:
+             finish_reason = "stop"
+
+         # Send the final chunk
+         end_chunk = create_openai_response_chunk(
+             model=settings.PRIMARY_MODEL,
+             finish_reason=finish_reason
+         )
+         yield f"data: {end_chunk.model_dump_json()}\n\n"
+         yield "data: [DONE]\n\n"
+         debug_log("Streaming response complete")
+
+
+ class NonStreamResponseHandler(ResponseHandler):
+     """Handler for non-streaming responses"""
+
+     def __init__(self, upstream_req: UpstreamRequest, chat_id: str, auth_token: str, has_tools: bool = False):
256
+ super().__init__(upstream_req, chat_id, auth_token)
257
+ self.has_tools = has_tools
258
+
259
+ def handle(self) -> JSONResponse:
260
+ """Handle non-streaming response"""
261
+ debug_log(f"开始处理非流式响应 (chat_id={self.chat_id})")
262
+
263
+ try:
264
+ response = self._call_upstream()
265
+ except Exception as e:
266
+ debug_log(f"调用上游失败: {e}")
267
+ raise HTTPException(status_code=502, detail="Failed to call upstream")
268
+
269
+ if response.status_code != 200:
270
+ self._handle_upstream_error(response)
271
+ raise HTTPException(status_code=502, detail="Upstream error")
272
+
273
+ # Collect full response
274
+ full_content = []
275
+ debug_log("开始收集完整响应内容")
276
+
277
+ with SSEParser(response, debug_mode=settings.DEBUG_LOGGING) as parser:
278
+ for event in parser.iter_json_data(UpstreamData):
279
+ upstream_data = event['data']
280
+
281
+ if upstream_data.data.delta_content:
282
+ content = upstream_data.data.delta_content
283
+
284
+ if upstream_data.data.phase == "thinking":
285
+ content = transform_thinking_content(content)
286
+
287
+ if content:
288
+ full_content.append(content)
289
+
290
+ if upstream_data.data.done or upstream_data.data.phase == "done":
291
+ debug_log("检测到完成信号,停止收集")
292
+ break
293
+
294
+ final_content = "".join(full_content)
295
+ debug_log(f"内容收集完成,最终长度: {len(final_content)}")
296
+
297
+ # Handle tool calls for non-streaming
298
+ tool_calls = None
299
+ finish_reason = "stop"
300
+ message_content = final_content
301
+
302
+ if self.has_tools:
303
+ tool_calls = extract_tool_invocations(final_content)
304
+ if tool_calls:
305
+ # Content must be null when tool_calls are present (OpenAI spec)
306
+ message_content = None
307
+ finish_reason = "tool_calls"
308
+ else:
309
+ # Remove tool JSON from content
310
+ message_content = remove_tool_json_content(final_content)
311
+
312
+ # Build response
313
+ response_data = OpenAIResponse(
314
+ id=f"chatcmpl-{int(time.time())}",
315
+ object="chat.completion",
316
+ created=int(time.time()),
317
+ model=settings.PRIMARY_MODEL,
318
+ choices=[Choice(
319
+ index=0,
320
+ message=Message(
321
+ role="assistant",
322
+ content=message_content,
323
+ tool_calls=tool_calls
324
+ ),
325
+ finish_reason=finish_reason
326
+ )],
327
+ usage=Usage()
328
+ )
329
+
330
+ debug_log("非流式响应发送完成")
331
+ return JSONResponse(content=response_data.model_dump(exclude_none=True))
app/models/__init__.py ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ """
2
+ Models module initialization
3
+ """
4
+
5
+ from app.models import schemas
6
+
7
+ __all__ = ["schemas"]
app/models/schemas.py ADDED
@@ -0,0 +1,145 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Application data models
3
+ """
4
+
5
+ from typing import Dict, List, Optional, Any, Union, Literal
6
+ from pydantic import BaseModel
7
+
8
+
9
+ class Message(BaseModel):
10
+ """Chat message model"""
11
+ role: str
12
+ content: Optional[str] = None
13
+ reasoning_content: Optional[str] = None
14
+ tool_calls: Optional[List[Dict[str, Any]]] = None
15
+
16
+
17
+ class OpenAIRequest(BaseModel):
18
+ """OpenAI-compatible request model"""
19
+ model: str
20
+ messages: List[Message]
21
+ stream: Optional[bool] = False
22
+ temperature: Optional[float] = None
23
+ max_tokens: Optional[int] = None
24
+ tools: Optional[List[Dict[str, Any]]] = None
25
+ tool_choice: Optional[Any] = None
26
+
27
+
28
+ class ModelItem(BaseModel):
29
+ """Model information item"""
30
+ id: str
31
+ name: str
32
+ owned_by: str
33
+
34
+
35
+ class UpstreamRequest(BaseModel):
36
+ """Upstream service request model"""
37
+ stream: bool
38
+ model: str
39
+ messages: List[Message]
40
+ params: Dict[str, Any] = {}
41
+ features: Dict[str, Any] = {}
42
+ background_tasks: Optional[Dict[str, bool]] = None
43
+ chat_id: Optional[str] = None
44
+ id: Optional[str] = None
45
+ mcp_servers: Optional[List[str]] = None
46
+ model_item: Optional[ModelItem] = None
47
+ tool_servers: Optional[List[str]] = None
48
+ variables: Optional[Dict[str, str]] = None
49
+ model_config = {'protected_namespaces': ()}
50
+
51
+
52
+ class Delta(BaseModel):
53
+ """Stream delta model"""
54
+ role: Optional[str] = None
55
+ content: Optional[str] = None
56
+ reasoning_content: Optional[str] = None
57
+ tool_calls: Optional[List[Dict[str, Any]]] = None
58
+
59
+
60
+ class Choice(BaseModel):
61
+ """Response choice model"""
62
+ index: int
63
+ message: Optional[Message] = None
64
+ delta: Optional[Delta] = None
65
+ finish_reason: Optional[str] = None
66
+
67
+
68
+ class Usage(BaseModel):
69
+ """Token usage statistics"""
70
+ prompt_tokens: int = 0
71
+ completion_tokens: int = 0
72
+ total_tokens: int = 0
73
+
74
+
75
+ class OpenAIResponse(BaseModel):
76
+ """OpenAI-compatible response model"""
77
+ id: str
78
+ object: str
79
+ created: int
80
+ model: str
81
+ choices: List[Choice]
82
+ usage: Optional[Usage] = None
83
+
84
+
85
+ class UpstreamError(BaseModel):
86
+ """Upstream error model"""
87
+ detail: str
88
+ code: int
89
+
90
+
91
+ class UpstreamDataInner(BaseModel):
92
+ """Inner upstream data model"""
93
+ error: Optional[UpstreamError] = None
94
+
95
+
96
+ class UpstreamDataData(BaseModel):
97
+ """Upstream data content model"""
98
+ delta_content: str = ""
99
+ edit_content: str = ""
100
+ phase: str = ""
101
+ done: bool = False
102
+ usage: Optional[Usage] = None
103
+ error: Optional[UpstreamError] = None
104
+ inner: Optional[UpstreamDataInner] = None
105
+
106
+
107
+ class UpstreamData(BaseModel):
108
+ """Upstream data model"""
109
+ type: str
110
+ data: UpstreamDataData
111
+ error: Optional[UpstreamError] = None
112
+
113
+
114
+ class Model(BaseModel):
115
+ """Model information for listing"""
116
+ id: str
117
+ object: str = "model"
118
+ created: int
119
+ owned_by: str
120
+
121
+
122
+ class ModelsResponse(BaseModel):
123
+ """Models list response model"""
124
+ object: str = "list"
125
+ data: List[Model]
126
+
127
+
128
+ # Anthropic API Models
129
+ class ContentBlock(BaseModel):
130
+ type: str
131
+ text: str
132
+
133
+
134
+ class AnthropicMessage(BaseModel):
135
+ role: Literal["user", "assistant"]
136
+ content: Union[str, List[ContentBlock]]
137
+
138
+
139
+ class AnthropicRequest(BaseModel):
140
+ model: str
141
+ messages: List[AnthropicMessage]
142
+ system: Optional[Union[str, List[ContentBlock]]] = None
143
+ max_tokens: int = 1024
144
+ stream: bool = False
145
+ temperature: Optional[float] = None
app/utils/__init__.py ADDED
@@ -0,0 +1,7 @@
 
 
 
 
 
 
 
 
1
+ """
2
+ Utils module initialization
3
+ """
4
+
5
+ from app.utils import helpers, sse_parser, tools
6
+
7
+ __all__ = ["helpers", "sse_parser", "tools"]
app/utils/helpers.py ADDED
@@ -0,0 +1,127 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Utility functions for the application
3
+ """
4
+
5
+ import json
6
+ import re
7
+ import time
8
+ from typing import Dict, List, Optional, Any, Tuple, Generator
9
+ import requests
10
+
11
+ from app.core.config import settings
12
+
13
+
14
+ def debug_log(message: str, *args) -> None:
15
+ """Log debug message if debug mode is enabled"""
16
+ if settings.DEBUG_LOGGING:
17
+ if args:
18
+ print(f"[DEBUG] {message % args}")
19
+ else:
20
+ print(f"[DEBUG] {message}")
21
+
22
+
23
+ def generate_request_ids() -> Tuple[str, str]:
24
+ """Generate unique IDs for chat and message"""
25
+ timestamp = int(time.time())
26
+ chat_id = f"{timestamp * 1000}-{timestamp}"
27
+ msg_id = str(timestamp * 1000000)
28
+ return chat_id, msg_id
29
+
30
+
31
+ def get_browser_headers(referer_chat_id: str = "") -> Dict[str, str]:
32
+ """Get browser headers for API requests"""
33
+ headers = settings.CLIENT_HEADERS.copy()
34
+
35
+ if referer_chat_id:
36
+ headers["Referer"] = f"{settings.CLIENT_HEADERS['Origin']}/c/{referer_chat_id}"
37
+
38
+ return headers
39
+
40
+
41
+ def get_anonymous_token() -> str:
42
+ """Get anonymous token for authentication"""
43
+ headers = get_browser_headers()
44
+ headers.update({
45
+ "Accept": "*/*",
46
+ "Accept-Language": "zh-CN,zh;q=0.9",
47
+ "Referer": f"{settings.CLIENT_HEADERS['Origin']}/",
48
+ })
49
+
50
+ try:
51
+ response = requests.get(
52
+ f"{settings.CLIENT_HEADERS['Origin']}/api/v1/auths/",
53
+ headers=headers,
54
+ timeout=10.0
55
+ )
56
+
57
+ if response.status_code != 200:
58
+ raise Exception(f"anon token status={response.status_code}")
59
+
60
+ data = response.json()
61
+ token = data.get("token")
62
+ if not token:
63
+ raise Exception("anon token empty")
64
+
65
+ return token
66
+ except Exception as e:
67
+ debug_log(f"获取匿名token失败: {e}")
68
+ raise
69
+
70
+
71
+ def get_auth_token() -> str:
72
+ """Get authentication token (anonymous or fixed)"""
73
+ if settings.ANONYMOUS_MODE:
74
+ try:
75
+ token = get_anonymous_token()
76
+ debug_log(f"匿名token获取成功: {token[:10]}...")
77
+ return token
78
+ except Exception as e:
79
+ debug_log(f"匿名token获取失败,回退固定token: {e}")
80
+
81
+ return settings.BACKUP_TOKEN
82
+
83
+
84
+ def transform_thinking_content(content: str) -> str:
85
+ """Transform thinking content according to configuration"""
86
+ # Remove summary tags
87
+ content = re.sub(r'(?s)<summary>.*?</summary>', '', content)
88
+ # Clean up remaining tags
89
+ content = content.replace("</thinking>", "").replace("<Full>", "").replace("</Full>", "")
90
+ content = content.strip()
91
+
92
+ if settings.THINKING_PROCESSING == "think":
93
+ content = re.sub(r'<details[^>]*>', '<span>', content)
94
+ content = content.replace("</details>", "</span>")
95
+ elif settings.THINKING_PROCESSING == "strip":
96
+ content = re.sub(r'<details[^>]*>', '', content)
97
+ content = content.replace("</details>", "")
98
+
99
+ # Remove line prefixes
100
+ content = content.lstrip("> ")
101
+ content = content.replace("\n> ", "\n")
102
+
103
+ return content.strip()
104
+
105
+
106
+ def call_upstream_api(
107
+ upstream_req: Any,
108
+ chat_id: str,
109
+ auth_token: str
110
+ ) -> requests.Response:
111
+ """Call upstream API with proper headers"""
112
+ headers = get_browser_headers(chat_id)
113
+ headers["Authorization"] = f"Bearer {auth_token}"
114
+
115
+ debug_log(f"调用上游API: {settings.API_ENDPOINT}")
116
+ debug_log(f"上游请求体: {upstream_req.model_dump_json()}")
117
+
118
+ response = requests.post(
119
+ settings.API_ENDPOINT,
120
+ json=upstream_req.model_dump(exclude_none=True),
121
+ headers=headers,
122
+ timeout=60.0,
123
+ stream=True
124
+ )
125
+
126
+ debug_log(f"上游响应状态: {response.status_code}")
127
+ return response
app/utils/sse_parser.py ADDED
@@ -0,0 +1,143 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ SSE (Server-Sent Events) parser for streaming responses
3
+ """
4
+
5
+ import json
6
+ from typing import Dict, Any, Generator, Optional, Type
7
+ import requests
8
+
9
+ from app.core.config import settings
10
+ from app.models.schemas import UpstreamData
11
+
12
+
13
+ class SSEParser:
14
+ """Server-Sent Events parser for streaming responses"""
15
+
16
+ def __init__(self, response: requests.Response, debug_mode: bool = False):
17
+ """Initialize SSE parser
18
+
19
+ Args:
20
+ response: requests.Response object with stream=True
21
+ debug_mode: Enable debug logging
22
+ """
23
+ self.response = response
24
+ self.debug_mode = debug_mode
25
+ self.buffer = ""
26
+ self.line_count = 0
27
+
28
+ def debug_log(self, format_str: str, *args) -> None:
29
+ """Log debug message if debug mode is enabled"""
30
+ if self.debug_mode:
31
+ if args:
32
+ print(f"[SSE_PARSER] {format_str % args}")
33
+ else:
34
+ print(f"[SSE_PARSER] {format_str}")
35
+
36
+ def iter_events(self) -> Generator[Dict[str, Any], None, None]:
37
+ """Iterate over SSE events
38
+
39
+ Yields:
40
+ dict: Parsed SSE event data
41
+ """
42
+ self.debug_log("开始解析 SSE 流")
43
+
44
+ for line in self.response.iter_lines():
45
+ self.line_count += 1
46
+
47
+ # Skip empty lines
48
+ if not line:
49
+ continue
50
+
51
+ # Decode bytes
52
+ if isinstance(line, bytes):
53
+ try:
54
+ line = line.decode('utf-8')
55
+ except UnicodeDecodeError:
56
+ self.debug_log(f"第{self.line_count}行解码失败,跳过")
57
+ continue
58
+
59
+ # Skip comment lines
60
+ if line.startswith(':'):
61
+ continue
62
+
63
+ # Parse field-value pairs
64
+ if ':' in line:
65
+ field, value = line.split(':', 1)
66
+ field = field.strip()
67
+ value = value.lstrip()
68
+
69
+ if field == 'data':
70
+ self.debug_log(f"收到数据 (第{self.line_count}行): {value}")
71
+
72
+ # Try to parse JSON
73
+ try:
74
+ data = json.loads(value)
75
+ yield {
76
+ 'type': 'data',
77
+ 'data': data,
78
+ 'raw': value
79
+ }
80
+ except json.JSONDecodeError:
81
+ yield {
82
+ 'type': 'data',
83
+ 'data': value,
84
+ 'raw': value,
85
+ 'is_json': False
86
+ }
87
+
88
+ elif field == 'event':
89
+ yield {'type': 'event', 'event': value}
90
+
91
+ elif field == 'id':
92
+ yield {'type': 'id', 'id': value}
93
+
94
+ elif field == 'retry':
95
+ try:
96
+ retry = int(value)
97
+ yield {'type': 'retry', 'retry': retry}
98
+ except ValueError:
99
+ self.debug_log(f"无效的 retry 值: {value}")
100
+
101
+ def iter_data_only(self) -> Generator[Dict[str, Any], None, None]:
102
+ """Iterate only over data events"""
103
+ for event in self.iter_events():
104
+ if event['type'] == 'data':
105
+ yield event
106
+
107
+ def iter_json_data(self, model_class: Optional[Type] = None) -> Generator[Dict[str, Any], None, None]:
108
+ """Iterate only over JSON data events with optional validation
109
+
110
+ Args:
111
+ model_class: Optional Pydantic model class for validation
112
+
113
+ Yields:
114
+ dict: JSON data events
115
+ """
116
+ for event in self.iter_events():
117
+ if event['type'] == 'data' and event.get('is_json', True):
118
+ try:
119
+ if model_class:
120
+ data = model_class.model_validate_json(event['raw'])
121
+ yield {
122
+ 'type': 'data',
123
+ 'data': data,
124
+ 'raw': event['raw']
125
+ }
126
+ else:
127
+ yield event
128
+ except Exception as e:
129
+ self.debug_log(f"数据验证失败: {e}")
130
+ continue
131
+
132
+ def close(self) -> None:
133
+ """Close the response connection"""
134
+ if hasattr(self.response, 'close'):
135
+ self.response.close()
136
+
137
+ def __enter__(self):
138
+ """Context manager entry"""
139
+ return self
140
+
141
+ def __exit__(self, exc_type, exc_val, exc_tb) -> None:
142
+ """Context manager exit"""
143
+ self.close()
app/utils/tools.py ADDED
@@ -0,0 +1,225 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Tool processing utilities
3
+ """
4
+
5
+ import json
6
+ import re
7
+ import time
8
+ from typing import Dict, List, Optional, Any
9
+
10
+ from app.core.config import settings
11
+
12
+
13
+ def generate_tool_prompt(tools: List[Dict[str, Any]]) -> str:
14
+ """Generate tool injection prompt with enhanced formatting"""
15
+ if not tools:
16
+ return ""
17
+
18
+ tool_definitions = []
19
+ for tool in tools:
20
+ if tool.get("type") != "function":
21
+ continue
22
+
23
+ function_spec = tool.get("function", {}) or {}
24
+ function_name = function_spec.get("name", "unknown")
25
+ function_description = function_spec.get("description", "")
26
+ parameters = function_spec.get("parameters", {}) or {}
27
+
28
+ # Create structured tool definition
29
+ tool_info = [f"## {function_name}", f"**Purpose**: {function_description}"]
30
+
31
+ # Add parameter details
32
+ parameter_properties = parameters.get("properties", {}) or {}
33
+ required_parameters = set(parameters.get("required", []) or [])
34
+
35
+ if parameter_properties:
36
+ tool_info.append("**Parameters**:")
37
+ for param_name, param_details in parameter_properties.items():
38
+ param_type = (param_details or {}).get("type", "any")
39
+ param_desc = (param_details or {}).get("description", "")
40
+ requirement_flag = "**Required**" if param_name in required_parameters else "*Optional*"
41
+ tool_info.append(f"- `{param_name}` ({param_type}) - {requirement_flag}: {param_desc}")
42
+
43
+ tool_definitions.append("\n".join(tool_info))
44
+
45
+ if not tool_definitions:
46
+ return ""
47
+
48
+ # Build comprehensive tool prompt
49
+ prompt_template = (
50
+ "\n\n# AVAILABLE FUNCTIONS\n" +
51
+ "\n\n---\n".join(tool_definitions) +
52
+ "\n\n# USAGE INSTRUCTIONS\n"
53
+ "When you need to execute a function, respond ONLY with a JSON object containing tool_calls:\n"
54
+ "```json\n"
55
+ "{\n"
56
+ ' "tool_calls": [\n'
57
+ " {\n"
58
+ ' "id": "call_" + unique_id,\n'
59
+ ' "type": "function",\n'
60
+ ' "function": {\n'
61
+ ' "name": "function_name",\n'
62
+ ' "arguments": {\n'
63
+ ' "param1": "value1"\n'
64
+ ' }\n'
65
+ " }\n"
66
+ " }\n"
67
+ " ]\n"
68
+ "}\n"
69
+ "```\n"
70
+ "Important: No explanatory text before or after the JSON.\n"
71
+ )
72
+
73
+ return prompt_template
74
+
75
+
76
+ def process_messages_with_tools(
77
+ messages: List[Dict[str, Any]],
78
+ tools: Optional[List[Dict[str, Any]]] = None,
79
+ tool_choice: Optional[Any] = None
80
+ ) -> List[Dict[str, Any]]:
81
+ """Process messages and inject tool prompts"""
82
+ processed: List[Dict[str, Any]] = []
83
+
84
+ if tools and settings.TOOL_SUPPORT and (tool_choice != "none"):
85
+ tools_prompt = generate_tool_prompt(tools)
86
+ has_system = any(m.get("role") == "system" for m in messages)
87
+
88
+ if has_system:
89
+ for m in messages:
90
+ if m.get("role") == "system":
91
+ mm = dict(m)
92
+ content = mm.get("content", "")
93
+ if content is None:
94
+ content = ""
95
+ mm["content"] = content + tools_prompt
96
+ processed.append(mm)
97
+ else:
98
+ processed.append(m)
99
+ else:
100
+ processed = [{"role": "system", "content": "你是一个有用的助手。" + tools_prompt}] + messages
101
+
102
+ # Add tool choice hints
103
+ if tool_choice in ("required", "auto"):
104
+ if processed and processed[-1].get("role") == "user":
105
+ last = dict(processed[-1])
106
+ content = last.get("content", "")
107
+ if content is None:
108
+ content = ""
109
+ last["content"] = content + "\n\n请根据需要使用提供的工具函数。"
110
+ processed[-1] = last
111
+ elif isinstance(tool_choice, dict) and tool_choice.get("type") == "function":
112
+ fname = (tool_choice.get("function") or {}).get("name")
113
+ if fname and processed and processed[-1].get("role") == "user":
114
+ last = dict(processed[-1])
115
+ content = last.get("content", "")
116
+ if content is None:
117
+ content = ""
118
+ last["content"] = content + f"\n\n请使用 {fname} 函数来处理这个请求。"
119
+ processed[-1] = last
120
+ else:
121
+ processed = list(messages)
122
+
123
+ # Handle tool/function messages
124
+ final_msgs: List[Dict[str, Any]] = []
125
+ for m in processed:
126
+ role = m.get("role")
127
+ if role in ("tool", "function"):
128
+ tool_name = m.get("name", "unknown")
129
+ tool_content = m.get("content", "")
130
+ if isinstance(tool_content, dict):
131
+ tool_content = json.dumps(tool_content, ensure_ascii=False)
132
+ elif tool_content is None:
133
+ tool_content = ""
134
+
135
+ # 确保内容不为空且不包含 None
136
+ content = f"工具 {tool_name} 返回结果:\n```json\n{tool_content}\n```"
137
+ if not content.strip():
138
+ content = f"工具 {tool_name} 执行完成"
139
+
140
+ final_msgs.append({
141
+ "role": "assistant",
142
+ "content": content,
143
+ })
144
+ else:
145
+ final_msgs.append(m)
146
+
147
+ return final_msgs
148
+
149
+
150
+ # Tool Extraction Patterns
151
+ TOOL_CALL_FENCE_PATTERN = re.compile(r"```json\s*(\{.*?\})\s*```", re.DOTALL)
152
+ TOOL_CALL_INLINE_PATTERN = re.compile(r"(\{[^{}]{0,10000}\"tool_calls\".*?\})", re.DOTALL)
153
+ FUNCTION_CALL_PATTERN = re.compile(r"调用函数\s*[::]\s*([\w\-\.]+)\s*(?:参数|arguments)[::]\s*(\{.*?\})", re.DOTALL)
154
+
155
+
156
+ def extract_tool_invocations(text: str) -> Optional[List[Dict[str, Any]]]:
157
+ """Extract tool invocations from response text"""
158
+ if not text:
159
+ return None
160
+
161
+ # Limit scan size for performance
162
+ scannable_text = text[:settings.SCAN_LIMIT]
163
+
164
+ # Attempt 1: Extract from JSON code blocks
165
+ json_blocks = TOOL_CALL_FENCE_PATTERN.findall(scannable_text)
166
+ for json_block in json_blocks:
167
+ try:
168
+ parsed_data = json.loads(json_block)
169
+ tool_calls = parsed_data.get("tool_calls")
170
+ if tool_calls and isinstance(tool_calls, list):
171
+ return tool_calls
172
+ except (json.JSONDecodeError, AttributeError):
173
+ continue
174
+
175
+ # Attempt 2: Extract inline JSON objects
176
+ inline_match = TOOL_CALL_INLINE_PATTERN.search(scannable_text)
177
+ if inline_match:
178
+ try:
179
+ inline_json = inline_match.group(1)
180
+ parsed_data = json.loads(inline_json)
181
+ tool_calls = parsed_data.get("tool_calls")
182
+ if tool_calls and isinstance(tool_calls, list):
183
+ return tool_calls
184
+ except (json.JSONDecodeError, AttributeError):
185
+ pass
186
+
187
+ # Attempt 3: Parse natural language function calls
188
+ natural_lang_match = FUNCTION_CALL_PATTERN.search(scannable_text)
189
+ if natural_lang_match:
190
+ function_name = natural_lang_match.group(1).strip()
191
+ arguments_str = natural_lang_match.group(2).strip()
192
+ try:
193
+ # Validate JSON format
194
+ json.loads(arguments_str)
195
+ return [{
196
+ "id": f"invoke_{int(time.time() * 1000000)}",
197
+ "type": "function",
198
+ "function": {
199
+ "name": function_name,
200
+ "arguments": arguments_str
201
+ }
202
+ }]
203
+ except json.JSONDecodeError:
204
+ return None
205
+
206
+ return None
207
+
208
+
209
+ def remove_tool_json_content(text: str) -> str:
210
+ """Remove tool JSON content from response text"""
211
+ def remove_tool_call_block(match: re.Match) -> str:
212
+ json_content = match.group(1)
213
+ try:
214
+ parsed_data = json.loads(json_content)
215
+ if "tool_calls" in parsed_data:
216
+ return ""
217
+ except (json.JSONDecodeError, AttributeError):
218
+ pass
219
+ return match.group(0)
220
+
221
+ # Remove fenced tool JSON blocks
222
+ cleaned_text = TOOL_CALL_FENCE_PATTERN.sub(remove_tool_call_block, text)
223
+ # Remove inline tool JSON
224
+ cleaned_text = TOOL_CALL_INLINE_PATTERN.sub("", cleaned_text)
225
+ return cleaned_text.strip()
main.py CHANGED
@@ -1,1120 +1,33 @@
1
- # -*- coding: utf-8 -*-
2
-
3
  """
4
- OpenAI Compatible API Server for Z.AI
5
- =====================================
6
-
7
- This module provides an OpenAI-compatible API server that forwards requests
8
- to the Z.AI chat service with proper authentication and response formatting.
9
  """
10
 
11
- import json
12
- import os
13
- import re
14
- import time
15
- import uuid
16
- from datetime import datetime
17
- from typing import Dict, List, Optional, Any, Union, Generator, Tuple, Literal
18
-
19
- import requests
20
- from fastapi import FastAPI, Request, Response, HTTPException, Header
21
- from fastapi.responses import StreamingResponse, JSONResponse
22
- from pydantic import BaseModel, Field
23
-
24
-
25
- # =============================================================================
26
- # Configuration Constants
27
- # =============================================================================
28
-
29
- class ServerConfig:
30
- """Centralized server configuration"""
31
-
32
- # API Configuration
33
- API_ENDPOINT: str = os.getenv("API_ENDPOINT", "https://chat.z.ai/api/chat/completions")
34
- AUTH_TOKEN: str = os.getenv("AUTH_TOKEN", "sk-your-api-key")
35
- ANTHROPIC_API_KEY: str = os.getenv("ANTHROPIC_API_KEY", AUTH_TOKEN)
36
- BACKUP_TOKEN: str = os.getenv("BACKUP_TOKEN", "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6IjMxNmJjYjQ4LWZmMmYtNGExNS04NTNkLWYyYTI5YjY3ZmYwZiIsImVtYWlsIjoiR3Vlc3QtMTc1NTg0ODU4ODc4OEBndWVzdC5jb20ifQ.PktllDySS3trlyuFpTeIZf-7hl8Qu1qYF3BxjgIul0BrNux2nX9hVzIjthLXKMWAf9V0qM8Vm_iyDqkjPGsaiQ")
37
-
38
- # Model Configuration
39
- PRIMARY_MODEL: str = os.getenv("PRIMARY_MODEL", "GLM-4.5")
40
- THINKING_MODEL: str = os.getenv("THINKING_MODEL", "GLM-4.5-Thinking")
41
- SEARCH_MODEL: str = os.getenv("SEARCH_MODEL", "GLM-4.5-Search")
42
-
43
- # Server Configuration
44
- LISTEN_PORT: int = int(os.getenv("LISTEN_PORT", "8080"))
45
- DEBUG_LOGGING: bool = os.getenv("DEBUG_LOGGING", "true").lower() == "true"
46
-
47
- # Feature Configuration
48
- THINKING_PROCESSING: str = os.getenv("THINKING_PROCESSING", "think") # strip: 去除<details>标签;think: 转为</think>标签;raw: 保留原样
49
- ANONYMOUS_MODE: bool = os.getenv("ANONYMOUS_MODE", "true").lower() == "true"
50
- TOOL_SUPPORT: bool = os.getenv("TOOL_SUPPORT", "true").lower() == "true"
51
- SCAN_LIMIT: int = int(os.getenv("SCAN_LIMIT", "200000"))
52
-
53
- # Browser Headers
54
- CLIENT_HEADERS: Dict[str, str] = {
55
- "Content-Type": "application/json",
56
- "Accept": "application/json, text/event-stream",
57
- "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/139.0.0.0 Safari/537.36 Edg/139.0.0.0",
58
- "Accept-Language": "zh-CN",
59
- "sec-ch-ua": '"Not;A=Brand";v="99", "Microsoft Edge";v="139", "Chromium";v="139"',
60
- "sec-ch-ua-mobile": "?0",
61
- "sec-ch-ua-platform": '"Windows"',
62
- "X-FE-Version": "prod-fe-1.0.70",
63
- "Origin": "https://chat.z.ai",
64
- }
65
-
66
-
67
- # =============================================================================
68
- # Data Models
69
- # =============================================================================
70
-
71
- class Message(BaseModel):
72
- """Chat message model"""
73
- role: str
74
- content: Optional[str] = None
75
- reasoning_content: Optional[str] = None
76
- tool_calls: Optional[List[Dict[str, Any]]] = None
77
-
78
-
79
- class OpenAIRequest(BaseModel):
80
- """OpenAI-compatible request model"""
81
- model: str
82
- messages: List[Message]
83
- stream: Optional[bool] = False
84
- temperature: Optional[float] = None
85
- max_tokens: Optional[int] = None
86
- tools: Optional[List[Dict[str, Any]]] = None
87
- tool_choice: Optional[Any] = None
88
-
89
-
90
- class ModelItem(BaseModel):
91
- """Model information item"""
92
- id: str
93
- name: str
94
- owned_by: str
95
-
96
-
97
- class UpstreamRequest(BaseModel):
98
- """Upstream service request model"""
99
- stream: bool
100
- model: str
101
- messages: List[Message]
102
- params: Dict[str, Any] = {}
103
- features: Dict[str, Any] = {}
104
- background_tasks: Optional[Dict[str, bool]] = None
105
- chat_id: Optional[str] = None
106
- id: Optional[str] = None
107
- mcp_servers: Optional[List[str]] = None
108
- model_item: Optional[ModelItem] = None
109
- tool_servers: Optional[List[str]] = None
110
- variables: Optional[Dict[str, str]] = None
111
- model_config = {'protected_namespaces': ()}
112
-
113
-
114
- class Delta(BaseModel):
115
- """Stream delta model"""
116
- role: Optional[str] = None
117
- content: Optional[str] = None
118
- reasoning_content: Optional[str] = None
119
- tool_calls: Optional[List[Dict[str, Any]]] = None
120
-
121
-
122
- class Choice(BaseModel):
123
- """Response choice model"""
124
- index: int
125
- message: Optional[Message] = None
126
- delta: Optional[Delta] = None
127
- finish_reason: Optional[str] = None
128
-
129
-
130
- class Usage(BaseModel):
131
- """Token usage statistics"""
132
- prompt_tokens: int = 0
133
- completion_tokens: int = 0
134
- total_tokens: int = 0
135
-
136
-
137
- class OpenAIResponse(BaseModel):
138
- """OpenAI-compatible response model"""
139
- id: str
140
- object: str
141
- created: int
142
- model: str
143
- choices: List[Choice]
144
- usage: Optional[Usage] = None
145
-
146
-
147
- class UpstreamError(BaseModel):
148
- """Upstream error model"""
149
- detail: str
150
- code: int
151
-
152
-
153
- class UpstreamDataInner(BaseModel):
154
- """Inner upstream data model"""
155
- error: Optional[UpstreamError] = None
156
-
157
-
158
- class UpstreamDataData(BaseModel):
159
- """Upstream data content model"""
160
- delta_content: str = ""
161
- edit_content: str = ""
162
- phase: str = ""
163
- done: bool = False
164
- usage: Optional[Usage] = None
165
- error: Optional[UpstreamError] = None
166
- inner: Optional[UpstreamDataInner] = None
167
-
168
-
169
- class UpstreamData(BaseModel):
170
- """Upstream data model"""
171
- type: str
172
- data: UpstreamDataData
173
- error: Optional[UpstreamError] = None
174
-
175
-
176
- class Model(BaseModel):
177
- """Model information for listing"""
178
- id: str
179
- object: str = "model"
180
- created: int
181
- owned_by: str
182
-
183
-
184
- # ANTHROPIC API 兼容性模型
185
- class ContentBlock(BaseModel):
186
- type: str
187
- text: str
188
-
189
-
190
- class AnthropicMessage(BaseModel):
191
- role: Literal["user", "assistant"]
192
- content: Union[str, List[ContentBlock]]
193
-
194
-
195
- class AnthropicRequest(BaseModel):
196
- model: str
197
- messages: List[AnthropicMessage]
198
- system: Optional[Union[str, List[ContentBlock]]] = None
199
- max_tokens: int = 1024
200
- stream: bool = False
201
- temperature: Optional[float] = None
202
-
203
-
204
- class ModelsResponse(BaseModel):
205
- """Models list response model"""
206
- object: str = "list"
207
- data: List[Model]
208
-
209
-
210
- # ANTHROPIC API 兼容性函数
211
- def stream_anthropic_generator(upstream_response: requests.Response, request_id: str, requested_model: str):
212
- """生成 Anthropic 兼容的流式响应事件"""
213
- usage = {"input_tokens": 0, "output_tokens": 0}
214
-
215
- start_event = {
216
- "type": "message_start",
217
- "message": {
218
- "id": request_id,
219
- "type": "message",
220
- "role": "assistant",
221
- "content": [],
222
- "model": requested_model,
223
- "stop_reason": None,
224
- "stop_sequence": None,
225
- "usage": usage
226
- }
227
- }
228
- yield f"event: {start_event['type']}\ndata: {json.dumps(start_event['message'])}\n\n"
229
-
230
- # 发送 content_block_start 事件
231
- content_start_data = {
232
- "type": "content_block_start",
233
- "index": 0,
234
- "content_block": {
235
- "type": "text",
236
- "text": ""
237
- }
238
- }
239
- yield f"event: content_block_start\ndata: {json.dumps(content_start_data)}\n\n"
240
-
241
- # 处理上游响应
242
- for line in upstream_response.iter_lines():
243
- if not line.startswith(b"data:"): continue
244
- data_str = line[5:].strip()
245
- if not data_str: continue
246
- try:
247
- data = json.loads(data_str.decode('utf-8'))
248
- delta_content = data.get("data", {}).get("delta_content", "")
249
- phase = data.get("data", {}).get("phase", "")
250
-
251
- # 处理内容增量
252
- if delta_content:
253
- out_content = transform_thinking_content(delta_content) if phase == "thinking" else delta_content
254
- if out_content:
255
- usage["output_tokens"] += len(out_content) // 4 # 简单估算
256
- delta_data = {
257
- "type": "content_block_delta",
258
- "index": 0,
259
- "delta": {
260
- "type": "text_delta",
261
- "text": out_content
262
- }
263
- }
264
- yield f"event: content_block_delta\ndata: {json.dumps(delta_data)}\n\n"
265
-
266
- # 处理结束
267
- if data.get("data", {}).get("done", False) or phase == "done":
268
- # 发送 content_block_stop
269
- content_stop_data = {
270
- "type": "content_block_stop",
271
- "index": 0
272
- }
273
- yield f"event: content_block_stop\ndata: {json.dumps(content_stop_data)}\n\n"
274
-
275
- # 发送 message_delta
276
- message_delta_data = {
277
- "type": "message_delta",
278
- "delta": {
279
- "stop_reason": "end_turn",
280
- "stop_sequence": None,
281
- "usage": {
282
- "input_tokens": usage["input_tokens"],
283
- "output_tokens": usage["output_tokens"]
284
- }
285
- }
286
- }
287
- yield f"event: message_delta\ndata: {json.dumps(message_delta_data)}\n\n"
288
-
289
- # 发送 message_stop
290
- yield f"event: message_stop\ndata: {json.dumps({'type': 'message_stop'})}\n\n"
291
- break
292
-
293
- except json.JSONDecodeError:
294
- continue
295
-
296
-
297
- def transform_thinking_content(content: str) -> str:
298
- """Transform thinking content according to configuration"""
299
- # Remove summary tags
300
- content = re.sub(r'(?s)<summary>.*?</summary>', '', content)
301
- # Clean up remaining tags
302
- content = content.replace("</thinking>", "").replace("<Full>", "").replace("</Full>", "")
303
- content = content.strip()
304
-
305
- if ServerConfig.THINKING_PROCESSING == "think":
306
- content = re.sub(r'<details[^>]*>', '<think>', content)
307
- content = content.replace("</details>", "</think>")
308
- elif ServerConfig.THINKING_PROCESSING == "strip":
309
- content = re.sub(r'<details[^>]*>', '', content)
310
- content = content.replace("</details>", "")
311
-
312
- # Remove line prefixes
313
- content = content.lstrip("> ")
314
- content = content.replace("\n> ", "\n")
315
-
316
- return content.strip()
317
-
318
-
319
- # =============================================================================
320
- # SSE Parser
321
- # =============================================================================
322
 
323
- class SSEParser:
324
- """Server-Sent Events parser for streaming responses"""
325
-
326
- def __init__(self, response: requests.Response, debug_mode: bool = False):
327
- """Initialize SSE parser
328
-
329
- Args:
330
- response: requests.Response object with stream=True
331
- debug_mode: Enable debug logging
332
- """
333
- self.response = response
334
- self.debug_mode = debug_mode
335
- self.buffer = ""
336
- self.line_count = 0
337
-
338
- def debug_log(self, format_str: str, *args) -> None:
339
- """Log debug message if debug mode is enabled"""
340
- if self.debug_mode:
341
- if args:
342
- print(f"[SSE_PARSER] {format_str % args}")
343
- else:
344
- print(f"[SSE_PARSER] {format_str}")
345
-
346
- def iter_events(self) -> Generator[Dict[str, Any], None, None]:
347
- """Iterate over SSE events
348
-
349
- Yields:
350
- dict: Parsed SSE event data
351
- """
352
- self.debug_log("开始解析 SSE 流")
353
-
354
- for line in self.response.iter_lines():
355
- self.line_count += 1
356
-
357
- # Skip empty lines
358
- if not line:
359
- continue
360
-
361
- # Decode bytes
362
- if isinstance(line, bytes):
363
- try:
364
- line = line.decode('utf-8')
365
- except UnicodeDecodeError:
366
- self.debug_log(f"第{self.line_count}行解码失败,跳过")
367
- continue
368
-
369
- # Skip comment lines
370
- if line.startswith(':'):
371
- continue
372
-
373
- # Parse field-value pairs
374
- if ':' in line:
375
- field, value = line.split(':', 1)
376
- field = field.strip()
377
- value = value.lstrip()
378
-
379
- if field == 'data':
380
- self.debug_log(f"收到数据 (第{self.line_count}行): {value}")
381
-
382
- # Try to parse JSON
383
- try:
384
- data = json.loads(value)
385
- yield {
386
- 'type': 'data',
387
- 'data': data,
388
- 'raw': value
389
- }
390
- except json.JSONDecodeError:
391
- yield {
392
- 'type': 'data',
393
- 'data': value,
394
- 'raw': value,
395
- 'is_json': False
396
- }
397
-
398
- elif field == 'event':
399
- yield {'type': 'event', 'event': value}
400
-
401
- elif field == 'id':
402
- yield {'type': 'id', 'id': value}
403
-
404
- elif field == 'retry':
405
- try:
406
- retry = int(value)
407
- yield {'type': 'retry', 'retry': retry}
408
- except ValueError:
409
- self.debug_log(f"无效的 retry 值: {value}")
410
-
411
- def iter_data_only(self) -> Generator[Dict[str, Any], None, None]:
412
- """Iterate only over data events"""
413
- for event in self.iter_events():
414
- if event['type'] == 'data':
415
- yield event
416
-
417
- def iter_json_data(self, model_class: Optional[type] = None) -> Generator[Dict[str, Any], None, None]:
418
- """Iterate only over JSON data events with optional validation
419
-
420
- Args:
421
- model_class: Optional Pydantic model class for validation
422
-
423
- Yields:
424
- dict: JSON data events
425
- """
426
- for event in self.iter_events():
427
- if event['type'] == 'data' and event.get('is_json', True):
428
- try:
429
- if model_class:
430
- data = model_class.model_validate_json(event['raw'])
431
- yield {
432
- 'type': 'data',
433
- 'data': data,
434
- 'raw': event['raw']
435
- }
436
- else:
437
- yield event
438
- except Exception as e:
439
- self.debug_log(f"数据验证失败: {e}")
440
- continue
441
-
442
- def close(self) -> None:
443
- """Close the response connection"""
444
- if hasattr(self.response, 'close'):
445
- self.response.close()
446
-
447
- def __enter__(self):
448
- """Context manager entry"""
449
- return self
450
-
451
- def __exit__(self, exc_type, exc_val, exc_tb) -> None:
452
- """Context manager exit"""
453
- self.close()
454
-
455
-
456
- # =============================================================================
457
- # Function Call Utilities
458
- # =============================================================================
459
-
460
- def generate_tool_prompt(tools: List[Dict[str, Any]]) -> str:
461
- """Generate tool injection prompt with enhanced formatting"""
462
- if not tools:
463
- return ""
464
-
465
- tool_definitions = []
466
- for tool in tools:
467
- if tool.get("type") != "function":
468
- continue
469
-
470
- function_spec = tool.get("function", {}) or {}
471
- function_name = function_spec.get("name", "unknown")
472
- function_description = function_spec.get("description", "")
473
- parameters = function_spec.get("parameters", {}) or {}
474
-
475
- # Create structured tool definition
476
- tool_info = [f"## {function_name}", f"**Purpose**: {function_description}"]
477
-
478
- # Add parameter details
479
- parameter_properties = parameters.get("properties", {}) or {}
480
- required_parameters = set(parameters.get("required", []) or [])
481
-
482
- if parameter_properties:
483
- tool_info.append("**Parameters**:")
484
- for param_name, param_details in parameter_properties.items():
485
- param_type = (param_details or {}).get("type", "any")
486
- param_desc = (param_details or {}).get("description", "")
487
- requirement_flag = "**Required**" if param_name in required_parameters else "*Optional*"
488
- tool_info.append(f"- `{param_name}` ({param_type}) - {requirement_flag}: {param_desc}")
489
-
490
- tool_definitions.append("\n".join(tool_info))
491
-
492
- if not tool_definitions:
493
- return ""
494
-
495
- # Build comprehensive tool prompt
496
- prompt_template = (
497
- "\n\n# AVAILABLE FUNCTIONS\n" +
498
- "\n\n---\n".join(tool_definitions) +
499
- "\n\n# USAGE INSTRUCTIONS\n"
500
- "When you need to execute a function, respond ONLY with a JSON object containing tool_calls:\n"
501
- "```json\n"
502
- "{\n"
503
- ' "tool_calls": [\n'
504
- " {\n"
505
- ' "id": "call_" + unique_id,\n'
506
- ' "type": "function",\n'
507
- ' "function": {\n'
508
- ' "name": "function_name",\n'
509
- ' "arguments": {\n'
510
- ' "param1": "value1"\n'
511
- ' }\n'
512
- " }\n"
513
- " }\n"
514
- " ]\n"
515
- "}\n"
516
- "```\n"
517
- "Important: No explanatory text before or after the JSON.\n"
518
- )
519
-
520
- return prompt_template
521
-
522
-
523
- def process_messages_with_tools(
524
- messages: List[Dict[str, Any]],
525
- tools: Optional[List[Dict[str, Any]]] = None,
526
- tool_choice: Optional[Any] = None
527
- ) -> List[Dict[str, Any]]:
528
- """Process messages and inject tool prompts"""
529
- processed: List[Dict[str, Any]] = []
530
-
531
- if tools and ServerConfig.TOOL_SUPPORT and (tool_choice != "none"):
532
- tools_prompt = generate_tool_prompt(tools)
533
- has_system = any(m.get("role") == "system" for m in messages)
534
-
535
- if has_system:
536
- for m in messages:
537
- if m.get("role") == "system":
538
- mm = dict(m)
539
- content = mm.get("content", "")
540
- if content is None:
541
- content = ""
542
- mm["content"] = content + tools_prompt
543
- processed.append(mm)
544
- else:
545
- processed.append(m)
546
- else:
547
- processed = [{"role": "system", "content": "你是一个有用的助手。" + tools_prompt}] + messages
548
-
549
- # Add tool choice hints
550
- if tool_choice in ("required", "auto"):
551
- if processed and processed[-1].get("role") == "user":
552
- last = dict(processed[-1])
553
- content = last.get("content", "")
554
- if content is None:
555
- content = ""
556
- last["content"] = content + "\n\n请根据需要使用提供的工具函数。"
557
- processed[-1] = last
558
- elif isinstance(tool_choice, dict) and tool_choice.get("type") == "function":
559
- fname = (tool_choice.get("function") or {}).get("name")
560
- if fname and processed and processed[-1].get("role") == "user":
561
- last = dict(processed[-1])
562
- content = last.get("content", "")
563
- if content is None:
564
- content = ""
565
- last["content"] = content + f"\n\n请使用 {fname} 函数来处���这个请求。"
566
- processed[-1] = last
567
- else:
568
- processed = list(messages)
569
-
570
- # Handle tool/function messages
571
- final_msgs: List[Dict[str, Any]] = []
572
- for m in processed:
573
- role = m.get("role")
574
- if role in ("tool", "function"):
575
- tool_name = m.get("name", "unknown")
576
- tool_content = m.get("content", "")
577
- if isinstance(tool_content, dict):
578
- tool_content = json.dumps(tool_content, ensure_ascii=False)
579
- elif tool_content is None:
580
- tool_content = ""
581
-
582
- # 确保内容不为空且不包含 None
583
- content = f"工具 {tool_name} 返回结果:\n```json\n{tool_content}\n```"
584
- if not content.strip():
585
- content = f"工具 {tool_name} 执行完成"
586
-
587
- final_msgs.append({
588
- "role": "assistant",
589
- "content": content,
590
- })
591
- else:
592
- final_msgs.append(m)
593
-
594
- return final_msgs
595
-
596
-
597
- # Tool Extraction Patterns
598
- TOOL_CALL_FENCE_PATTERN = re.compile(r"```json\s*(\{.*?\})\s*```", re.DOTALL)
599
- TOOL_CALL_INLINE_PATTERN = re.compile(r"(\{[^{}]{0,10000}\"tool_calls\".*?\})", re.DOTALL)
600
- FUNCTION_CALL_PATTERN = re.compile(r"调用函数\s*[::]\s*([\w\-\.]+)\s*(?:参数|arguments)[::]\s*(\{.*?\})", re.DOTALL)
601
-
602
-
603
- def extract_tool_invocations(text: str) -> Optional[List[Dict[str, Any]]]:
604
- """Extract tool invocations from response text"""
605
- if not text:
606
- return None
607
-
608
- # Limit scan size for performance
609
- scannable_text = text[:ServerConfig.SCAN_LIMIT]
610
-
611
- # Attempt 1: Extract from JSON code blocks
612
- json_blocks = TOOL_CALL_FENCE_PATTERN.findall(scannable_text)
613
- for json_block in json_blocks:
614
- try:
615
- parsed_data = json.loads(json_block)
616
- tool_calls = parsed_data.get("tool_calls")
617
- if tool_calls and isinstance(tool_calls, list):
618
- return tool_calls
619
- except (json.JSONDecodeError, AttributeError):
620
- continue
621
-
622
- # Attempt 2: Extract inline JSON objects
623
- inline_match = TOOL_CALL_INLINE_PATTERN.search(scannable_text)
624
- if inline_match:
625
- try:
626
- inline_json = inline_match.group(1)
627
- parsed_data = json.loads(inline_json)
628
- tool_calls = parsed_data.get("tool_calls")
629
- if tool_calls and isinstance(tool_calls, list):
630
- return tool_calls
631
- except (json.JSONDecodeError, AttributeError):
632
- pass
633
-
634
- # Attempt 3: Parse natural language function calls
635
- natural_lang_match = FUNCTION_CALL_PATTERN.search(scannable_text)
636
- if natural_lang_match:
637
- function_name = natural_lang_match.group(1).strip()
638
- arguments_str = natural_lang_match.group(2).strip()
639
- try:
640
- # Validate JSON format
641
- json.loads(arguments_str)
642
- return [{
643
- "id": f"invoke_{int(time.time() * 1000000)}",
644
- "type": "function",
645
- "function": {
646
- "name": function_name,
647
- "arguments": arguments_str
648
- }
649
- }]
650
- except json.JSONDecodeError:
651
- return None
652
-
653
- return None
654
-
655
-
656
- def remove_tool_json_content(text: str) -> str:
657
- """Remove tool JSON content from response text"""
658
- def remove_tool_call_block(match: re.Match) -> str:
659
- json_content = match.group(1)
660
- try:
661
- parsed_data = json.loads(json_content)
662
- if "tool_calls" in parsed_data:
663
- return ""
664
- except (json.JSONDecodeError, AttributeError):
665
- pass
666
- return match.group(0)
667
-
668
- # Remove fenced tool JSON blocks
669
- cleaned_text = TOOL_CALL_FENCE_PATTERN.sub(remove_tool_call_block, text)
670
- # Remove inline tool JSON
671
- cleaned_text = TOOL_CALL_INLINE_PATTERN.sub("", cleaned_text)
672
- return cleaned_text.strip()
673
-
674
-
675
- # =============================================================================
676
- # Utility Functions
677
- # =============================================================================
678
-
679
- def debug_log(message: str, *args) -> None:
680
- """Log debug message if debug mode is enabled"""
681
- if ServerConfig.DEBUG_LOGGING:
682
- if args:
683
- print(f"[DEBUG] {message % args}")
684
- else:
685
- print(f"[DEBUG] {message}")
686
-
687
-
688
- def generate_request_ids() -> Tuple[str, str]:
689
- """Generate unique IDs for chat and message"""
690
- timestamp = int(time.time())
691
- chat_id = f"{timestamp * 1000}-{timestamp}"
692
- msg_id = str(timestamp * 1000000)
693
- return chat_id, msg_id
694
-
695
-
696
- def get_browser_headers(referer_chat_id: str = "") -> Dict[str, str]:
697
- """Get browser headers for API requests"""
698
- headers = ServerConfig.CLIENT_HEADERS.copy()
699
-
700
- if referer_chat_id:
701
- headers["Referer"] = f"{ServerConfig.CLIENT_HEADERS['Origin']}/c/{referer_chat_id}"
702
-
703
- return headers
704
-
705
-
706
- def get_anonymous_token() -> str:
707
- """Get anonymous token for authentication"""
708
- headers = get_browser_headers()
709
- headers.update({
710
- "Accept": "*/*",
711
- "Accept-Language": "zh-CN,zh;q=0.9",
712
- "Referer": f"{ServerConfig.CLIENT_HEADERS['Origin']}/",
713
- })
714
-
715
- try:
716
- response = requests.get(
717
- f"{ServerConfig.CLIENT_HEADERS['Origin']}/api/v1/auths/",
718
- headers=headers,
719
- timeout=10.0
720
- )
721
-
722
- if response.status_code != 200:
723
- raise Exception(f"anon token status={response.status_code}")
724
-
725
- data = response.json()
726
- token = data.get("token")
727
- if not token:
728
- raise Exception("anon token empty")
729
-
730
- return token
731
- except Exception as e:
732
- debug_log(f"获取匿名token失败: {e}")
733
- raise
734
-
735
-
736
- def get_auth_token() -> str:
737
- """Get authentication token (anonymous or fixed)"""
738
- if ServerConfig.ANONYMOUS_MODE:
739
- try:
740
- token = get_anonymous_token()
741
- debug_log(f"匿名token获取成功: {token[:10]}...")
742
- return token
743
- except Exception as e:
744
- debug_log(f"匿名token获取失败,回退固定token: {e}")
745
-
746
- return ServerConfig.BACKUP_TOKEN
747
-
748
-
749
-
750
- def create_openai_response_chunk(
751
- model: str,
752
- delta: Optional[Delta] = None,
753
- finish_reason: Optional[str] = None
754
- ) -> OpenAIResponse:
755
- """Create OpenAI response chunk for streaming"""
756
- return OpenAIResponse(
757
- id=f"chatcmpl-{int(time.time())}",
758
- object="chat.completion.chunk",
759
- created=int(time.time()),
760
- model=model,
761
- choices=[Choice(
762
- index=0,
763
- delta=delta or Delta(),
764
- finish_reason=finish_reason
765
- )]
766
- )
767
-
768
-
769
- def handle_upstream_error(error: UpstreamError) -> Generator[str, None, None]:
770
- """Handle upstream error response"""
771
- debug_log(f"上游错误: code={error.code}, detail={error.detail}")
772
-
773
- # Send end chunk
774
- end_chunk = create_openai_response_chunk(
775
- model=ServerConfig.PRIMARY_MODEL,
776
- finish_reason="stop"
777
- )
778
- yield f"data: {end_chunk.model_dump_json()}\n\n"
779
- yield "data: [DONE]\n\n"
780
-
781
-
782
- def call_upstream_api(
783
- upstream_req: UpstreamRequest,
784
- chat_id: str,
785
- auth_token: str
786
- ) -> requests.Response:
787
- """Call upstream API with proper headers"""
788
- headers = get_browser_headers(chat_id)
789
- headers["Authorization"] = f"Bearer {auth_token}"
790
-
791
- debug_log(f"调用上游API: {ServerConfig.API_ENDPOINT}")
792
- debug_log(f"上游请求体: {upstream_req.model_dump_json()}")
793
-
794
- response = requests.post(
795
- ServerConfig.API_ENDPOINT,
796
- json=upstream_req.model_dump(exclude_none=True),
797
- headers=headers,
798
- timeout=60.0,
799
- stream=True
800
- )
801
-
802
- debug_log(f"上游响应状态: {response.status_code}")
803
- return response
804
-
805
-
806
- # =============================================================================
807
- # Response Handlers
808
- # =============================================================================
809
-
810
- class ResponseHandler:
811
- """Base class for response handling"""
812
-
813
- def __init__(self, upstream_req: UpstreamRequest, chat_id: str, auth_token: str):
814
- self.upstream_req = upstream_req
815
- self.chat_id = chat_id
816
- self.auth_token = auth_token
817
-
818
- def _call_upstream(self) -> requests.Response:
819
- """Call upstream API with error handling"""
820
- try:
821
- return call_upstream_api(self.upstream_req, self.chat_id, self.auth_token)
822
- except Exception as e:
823
- debug_log(f"调用上游失败: {e}")
824
- raise
825
-
826
- def _handle_upstream_error(self, response: requests.Response) -> None:
827
- """Handle upstream error response"""
828
- debug_log(f"上游返回错误状态: {response.status_code}")
829
- if ServerConfig.DEBUG_LOGGING:
830
- debug_log(f"上游错误响应: {response.text}")
831
-
832
-
833
- class StreamResponseHandler(ResponseHandler):
834
- """Handler for streaming responses"""
835
-
836
- def __init__(self, upstream_req: UpstreamRequest, chat_id: str, auth_token: str, has_tools: bool = False):
837
- super().__init__(upstream_req, chat_id, auth_token)
838
- self.has_tools = has_tools
839
- self.buffered_content = ""
840
- self.tool_calls = None
841
-
842
- def handle(self) -> Generator[str, None, None]:
843
- """Handle streaming response"""
844
- debug_log(f"开始处理流式响应 (chat_id={self.chat_id})")
845
-
846
- try:
847
- response = self._call_upstream()
848
- except Exception:
849
- yield "data: {\"error\": \"Failed to call upstream\"}\n\n"
850
- return
851
-
852
- if response.status_code != 200:
853
- self._handle_upstream_error(response)
854
- yield "data: {\"error\": \"Upstream error\"}\n\n"
855
- return
856
-
857
- # Send initial role chunk
858
- first_chunk = create_openai_response_chunk(
859
- model=ServerConfig.PRIMARY_MODEL,
860
- delta=Delta(role="assistant")
861
- )
862
- yield f"data: {first_chunk.model_dump_json()}\n\n"
863
-
864
- # Process stream
865
- debug_log("开始读取上游SSE流")
866
- sent_initial_answer = False
867
-
868
- with SSEParser(response, debug_mode=ServerConfig.DEBUG_LOGGING) as parser:
869
- for event in parser.iter_json_data(UpstreamData):
870
- upstream_data = event['data']
871
-
872
- # Check for errors
873
- if self._has_error(upstream_data):
874
- error = self._get_error(upstream_data)
875
- yield from handle_upstream_error(error)
876
- break
877
-
878
- debug_log(f"解析成功 - 类型: {upstream_data.type}, 阶段: {upstream_data.data.phase}, "
879
- f"内容长度: {len(upstream_data.data.delta_content)}, 完成: {upstream_data.data.done}")
880
-
881
- # Process content
882
- yield from self._process_content(upstream_data, sent_initial_answer)
883
-
884
- # Check if done
885
- if upstream_data.data.done or upstream_data.data.phase == "done":
886
- debug_log("检测到流结束信号")
887
- yield from self._send_end_chunk()
888
- break
889
-
890
- def _has_error(self, upstream_data: UpstreamData) -> bool:
891
- """Check if upstream data contains error"""
892
- return bool(
893
- upstream_data.error or
894
- upstream_data.data.error or
895
- (upstream_data.data.inner and upstream_data.data.inner.error)
896
- )
897
-
898
- def _get_error(self, upstream_data: UpstreamData) -> UpstreamError:
899
- """Get error from upstream data"""
900
- return (
901
- upstream_data.error or
902
- upstream_data.data.error or
903
- (upstream_data.data.inner.error if upstream_data.data.inner else None)
904
- )
905
-
906
- def _process_content(
907
- self,
908
- upstream_data: UpstreamData,
909
- sent_initial_answer: bool
910
- ) -> Generator[str, None, None]:
911
- """Process content from upstream data"""
912
- content = upstream_data.data.delta_content or upstream_data.data.edit_content
913
-
914
- if not content:
915
- return
916
-
917
- # Transform thinking content
918
- if upstream_data.data.phase == "thinking":
919
- content = transform_thinking_content(content)
920
-
921
- # Buffer content if tools are enabled
922
- if self.has_tools:
923
- self.buffered_content += content
924
- else:
925
- # Handle initial answer content
926
- if (not sent_initial_answer and
927
- upstream_data.data.edit_content and
928
- upstream_data.data.phase == "answer"):
929
-
930
- content = self._extract_edit_content(upstream_data.data.edit_content)
931
- if content:
932
- debug_log(f"发送普通内容: {content}")
933
- chunk = create_openai_response_chunk(
934
- model=ServerConfig.PRIMARY_MODEL,
935
- delta=Delta(content=content)
936
- )
937
- yield f"data: {chunk.model_dump_json()}\n\n"
938
- sent_initial_answer = True
939
-
940
- # Handle delta content
941
- if upstream_data.data.delta_content:
942
- if content:
943
- if upstream_data.data.phase == "thinking":
944
- debug_log(f"发送思考内容: {content}")
945
- chunk = create_openai_response_chunk(
946
- model=ServerConfig.PRIMARY_MODEL,
947
- delta=Delta(reasoning_content=content)
948
- )
949
- else:
950
- debug_log(f"发送普通内容: {content}")
951
- chunk = create_openai_response_chunk(
952
- model=ServerConfig.PRIMARY_MODEL,
953
- delta=Delta(content=content)
954
- )
955
- yield f"data: {chunk.model_dump_json()}\n\n"
956
-
957
- def _extract_edit_content(self, edit_content: str) -> str:
958
- """Extract content from edit_content field"""
959
- parts = edit_content.split("</details>")
960
- return parts[1] if len(parts) > 1 else ""
961
-
962
- def _send_end_chunk(self) -> Generator[str, None, None]:
963
- """Send end chunk and DONE signal"""
964
- if self.has_tools:
965
- # Try to extract tool calls from buffered content
966
- self.tool_calls = extract_tool_invocations(self.buffered_content)
967
-
968
- if self.tool_calls:
969
- # Send tool calls
970
- tool_calls_list = []
971
- for i, tc in enumerate(self.tool_calls):
972
- tool_calls_list.append({
973
- "index": i,
974
- "id": tc.get("id"),
975
- "type": tc.get("type", "function"),
976
- "function": tc.get("function", {}),
977
- })
978
-
979
- out_chunk = create_openai_response_chunk(
980
- model=ServerConfig.PRIMARY_MODEL,
981
- delta=Delta(tool_calls=tool_calls_list)
982
- )
983
- yield f"data: {out_chunk.model_dump_json()}\n\n"
984
- finish_reason = "tool_calls"
985
- else:
986
- # Send regular content
987
- trimmed_content = remove_tool_json_content(self.buffered_content)
988
- if trimmed_content:
989
- content_chunk = create_openai_response_chunk(
990
- model=ServerConfig.PRIMARY_MODEL,
991
- delta=Delta(content=trimmed_content)
992
- )
993
- yield f"data: {content_chunk.model_dump_json()}\n\n"
994
- finish_reason = "stop"
995
- else:
996
- finish_reason = "stop"
997
-
998
- # Send final chunk
999
- end_chunk = create_openai_response_chunk(
1000
- model=ServerConfig.PRIMARY_MODEL,
1001
- finish_reason=finish_reason
1002
- )
1003
- yield f"data: {end_chunk.model_dump_json()}\n\n"
1004
- yield "data: [DONE]\n\n"
1005
- debug_log("流式响应完成")
1006
-
1007
-
1008
- class NonStreamResponseHandler(ResponseHandler):
1009
- """Handler for non-streaming responses"""
1010
-
1011
- def __init__(self, upstream_req: UpstreamRequest, chat_id: str, auth_token: str, has_tools: bool = False):
1012
- super().__init__(upstream_req, chat_id, auth_token)
1013
- self.has_tools = has_tools
1014
-
1015
- def handle(self) -> JSONResponse:
1016
- """Handle non-streaming response"""
1017
- debug_log(f"开始处理非流式响应 (chat_id={self.chat_id})")
1018
-
1019
- try:
1020
- response = self._call_upstream()
1021
- except Exception as e:
1022
- debug_log(f"调用上游失败: {e}")
1023
- raise HTTPException(status_code=502, detail="Failed to call upstream")
1024
-
1025
- if response.status_code != 200:
1026
- self._handle_upstream_error(response)
1027
- raise HTTPException(status_code=502, detail="Upstream error")
1028
-
1029
- # Collect full response
1030
- full_content = []
1031
- debug_log("开始收集完整响应内容")
1032
-
1033
- with SSEParser(response, debug_mode=ServerConfig.DEBUG_LOGGING) as parser:
1034
- for event in parser.iter_json_data(UpstreamData):
1035
- upstream_data = event['data']
1036
-
1037
- if upstream_data.data.delta_content:
1038
- content = upstream_data.data.delta_content
1039
-
1040
- if upstream_data.data.phase == "thinking":
1041
- content = transform_thinking_content(content)
1042
-
1043
- if content:
1044
- full_content.append(content)
1045
-
1046
- if upstream_data.data.done or upstream_data.data.phase == "done":
1047
- debug_log("检测到完成信号,停止收集")
1048
- break
1049
-
1050
- final_content = "".join(full_content)
1051
- debug_log(f"内容收集完成,最终长度: {len(final_content)}")
1052
-
1053
- # Handle tool calls for non-streaming
1054
- tool_calls = None
1055
- finish_reason = "stop"
1056
- message_content = final_content
1057
-
1058
- if self.has_tools:
1059
- tool_calls = extract_tool_invocations(final_content)
1060
- if tool_calls:
1061
- # Content must be null when tool_calls are present (OpenAI spec)
1062
- message_content = None
1063
- finish_reason = "tool_calls"
1064
- else:
1065
- # Remove tool JSON from content
1066
- message_content = remove_tool_json_content(final_content)
1067
-
1068
- # Build response
1069
- response_data = OpenAIResponse(
1070
- id=f"chatcmpl-{int(time.time())}",
1071
- object="chat.completion",
1072
- created=int(time.time()),
1073
- model=ServerConfig.PRIMARY_MODEL,
1074
- choices=[Choice(
1075
- index=0,
1076
- message=Message(
1077
- role="assistant",
1078
- content=message_content,
1079
- tool_calls=tool_calls
1080
- ),
1081
- finish_reason=finish_reason
1082
- )],
1083
- usage=Usage()
1084
- )
1085
-
1086
- debug_log("非流式响应发送完成")
1087
- return JSONResponse(content=response_data.model_dump(exclude_none=True))
1088
-
1089
-
1090
- # =============================================================================
1091
- # FastAPI Application
1092
- # =============================================================================
1093
 
 
1094
  app = FastAPI(
1095
  title="OpenAI Compatible API Server",
1096
  description="An OpenAI-compatible API server for Z.AI chat service",
1097
  version="1.0.0"
1098
  )
1099
 
 
 
 
 
 
 
 
 
1100
 
1101
- # CORS middleware
1102
- @app.middleware("http")
1103
- async def add_cors_headers(request: Request, call_next):
1104
- """Add CORS headers to responses"""
1105
- response = await call_next(request)
1106
- response.headers.update({
1107
- "Access-Control-Allow-Origin": "*",
1108
- "Access-Control-Allow-Methods": "GET, POST, PUT, DELETE, OPTIONS",
1109
- "Access-Control-Allow-Headers": "Content-Type, Authorization",
1110
- "Access-Control-Allow-Credentials": "true"
1111
- })
1112
- return response
1113
-
1114
 
1115
- # =============================================================================
1116
- # API Endpoints
1117
- # =============================================================================
1118
 
1119
  @app.options("/")
1120
  async def handle_options():
@@ -1128,319 +41,6 @@ async def root():
1128
  return {"message": "OpenAI Compatible API Server"}
1129
 
1130
 
1131
- @app.get("/v1/models")
1132
- async def list_models():
1133
- """List available models"""
1134
- current_time = int(time.time())
1135
- response = ModelsResponse(
1136
- data=[
1137
- Model(
1138
- id=ServerConfig.PRIMARY_MODEL,
1139
- created=current_time,
1140
- owned_by="z.ai"
1141
- ),
1142
- Model(
1143
- id=ServerConfig.THINKING_MODEL,
1144
- created=current_time,
1145
- owned_by="z.ai"
1146
- ),
1147
- Model(
1148
- id=ServerConfig.SEARCH_MODEL,
- created=current_time,
- owned_by="z.ai"
- ),
- ]
- )
- return response
-
-
- @app.post("/v1/chat/completions")
- async def chat_completions(
-     request: OpenAIRequest,
-     authorization: str = Header(...)
- ):
-     """Handle chat completion requests"""
-     debug_log("收到chat completions请求")
-
-     try:
-         # Validate API key
-         if not authorization.startswith("Bearer "):
-             debug_log("缺少或无效的Authorization头")
-             raise HTTPException(status_code=401, detail="Missing or invalid Authorization header")
-
-         api_key = authorization[7:]
-         if api_key != ServerConfig.AUTH_TOKEN:
-             debug_log(f"无效的API key: {api_key}")
-             raise HTTPException(status_code=401, detail="Invalid API key")
-
-         debug_log(f"API key验证通过,AUTH_TOKEN={api_key[:8]}......")
-         debug_log(f"请求解析成功 - 模型: {request.model}, 流式: {request.stream}, 消息数: {len(request.messages)}")
-
-         # Generate IDs
-         chat_id, msg_id = generate_request_ids()
-
-         # Process messages with tools
-         processed_messages = process_messages_with_tools(
-             [m.model_dump() for m in request.messages],
-             request.tools,
-             request.tool_choice
-         )
-
-         # Convert back to Message objects
-         upstream_messages = []
-         for msg in processed_messages:
-             content = msg.get("content")
-             # Ensure content is not None for Message model
-             if content is None:
-                 content = ""
-
-             upstream_messages.append(Message(
-                 role=msg["role"],
-                 content=content,
-                 reasoning_content=msg.get("reasoning_content")
-             ))
-
-         # Determine model features
-         is_thinking = request.model == ServerConfig.THINKING_MODEL
-         is_search = request.model == ServerConfig.SEARCH_MODEL
-         search_mcp = "deep-web-search" if is_search else ""
-
-         # Build upstream request
-         upstream_req = UpstreamRequest(
-             stream=True,  # Always use streaming from upstream
-             chat_id=chat_id,
-             id=msg_id,
-             model="0727-360B-API",  # Actual upstream model ID
-             messages=upstream_messages,
-             params={},
-             features={
-                 "enable_thinking": is_thinking,
-                 "web_search": is_search,
-                 "auto_web_search": is_search,
-             },
-             background_tasks={
-                 "title_generation": False,
-                 "tags_generation": False,
-             },
-             mcp_servers=[search_mcp] if search_mcp else [],
-             model_item=ModelItem(
-                 id="0727-360B-API",
-                 name="GLM-4.5",
-                 owned_by="openai"
-             ),
-             tool_servers=[],
-             variables={
-                 "{{USER_NAME}}": "User",
-                 "{{USER_LOCATION}}": "Unknown",
-                 "{{CURRENT_DATETIME}}": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
-             }
-         )
-
-         # Get authentication token
-         auth_token = get_auth_token()
-
-         # Check if tools are enabled and present
-         has_tools = (ServerConfig.TOOL_SUPPORT and
-                      request.tools and
-                      len(request.tools) > 0 and
-                      request.tool_choice != "none")
-
-         # Handle response based on stream flag
-         if request.stream:
-             handler = StreamResponseHandler(upstream_req, chat_id, auth_token, has_tools)
-             return StreamingResponse(
-                 handler.handle(),
-                 media_type="text/event-stream",
-                 headers={
-                     "Cache-Control": "no-cache",
-                     "Connection": "keep-alive",
-                 }
-             )
-         else:
-             handler = NonStreamResponseHandler(upstream_req, chat_id, auth_token, has_tools)
-             return handler.handle()
-
-     except HTTPException:
-         raise
-     except Exception as e:
-         debug_log(f"处理请求时发生错误: {str(e)}")
-         import traceback
-         debug_log(f"错误堆栈: {traceback.format_exc()}")
-         raise HTTPException(status_code=500, detail=f"Internal server error: {str(e)}")
-
-
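The deleted handler above derives the upstream feature flags from the requested model name. A minimal sketch of that mapping, using the model names from this commit's configuration (the helper name `build_features` is hypothetical, and the `GLM-4.5-Search` value is assumed from the Anthropic handler below):

```python
# Hypothetical helper sketching how the removed chat_completions handler
# turned the requested model name into upstream feature flags.
THINKING_MODEL = "GLM-4.5-Thinking"  # from .env.example in this commit
SEARCH_MODEL = "GLM-4.5-Search"      # assumed value for SEARCH_MODEL

def build_features(model: str) -> dict:
    """Map an exposed model name to Z.AI upstream feature flags."""
    is_thinking = model == THINKING_MODEL
    is_search = model == SEARCH_MODEL
    return {
        "enable_thinking": is_thinking,
        "web_search": is_search,
        "auto_web_search": is_search,
    }

print(build_features("GLM-4.5-Search"))
# → {'enable_thinking': False, 'web_search': True, 'auto_web_search': True}
```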
1272
- # ANTHROPIC API 兼容端点
- @app.post("/v1/messages")
- async def handle_anthropic_message(
-     req: AnthropicRequest,
-     x_api_key: str = Header(None, alias="x-api-key"),
-     authorization: str = Header(None, alias="authorization")
- ):
-     """Handle Anthropic message requests"""
-     debug_log("收到 Anthropic message 请求")
-
-     # 验证 API key
-     api_key = None
-     if x_api_key:
-         api_key = x_api_key
-     elif authorization and authorization.startswith("Bearer "):
-         api_key = authorization[7:]
-
-     if not api_key or api_key != ServerConfig.ANTHROPIC_API_KEY:
-         debug_log(f"无效的 API key: {api_key}")
-         raise HTTPException(status_code=401, detail="Invalid API key")
-
-     debug_log(f"API key 验证通过")
-     debug_log(f"请求解析成功 - 模型: {req.model}, 流式: {req.stream}, 消息数: {len(req.messages)}")
-
-     # 确定上游模型和功能
-     upstream_model = "GLM-4.5"
-     if req.model == ServerConfig.THINKING_MODEL:
-         upstream_model = "GLM-4.5-Thinking"
-     elif req.model == ServerConfig.SEARCH_MODEL:
-         upstream_model = "GLM-4.5-Search"
-
-     debug_log(f"收到请求 (模型: {req.model}) -> 代理到上游 (模型: {upstream_model})")
-
-     # 生成 ID
-     chat_id, msg_id = generate_request_ids()
-
-     # 转换消息格式
-     openai_messages = []
-     if req.system:
-         # 处理两种格式的 system 内容
-         if isinstance(req.system, str):
-             # 字符串格式
-             system_content = req.system
-         else:
-             # 对象数组格式
-             system_content = ""
-             for block in req.system:
-                 if block.type == "text":
-                     system_content += block.text
-
-         openai_messages.append({"role": "system", "content": system_content})
-
-     for msg in req.messages:
-         # 处理两种格式的内容
-         if isinstance(msg.content, str):
-             # 字符串格式
-             text_content = msg.content
-         else:
-             # 对象数组格式
-             text_content = ""
-             for block in msg.content:
-                 if block.type == "text":
-                     text_content += block.text
-
-         openai_messages.append({
-             "role": msg.role,
-             "content": text_content
-         })
-
-     # 构建上游请求
-     upstream_messages = []
-     for msg in openai_messages:
-         content = msg.get("content", "")
-         if content is None:
-             content = ""
-         upstream_messages.append(Message(
-             role=msg["role"],
-             content=content
-         ))
-
-     upstream_req = UpstreamRequest(
-         stream=True,  # 总是使用上游的流式
-         chat_id=chat_id,
-         id=msg_id,
-         model="0727-360B-API",  # 实际的上游模型 ID
-         messages=upstream_messages,
-         params={},
-         features={"enable_thinking": True},
-         background_tasks={
-             "title_generation": False,
-             "tags_generation": False,
-         },
-         mcp_servers=[],
-         model_item=ModelItem(
-             id="0727-360B-API",
-             name="GLM-4.5",
-             owned_by="openai"
-         ),
-         tool_servers=[],
-         variables={
-             "{{USER_NAME}}": "User",
-             "{{USER_LOCATION}}": "Unknown",
-             "{{CURRENT_DATETIME}}": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
-         }
-     )
-
-     # 获取认证 token
-     auth_token = get_auth_token()
-
-     try:
-         # 调用上游 API
-         headers = get_browser_headers(chat_id)
-         headers["Authorization"] = f"Bearer {auth_token}"
-
-         response = requests.post(
-             ServerConfig.API_ENDPOINT,
-             json=upstream_req.model_dump(exclude_none=True),
-             headers=headers,
-             timeout=60.0,
-             stream=True
-         )
-         response.raise_for_status()
-     except requests.HTTPError as e:
-         debug_log(f"上游 API 返回错误状态: {e.response.status_code}, 响应: {e.response.text}")
-         raise HTTPException(status_code=502, detail="Upstream API error")
-     except requests.RequestException as e:
-         debug_log(f"请求上游 API 失败: {e}")
-         raise HTTPException(status_code=502, detail=f"Failed to call upstream API: {e}")
-
-     request_id = f"msg_{uuid.uuid4().hex}"
-
-     if req.stream:
-         # 流式响应
-         return StreamingResponse(
-             stream_anthropic_generator(response, request_id, req.model),
-             media_type="text/event-stream",
-             headers={"Cache-Control": "no-cache", "Connection": "keep-alive"}
-         )
-     else:
-         # 非流式响应
-         full_content = ""
-         for line in response.iter_lines():
-             if not line.startswith(b"data:"): continue
-             data_str = line[5:].strip()
-             if not data_str: continue
-             try:
-                 data = json.loads(data_str.decode('utf-8'))
-                 delta_content = data.get("data", {}).get("delta_content", "")
-                 phase = data.get("data", {}).get("phase", "")
-                 if delta_content:
-                     out_content = transform_thinking_content(delta_content) if phase == "thinking" else delta_content
-                     if out_content: full_content += out_content
-                 if data.get("data", {}).get("done", False) or phase == "done":
-                     break
-             except json.JSONDecodeError:
-                 continue
-
-         return {
-             "id": request_id,
-             "type": "message",
-             "role": "assistant",
-             "model": req.model,
-             "content": [{"type": "text", "text": full_content}],
-             "stop_reason": "end_turn",
-             "usage": {"input_tokens": 0, "output_tokens": len(full_content) // 4}
-         }
-
-
- # =============================================================================
- # Main Entry Point
- # =============================================================================
-
  if __name__ == "__main__":
      import uvicorn
-     uvicorn.run("main:app", host="0.0.0.0", port=ServerConfig.LISTEN_PORT, reload=True)
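The non-stream branch of the removed `/v1/messages` handler aggregates the upstream SSE stream by concatenating each event's `delta_content` until a `done` marker appears. A self-contained sketch of that loop over raw SSE byte lines (the payload shape mirrors the code above; `aggregate_sse` is a hypothetical name, and the thinking-phase transform is omitted):

```python
import json

def aggregate_sse(lines) -> str:
    """Concatenate delta_content from Z.AI-style SSE events until done."""
    full_content = ""
    for line in lines:
        if not line.startswith(b"data:"):
            continue
        data_str = line[5:].strip()
        if not data_str:
            continue
        try:
            data = json.loads(data_str.decode("utf-8"))
        except json.JSONDecodeError:
            continue
        payload = data.get("data", {})
        delta = payload.get("delta_content", "")
        if delta:
            full_content += delta
        if payload.get("done", False) or payload.get("phase") == "done":
            break
    return full_content

events = [
    b'data: {"data": {"delta_content": "Hello, ", "phase": "answer"}}',
    b'data: {"data": {"delta_content": "world!", "phase": "answer"}}',
    b'data: {"data": {"done": true, "phase": "done"}}',
]
print(aggregate_sse(events))  # → Hello, world!
```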
 
 
 
  """
+ Main application entry point
  """

+ from fastapi import FastAPI, Request, Response
+ from fastapi.middleware.cors import CORSMiddleware

+ from app.core.config import settings
+ from app.api import openai, anthropic

+ # Create FastAPI app
  app = FastAPI(
      title="OpenAI Compatible API Server",
      description="An OpenAI-compatible API server for Z.AI chat service",
      version="1.0.0"
  )

+ # Add CORS middleware
+ app.add_middleware(
+     CORSMiddleware,
+     allow_origins=["*"],
+     allow_credentials=True,
+     allow_methods=["GET", "POST", "PUT", "DELETE", "OPTIONS"],
+     allow_headers=["Content-Type", "Authorization"],
+ )

+ # Include API routers
+ app.include_router(openai.router)
+ app.include_router(anthropic.router)


  @app.options("/")
  async def handle_options():

  return {"message": "OpenAI Compatible API Server"}


  if __name__ == "__main__":
      import uvicorn
+     uvicorn.run("main:app", host="0.0.0.0", port=settings.LISTEN_PORT, reload=True)
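The Anthropic route accepts the client key either via `x-api-key` or an `Authorization: Bearer` header, as the deleted handler shows; the refactored version in app/api/anthropic.py presumably keeps this behavior. A stdlib-only sketch of that precedence (the function name `resolve_api_key` is hypothetical):

```python
from typing import Optional

def resolve_api_key(x_api_key: Optional[str], authorization: Optional[str]) -> Optional[str]:
    """Prefer x-api-key; fall back to a Bearer token in the Authorization header."""
    if x_api_key:
        return x_api_key
    if authorization and authorization.startswith("Bearer "):
        return authorization[len("Bearer "):]
    return None

print(resolve_api_key(None, "Bearer sk-your-api-key"))  # → sk-your-api-key
```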
pyproject.toml CHANGED
@@ -5,7 +5,7 @@ build-backend = "hatchling.build"
  [project]
  name = "z-ai2api-python"
  version = "0.1.0"
- description = "一个为 Z.ai 提供 OpenAI API 兼容接口的 Python 代理服务"
+ description = "一个为 Z.ai 提供 OpenAI 和 Anthropic API 兼容接口的 Python 代理服务"
  readme = "README.md"
  requires-python = ">=3.9,<=3.12"
  license = {text = "MIT"}
@@ -29,7 +29,10 @@ dependencies = [
      "fastapi==0.104.1",
      "uvicorn[standard]==0.24.0",
      "requests==2.32.5",
-     "pydantic==2.5.0",
+     "pydantic==2.11.7",
+     "pydantic-settings==2.10.1",
+     "pydantic-core==2.33.2",
+     "typing-inspection==0.4.1"
  ]

  [project.scripts]
requirements.txt CHANGED
@@ -1,4 +1,7 @@
  fastapi==0.104.1
  uvicorn[standard]==0.24.0
  requests==2.32.5
- pydantic==2.5.0
+ pydantic==2.11.7
+ pydantic-settings==2.10.1
+ pydantic-core==2.33.2
+ typing-inspection==0.4.1
tests/test_anthropic.py CHANGED
@@ -5,75 +5,59 @@ import requests

  # 服务器配置
  BASE_URL = "http://localhost:8080/v1/messages"
- API_KEY = "sk-your-api-key"  # 修改为你的 API key
+ API_KEY = "sk-your-api-key"

  test_data = {
      "model": "GLM-4.5",
-     "messages": [
-         {
-             "role": "user",
-             "content": "你好,这是一个测试"
-         }
-     ],
+     "messages": [{"role": "user", "content": "你好,这是一个测试"}],
      "system": [
          {
              "type": "text",
              "text": "You are Claude Code, Anthropic's official CLI for Claude.",
-             "cache_control": {
-                 "type": "ephemeral"
-             }
+             "cache_control": {"type": "ephemeral"},
          }
      ],
      "max_tokens": 1024,
      "stream": False,
  }

+
  def test_non_stream():
      """测试非流式请求"""
      print("=== 测试非流式请求 ===")
-
+
      try:
-         response = requests.post(
-             BASE_URL,
-             headers={"x-api-key": API_KEY},
-             json=test_data,
-             timeout=30.0
-         )
-
+         response = requests.post(BASE_URL, headers={"x-api-key": API_KEY}, json=test_data, timeout=30.0)
+
          print(f"状态码: {response.status_code}")
-
+
          if response.status_code == 200:
              result = response.json()
              print("响应成功!")
              print(f"ID: {result.get('id')}")
              print(f"模型: {result.get('model')}")
-             if result.get('content'):
+             if result.get("content"):
                  print(f"内容: {result['content'][0]['text']}")
          else:
              print("错误响应:")
              print(response.text)
-
+
      except Exception as e:
          print(f"请求失败: {e}")

+
  def test_stream():
      """测试流式请求"""
      print("\n=== 测试流式请求 ===")
-
+
      stream_data = test_data.copy()
      stream_data["stream"] = True
-
+
      try:
-         response = requests.post(
-             BASE_URL,
-             headers={"x-api-key": API_KEY},
-             json=stream_data,
-             stream=True,
-             timeout=30.0
-         )
-
+         response = requests.post(BASE_URL, headers={"x-api-key": API_KEY}, json=stream_data, stream=True, timeout=30.0)
+
          print(f"状态码: {response.status_code}")
-
+
          if response.status_code == 200:
              print("流式响应内容:")
              for line in response.iter_lines():
@@ -82,13 +66,14 @@ def test_stream():
          else:
              print("错误响应:")
              print(response.text)
-
+
      except Exception as e:
          print(f"请求失败: {e}")

+
  if __name__ == "__main__":
      try:
          test_non_stream()
          test_stream()
      except KeyboardInterrupt:
-         print("\n测试已取消")
+         print("\n测试已取消")
tests/{test_weather.py → test_function_call.py} RENAMED
File without changes