ZyphrZero commited on
Commit
f0cacfe
·
1 Parent(s): 5ae48ef

Initial commit

Browse files
Files changed (10) hide show
  1. LICENSE +1 -1
  2. README.md +143 -0
  3. deploy/.dockerignore +3 -0
  4. deploy/DOCKER.md +153 -0
  5. deploy/Dockerfile +53 -0
  6. deploy/docker-compose.yml +49 -0
  7. main.py +630 -0
  8. pyproject.toml +63 -0
  9. requirements.txt +4 -0
  10. uv.lock +0 -0
LICENSE CHANGED
@@ -1,6 +1,6 @@
1
  MIT License
2
 
3
- Copyright (c) 2025 CassianVale
4
 
5
  Permission is hereby granted, free of charge, to any person obtaining a copy
6
  of this software and associated documentation files (the "Software"), to deal
 
1
  MIT License
2
 
3
+ Copyright (c) 2025 ZyphrZero
4
 
5
  Permission is hereby granted, free of charge, to any person obtaining a copy
6
  of this software and associated documentation files (the "Software"), to deal
README.md ADDED
@@ -0,0 +1,143 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+
2
+ ## 项目简介
3
+
4
+ 这是一个为 Z.ai 提供 OpenAI API 兼容接口的 Python 代理服务,允许开发者通过标准的 OpenAI API 格式访问 Z.ai 的 GLM-4.5 模型。
5
+
6
+ ## 主要特性
7
+
8
+ - **OpenAI API 兼容**:完整支持 `/v1/chat/completions` 和 `/v1/models` 端点
9
+ - **流式响应支持**:完整实现 Server-Sent Events (SSE) 流式传输
10
+ - **思考内容处理**:提供多种策略处理模型的思考过程(`<details>` 标签)
11
+ - **匿名会话支持**:可选使用匿名 token 避免共享对话历史
12
+ - **多种模型支持**:支持 GLM-4.5 基础版、思考版和搜索版
13
+ - **调试模式**:详细的请求/响应日志记录,便于开发调试
14
+ - **CORS 支持**:内置跨域资源共享支持
15
+ - **异步处理**:基于 FastAPI 和 httpx 的高性能异步架构
16
+
17
+ ## 使用场景
18
+
19
+ - 将 Z.ai 集成到支持 OpenAI API 的应用程序中
20
+ - 开发需要同时使用多个 AI 服务的应用
21
+ - 测试和评估 GLM-4.5 模型的能力
22
+ - 需要流式响应或思考内容的 AI 应用开发
23
+
24
+ ## 快速开始
25
+
26
+ ### 使用 uv (推荐)
27
+
28
+ 1. 安装 uv:
29
+ ```bash
30
+ # macOS/Linux
31
+ curl -LsSf https://astral.sh/uv/install.sh | sh
32
+ # Windows (PowerShell)
33
+ powershell -c "irm https://astral.sh/uv/install.sh | iex"
34
+ ```
35
+
36
+ 2. 同步依赖:
37
+ ```bash
38
+ uv sync
39
+ ```
40
+
41
+ 3. 运行服务:
42
+ ```bash
43
+ uv run python main.py
44
+ ```
45
+
46
+ ### 使用 pip
47
+
48
+ 1. 安装依赖:
49
+ ```bash
50
+ pip install -r requirements.txt
51
+ ```
52
+
53
+ 2. 配置服务(可选):
54
+ 编辑 `main.py` 中的以下常量以调整服务行为:
55
+ - `DEFAULT_KEY`: 客户端 API 密钥
56
+ - `UPSTREAM_URL`: Z.ai 上游 API 地址
57
+ - `UPSTREAM_TOKEN`: 固定认证 token(匿名模式失败时使用)
58
+ - `PORT`: 服务监听端口
59
+ - `DEBUG_MODE`: 调试模式开关
60
+ - `THINK_TAGS_MODE`: 思考内容处理策略
61
+ - `ANON_TOKEN_ENABLED`: 匿名 token 开关
62
+
63
+ 3. 运行服务:
64
+ ```bash
65
+ python main.py
66
+ ```
67
+
68
+ 服务启动后,可以访问 http://localhost:8080/docs 查看自动生成的 Swagger API 文档
69
+
70
+ 4. 使用 OpenAI 客户端库调用:
71
+ ```python
72
+ import openai
73
+
74
+ # 初始化客户端
75
+ client = openai.OpenAI(
76
+ base_url="http://localhost:8080/v1",
77
+ api_key="sk-tbkFoKzk9a531YyUNNF5" # 使用配置的 DEFAULT_KEY
78
+ )
79
+
80
+ # 流式调用示例
81
+ response = client.chat.completions.create(
82
+ model="GLM-4.5", # 可选: "GLM-4.5-Thinking", "GLM-4.5-Search"
83
+ messages=[{"role": "user", "content": "你好"}],
84
+ stream=True
85
+ )
86
+
87
+ for chunk in response:
88
+ content = chunk.choices[0].delta.content
89
+ reasoning = chunk.choices[0].delta.reasoning_content
90
+ if content:
91
+ print(content, end="")
92
+ if reasoning:
93
+ print(f"\n[思考] {reasoning}\n")
94
+ ```
95
+
96
+ 注意:请将 `api_key` 替换为您在 `main.py` 中配置的 `DEFAULT_KEY` 值。
97
+
98
+ ## 配置选项
99
+
100
+ | 配置项 | 描述 | 默认值 |
101
+ |--------|------|--------|
102
+ | `UPSTREAM_URL` | Z.ai 的上游 API 地址 | `https://chat.z.ai/api/chat/completions` |
103
+ | `DEFAULT_KEY` | 下游客户端鉴权 key | `sk-tbkFoKzk9a531YyUNNF5` |
104
+ | `UPSTREAM_TOKEN` | 上游 API 的 token (匿名模式失败时使用) | JWT token |
105
+ | `DEFAULT_MODEL_NAME` | 默认模型名称 | `GLM-4.5` |
106
+ | `THINKING_MODEL_NAME` | 思考模型名称 | `GLM-4.5-Thinking` |
107
+ | `SEARCH_MODEL_NAME` | 搜索模型名称 | `GLM-4.5-Search` |
108
+ | `PORT` | 服务监听端口 | `8080` |
109
+ | `DEBUG_MODE` | 调试模式开关 | `true` |
110
+ | `THINK_TAGS_MODE` | 思考内容处理策略 | `think` (可选: `strip`, `raw`) |
111
+ | `ANON_TOKEN_ENABLED` | 是否使用匿名 token | `true` |
112
+
113
+ ### 思考内容处理策略说明
114
+
115
+ - **think**: 将 `<details>` 标签转换为 `<thinking>` 标签,适合 OpenAI 兼容格式
116
+ - **strip**: 完全移除 `<details>` 标签及其内容
117
+ - **raw**: 保留原始格式,不做任何处理
118
+
119
+ ## 架构说明
120
+
121
+ 本项目采用以下技术栈:
122
+
123
+ - **FastAPI**: 现代、快速的 Web 框架,提供自动 API 文档生成
124
+ - **httpx**: 异步 HTTP 客户端,用于上游 API 调用
125
+ - **Pydantic**: 数据验证和序列化,确保 API 兼容性
126
+ - **uvicorn**: ASGI 服务器,提供高性能服务
127
+
128
+ 项目通过异步编程模型实现高效的并发处理,支持流式和非流式两种响应模式。
129
+
130
+ ## 贡献指南
131
+
132
+ 欢迎提交 Issue 和 Pull Request!请确保:
133
+ 1. 遵循 PEP 8 规范
134
+ 2. 提交前运行测试(如果有)
135
+ 3. 更新相关文档
136
+
137
+ ## 许可证
138
+
139
+ MIT LICENSE
140
+
141
+ ## 免责声明
142
+
143
+ 本项目与 Z.ai 官方无关,使用前请确保遵守 Z.ai 的服务条款。请勿将此服务用于商业用途或违反 Z.ai 使用条款的场景。
deploy/.dockerignore ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ .git
2
+ README.md
3
+ *.log
deploy/DOCKER.md ADDED
@@ -0,0 +1,153 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # Docker部署指南
2
+
3
+ ## 文件说明
4
+
5
+ - `Dockerfile.python` - 基础版本的Dockerfile
6
+ - `Dockerfile.python.optimized` - 多阶段构建,镜像更小
7
+ - `docker-compose.yml` - Docker Compose配置文件
8
+ - `.dockerignore` - Docker构建时忽略的文件
9
+ - `test-page/` - 简单的Web测试界面
10
+
11
+ ## 快速开始
12
+
13
+ ### 1. 构建并运行(使用docker-compose)
14
+
15
+ ```bash
16
+ # 启动服务
17
+ docker-compose up -d
18
+
19
+ # 查看日志
20
+ docker-compose logs -f
21
+
22
+ # 停止服务
23
+ docker-compose down
24
+ ```
25
+
26
+ ### 2. 仅使用Docker
27
+
28
+ ```bash
29
+ # 构建镜像
30
+ docker build -f Dockerfile.python.optimized -t openai-proxy-python .
31
+
32
+ # 运行容器
33
+ docker run -d \
34
+ --name openai-proxy \
35
+ -p 8080:8080 \
36
+ openai-proxy-python
37
+ ```
38
+
39
+ ### 3. 带测试界面的完整部署
40
+
41
+ ```bash
42
+ # 启动服务和测试界面
43
+ docker-compose --profile test-ui up -d
44
+
45
+ # 访问测试界面
46
+ # 打开浏览器访问 http://localhost:8081
47
+ ```
48
+
49
+ ## 环境变量配置
50
+
51
+ 可以通过环境变量覆盖默认配置:
52
+
53
+ ```bash
54
+ # 在docker-compose.yml中添加
55
+ environment:
56
+ - DEBUG_MODE=false
57
+ - PORT=8080
58
+ - DEFAULT_KEY=your-api-key
59
+ - UPSTREAM_TOKEN=your-upstream-token
60
+ ```
61
+
62
+ 或者使用.env文件:
63
+
64
+ ```bash
65
+ # 创建.env文件
66
+ echo "DEBUG_MODE=false" > .env
67
+ echo "DEFAULT_KEY=sk-your-custom-key" >> .env
68
+
69
+ # 启动时自动加载
70
+ docker-compose up -d
71
+ ```
72
+
73
+ ## 生产环境建议
74
+
75
+ 1. **使用优化版Dockerfile**
76
+ ```bash
77
+ docker build -f Dockerfile.python.optimized -t openai-proxy:latest .
78
+ ```
79
+
80
+ 2. **配置HTTPS**
81
+ 建议在反向代理(如Nginx)中配置SSL证书
82
+
83
+ 3. **使用Docker Secrets管理敏感信息**
84
+ ```yaml
85
+ secrets:
86
+ api_key:
87
+ file: ./secrets/api_key.txt
88
+ upstream_token:
89
+ file: ./secrets/upstream_token.txt
90
+ ```
91
+
92
+ 4. **设置资源限制**
93
+ ```yaml
94
+ deploy:
95
+ resources:
96
+ limits:
97
+ cpus: '0.5'
98
+ memory: 512M
99
+ ```
100
+
101
+ ## 常用命令
102
+
103
+ ```bash
104
+ # 查看容器状态
105
+ docker ps
106
+
107
+ # 查看日志
108
+ docker logs openai-proxy
109
+
110
+ # 进入容器
111
+ docker exec -it openai-proxy bash
112
+
113
+ # 重新构建
114
+ docker-compose build
115
+
116
+ # 完全清理
117
+ docker-compose down -v --rmi all
118
+ ```
119
+
120
+ ## 故障排除
121
+
122
+ 1. **端口冲突**
123
+ - 修改docker-compose.yml中的端口映射
124
+ - 或者停止占用8080端口的程序
125
+
126
+ 2. **镜像构建失败**
127
+ - 确保Docker版本 >= 19.03
128
+ - 检查网络连接
129
+
130
+ 3. **容器启动失败**
131
+ - 查看日志:`docker logs openai-proxy`
132
+ - 检查配置文件语法
133
+
134
+ 4. **API请求失败**
135
+ - 确认容器正在运行
136
+ - 检查防火墙设置
137
+ - 验证API密钥配置
138
+
139
+ ## 测试API
140
+
141
+ 容器启动后,可以测试API:
142
+
143
+ ```bash
144
+ # 测试模型列表
145
+ curl -X GET http://localhost:8080/v1/models \
146
+ -H "Authorization: Bearer sk-tbkFoKzk9a531YyUNNF5"
147
+
148
+ # 测试聊天接口
149
+ curl -X POST http://localhost:8080/v1/chat/completions \
150
+ -H "Content-Type: application/json" \
151
+ -H "Authorization: Bearer sk-tbkFoKzk9a531YyUNNF5" \
152
+ -d '{"model": "GLM-4.5", "messages": [{"role": "user", "content": "Hello"}]}'
153
+ ```
deploy/Dockerfile ADDED
@@ -0,0 +1,53 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # 多阶段构建 - 构建阶段
2
+ FROM python:3.11-slim as builder
3
+
4
+ # 安装构建依赖
5
+ RUN apt-get update && apt-get install -y \
6
+ gcc \
7
+ curl \
8
+ && rm -rf /var/lib/apt/lists/*
9
+
10
+ # 设置虚拟环境
11
+ RUN python -m venv /opt/venv
12
+ ENV PATH="/opt/venv/bin:$PATH"
13
+
14
+ # 复制并安装依赖
15
+ COPY requirements.txt .
16
+ RUN pip install --no-cache-dir --upgrade pip && \
17
+ pip install --no-cache-dir -r requirements.txt
18
+
19
+ # 运行阶段 - 更小的镜像
20
+ FROM python:3.11-slim
21
+
22
+ # 安装运行时依赖(curl用于健康检查)
23
+ RUN apt-get update && apt-get install -y \
24
+ curl \
25
+ && rm -rf /var/lib/apt/lists/* && \
26
+ groupadd -r app && useradd -r -g app app
27
+
28
+ # 从构建阶段复制虚拟环境
29
+ COPY --from=builder /opt/venv /opt/venv
30
+
31
+ # 设置环境变量
32
+ ENV PYTHONDONTWRITEBYTECODE=1
33
+ ENV PYTHONUNBUFFERED=1
34
+ ENV PATH="/opt/venv/bin:$PATH"
35
+
36
+ # 创建工作目录并设置权限
37
+ WORKDIR /app
38
+ RUN chown app:app /app
39
+ USER app
40
+
41
+ # 复制应用代码
42
+ COPY --chown=app:app main.py .
43
+ COPY --chown=app:app test_api.py .
44
+
45
+ # 暴露端口
46
+ EXPOSE 8080
47
+
48
+ # 健康检查
49
+ HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
50
+ CMD curl -f http://localhost:8080/v1/models || exit 1
51
+
52
+ # 启动命令
53
+ CMD ["python", "main.py"]
deploy/docker-compose.yml ADDED
@@ -0,0 +1,49 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ version: '3.8'
2
+
3
+ services:
4
+ openai-proxy:
5
+ build:
6
+ context: .
7
+ dockerfile: Dockerfile.python.optimized
8
+ container_name: openai-proxy-python
9
+ ports:
10
+ - "8080:8080"
11
+ environment:
12
+ # 可以通过环境变量覆盖配置
13
+ - DEBUG_MODE=false
14
+ - PORT=8080
15
+ # 注意:敏感信息应该使用 secrets 或 env 文件
16
+ restart: unless-stopped
17
+ healthcheck:
18
+ test: ["CMD", "curl", "-f", "http://localhost:8080/v1/models"]
19
+ interval: 30s
20
+ timeout: 10s
21
+ retries: 3
22
+ start_period: 40s
23
+ networks:
24
+ - proxy-network
25
+
26
+ # 可选:添加一个简单的web界面用于测试
27
+ web-test:
28
+ image: nginx:alpine
29
+ container_name: proxy-web-test
30
+ ports:
31
+ - "8081:80"
32
+ volumes:
33
+ - ./test-page:/usr/share/nginx/html:ro
34
+ depends_on:
35
+ - openai-proxy
36
+ networks:
37
+ - proxy-network
38
+ profiles:
39
+ - test-ui
40
+
41
+ networks:
42
+ proxy-network:
43
+ driver: bridge
44
+
45
+ # 使用说明:
46
+ # 1. 基本启动:docker-compose up -d
47
+ # 2. 带测试界面:docker-compose --profile test-ui up -d
48
+ # 3. 查看日志:docker-compose logs -f
49
+ # 4. 停止服务:docker-compose down
main.py ADDED
@@ -0,0 +1,630 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """
2
+ Go到Python代码转换说明
3
+ =====================
4
+
5
+ 这是一个将Go语言实现的OpenAI兼容API代理服务器转换为Python版本的代码。
6
+ 使用FastAPI作为Web框架,httpx用于HTTP请求,uvicorn作为ASGI服务器。
7
+
8
+ 主要功能对应关系:
9
+ 1. 配置常量:使用Python模块级常量替代Go的const
10
+ 2. 数据结构:使用Pydantic模型替代Go的struct
11
+ 3. HTTP处理:使用FastAPI路由替代Go的http.HandleFunc
12
+ 4. 流式响应:使用FastAPI的StreamingResponse替代Go的http.Flusher
13
+ 5. SSE处理:使用生成器函数和字符串格式化替代Go的fmt.Fprintf
14
+
15
+ 关键实现思路:
16
+ - 保持了原有的API认证逻辑
17
+ - 维持了上游API调用的头部伪装
18
+ - 实现了相同的思考内容处理策略
19
+ - 保持了流式和非流式响应的处理逻辑
20
+
21
+ 依赖安装:
22
+ pip install fastapi uvicorn httpx pydantic
23
+
24
+ 运行方式:
25
+ uvicorn main:app --host 0.0.0.0 --port 8080 --reload
26
+ """
27
+
28
+ import json
29
+ import re
30
+ import time
31
+ from datetime import datetime
32
+ from typing import Dict, List, Optional, Any, Union, AsyncGenerator
33
+ from urllib.parse import urljoin
34
+
35
+ import httpx
36
+ from fastapi import FastAPI, Request, Response, HTTPException, Header
37
+ from fastapi.responses import StreamingResponse, JSONResponse
38
+ from pydantic import BaseModel, Field
39
+
40
+
41
+ # 配置常量
42
+ UPSTREAM_URL = "https://chat.z.ai/api/chat/completions"
43
+ DEFAULT_KEY = "sk-tbkFoKzk9a531YyUNNF5"
44
+ UPSTREAM_TOKEN = "eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCJ9.eyJpZCI6IjMxNmJjYjQ4LWZmMmYtNGExNS04NTNkLWYyYTI5YjY3ZmYwZiIsImVtYWlsIjoiR3Vlc3QtMTc1NTg0ODU4ODc4OEBndWVzdC5jb20ifQ.PktllDySS3trlyuFpTeIZf-7hl8Qu1qYF3BxjgIul0BrNux2nX9hVzIjthLXKMWAf9V0qM8Vm_iyDqkjPGsaiQ"
45
+ DEFAULT_MODEL_NAME = "GLM-4.5"
46
+ THINKING_MODEL_NAME = "GLM-4.5-Thinking"
47
+ SEARCH_MODEL_NAME = "GLM-4.5-Search"
48
+ PORT = 8080
49
+ DEBUG_MODE = True
50
+
51
+ # 思考内容处理策略
52
+ THINK_TAGS_MODE = "think" # strip: 去除<details>标签;think: 转为<think>标签;raw: 保留原样
53
+
54
+ # 伪装前端头部
55
+ X_FE_VERSION = "prod-fe-1.0.70"
56
+ BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/139.0.0.0 Safari/537.36 Edg/139.0.0.0"
57
+ SEC_CH_UA = '"Not;A=Brand";v="99", "Microsoft Edge";v="139", "Chromium";v="139"'
58
+ SEC_CH_UA_MOB = "?0"
59
+ SEC_CH_UA_PLAT = '"Windows"'
60
+ ORIGIN_BASE = "https://chat.z.ai"
61
+
62
+ # 匿名token开关
63
+ ANON_TOKEN_ENABLED = True
64
+
65
+
66
+ # 数据结构定义
67
+ class Message(BaseModel):
68
+ role: str
69
+ content: str
70
+ reasoning_content: Optional[str] = None
71
+
72
+
73
+ class OpenAIRequest(BaseModel):
74
+ model: str
75
+ messages: List[Message]
76
+ stream: Optional[bool] = False
77
+ temperature: Optional[float] = None
78
+ max_tokens: Optional[int] = None
79
+
80
+
81
+ class ModelItem(BaseModel):
82
+ id: str
83
+ name: str
84
+ owned_by: str
85
+
86
+
87
+ class UpstreamRequest(BaseModel):
88
+ stream: bool
89
+ model: str
90
+ messages: List[Message]
91
+ params: Dict[str, Any] = {}
92
+ features: Dict[str, Any] = {}
93
+ background_tasks: Optional[Dict[str, bool]] = None
94
+ chat_id: Optional[str] = None
95
+ id: Optional[str] = None
96
+ mcp_servers: Optional[List[str]] = None
97
+ model_item: Optional[ModelItem] = None
98
+ tool_servers: Optional[List[str]] = None
99
+ variables: Optional[Dict[str, str]] = None
100
+ model_config = {'protected_namespaces': ()}
101
+
102
+
103
+ class Delta(BaseModel):
104
+ role: Optional[str] = None
105
+ content: Optional[str] = None
106
+ reasoning_content: Optional[str] = None
107
+
108
+
109
+ class Choice(BaseModel):
110
+ index: int
111
+ message: Optional[Message] = None
112
+ delta: Optional[Delta] = None
113
+ finish_reason: Optional[str] = None
114
+
115
+
116
+ class Usage(BaseModel):
117
+ prompt_tokens: int = 0
118
+ completion_tokens: int = 0
119
+ total_tokens: int = 0
120
+
121
+
122
+ class OpenAIResponse(BaseModel):
123
+ id: str
124
+ object: str
125
+ created: int
126
+ model: str
127
+ choices: List[Choice]
128
+ usage: Optional[Usage] = None
129
+
130
+
131
+ class UpstreamError(BaseModel):
132
+ detail: str
133
+ code: int
134
+
135
+
136
+ class UpstreamDataInner(BaseModel):
137
+ error: Optional[UpstreamError] = None
138
+
139
+
140
+ class UpstreamDataData(BaseModel):
141
+ delta_content: str = ""
142
+ edit_content: str = ""
143
+ phase: str = ""
144
+ done: bool = False
145
+ usage: Optional[Usage] = None
146
+ error: Optional[UpstreamError] = None
147
+ inner: Optional[UpstreamDataInner] = None
148
+
149
+
150
+ class UpstreamData(BaseModel):
151
+ type: str
152
+ data: UpstreamDataData
153
+ error: Optional[UpstreamError] = None
154
+
155
+
156
+ class Model(BaseModel):
157
+ id: str
158
+ object: str = "model"
159
+ created: int
160
+ owned_by: str
161
+
162
+
163
+ class ModelsResponse(BaseModel):
164
+ object: str = "list"
165
+ data: List[Model]
166
+
167
+
168
+ # FastAPI应用
169
+ app = FastAPI()
170
+
171
+
172
+ # 调试日志函数
173
+ def debug_log(format_str: str, *args):
174
+ if DEBUG_MODE:
175
+ print(f"[DEBUG] {format_str % args}")
176
+
177
+
178
+ # 获取匿名token
179
+ async def get_anonymous_token() -> str:
180
+ """获取匿名token(每次对话使用不同token,避免共享记忆)"""
181
+ async with httpx.AsyncClient(timeout=10.0) as client:
182
+ headers = {
183
+ "User-Agent": BROWSER_UA,
184
+ "Accept": "*/*",
185
+ "Accept-Language": "zh-CN,zh;q=0.9",
186
+ "X-FE-Version": X_FE_VERSION,
187
+ "sec-ch-ua": SEC_CH_UA,
188
+ "sec-ch-ua-mobile": SEC_CH_UA_MOB,
189
+ "sec-ch-ua-platform": SEC_CH_UA_PLAT,
190
+ "Origin": ORIGIN_BASE,
191
+ "Referer": f"{ORIGIN_BASE}/",
192
+ }
193
+
194
+ response = await client.get(f"{ORIGIN_BASE}/api/v1/auths/", headers=headers)
195
+
196
+ if response.status_code != 200:
197
+ raise Exception(f"anon token status={response.status_code}")
198
+
199
+ data = response.json()
200
+ token = data.get("token")
201
+ if not token:
202
+ raise Exception("anon token empty")
203
+
204
+ return token
205
+
206
+
207
+ # CORS中间件
208
+ @app.middleware("http")
209
+ async def add_cors_headers(request: Request, call_next):
210
+ response = await call_next(request)
211
+ response.headers["Access-Control-Allow-Origin"] = "*"
212
+ response.headers["Access-Control-Allow-Methods"] = "GET, POST, PUT, DELETE, OPTIONS"
213
+ response.headers["Access-Control-Allow-Headers"] = "Content-Type, Authorization"
214
+ response.headers["Access-Control-Allow-Credentials"] = "true"
215
+ return response
216
+
217
+
218
+ # OPTIONS处理器
219
+ @app.options("/")
220
+ async def handle_options():
221
+ return Response(status_code=200)
222
+
223
+
224
+ # 模型列表接口
225
+ @app.get("/v1/models")
226
+ async def handle_models():
227
+ response = ModelsResponse(
228
+ data=[
229
+ Model(
230
+ id=DEFAULT_MODEL_NAME,
231
+ created=int(time.time()),
232
+ owned_by="z.ai"
233
+ ),
234
+ Model(
235
+ id=THINKING_MODEL_NAME,
236
+ created=int(time.time()),
237
+ owned_by="z.ai"
238
+ ),
239
+ Model(
240
+ id=SEARCH_MODEL_NAME,
241
+ created=int(time.time()),
242
+ owned_by="z.ai"
243
+ ),
244
+ ]
245
+ )
246
+ return response
247
+
248
+
249
+ # 聊天完成接口
250
+ @app.post("/v1/chat/completions")
251
+ async def handle_chat_completions(
252
+ request: OpenAIRequest,
253
+ authorization: str = Header(...)
254
+ ):
255
+ debug_log("收到chat completions请求")
256
+
257
+ # 验证API Key
258
+ if not authorization.startswith("Bearer "):
259
+ debug_log("缺少或无效的Authorization头")
260
+ raise HTTPException(status_code=401, detail="Missing or invalid Authorization header")
261
+
262
+ api_key = authorization[7:] # 去掉"Bearer "
263
+ if api_key != DEFAULT_KEY:
264
+ debug_log(f"无效的API key: {api_key}")
265
+ raise HTTPException(status_code=401, detail="Invalid API key")
266
+
267
+ debug_log("API key验证通过")
268
+ debug_log(f"请求解析成功 - 模型: {request.model}, 流式: {request.stream}, 消息数: {len(request.messages)}")
269
+
270
+ # 生成会话相关ID
271
+ chat_id = f"{int(time.time() * 1000)}-{int(time.time())}"
272
+ msg_id = str(int(time.time() * 1000000))
273
+
274
+ # 确定模型特性
275
+ is_thinking = request.model == THINKING_MODEL_NAME
276
+ is_search = request.model == SEARCH_MODEL_NAME
277
+ search_mcp = "deep-web-search" if is_search else ""
278
+
279
+ # 构造上游请求
280
+ upstream_req = UpstreamRequest(
281
+ stream=True, # 总是使用流式从上游获取
282
+ chat_id=chat_id,
283
+ id=msg_id,
284
+ model="0727-360B-API", # 上游实际模型ID
285
+ messages=request.messages,
286
+ params={},
287
+ features={
288
+ "enable_thinking": is_thinking,
289
+ "web_search": is_search,
290
+ "auto_web_search": is_search,
291
+ },
292
+ background_tasks={
293
+ "title_generation": False,
294
+ "tags_generation": False,
295
+ },
296
+ mcp_servers=[search_mcp] if search_mcp else [],
297
+ model_item=ModelItem(
298
+ id="0727-360B-API",
299
+ name="GLM-4.5",
300
+ owned_by="openai"
301
+ ),
302
+ tool_servers=[],
303
+ variables={
304
+ "{{USER_NAME}}": "User",
305
+ "{{USER_LOCATION}}": "Unknown",
306
+ "{{CURRENT_DATETIME}}": datetime.now().strftime("%Y-%m-%d %H:%M:%S"),
307
+ }
308
+ )
309
+
310
+ # 选择本次对话使用的token
311
+ auth_token = UPSTREAM_TOKEN
312
+ if ANON_TOKEN_ENABLED:
313
+ try:
314
+ token = await get_anonymous_token()
315
+ auth_token = token
316
+ debug_log(f"匿名token获取成功: {token[:10]}...")
317
+ except Exception as e:
318
+ debug_log(f"匿名token获取失败,回退固定token: {e}")
319
+
320
+ # 调用上游API
321
+ if request.stream:
322
+ return StreamingResponse(
323
+ handle_stream_response(upstream_req, chat_id, auth_token),
324
+ media_type="text/event-stream",
325
+ headers={
326
+ "Cache-Control": "no-cache",
327
+ "Connection": "keep-alive",
328
+ }
329
+ )
330
+ else:
331
+ return await handle_non_stream_response(upstream_req, chat_id, auth_token)
332
+
333
+
334
+ async def call_upstream_with_headers(upstream_req: UpstreamRequest, referer_chat_id: str, auth_token: str) -> httpx.Response:
335
+ """调用上游API"""
336
+ headers = {
337
+ "Content-Type": "application/json",
338
+ "Accept": "application/json, text/event-stream",
339
+ "User-Agent": BROWSER_UA,
340
+ "Authorization": f"Bearer {auth_token}",
341
+ "Accept-Language": "zh-CN",
342
+ "sec-ch-ua": SEC_CH_UA,
343
+ "sec-ch-ua-mobile": SEC_CH_UA_MOB,
344
+ "sec-ch-ua-platform": SEC_CH_UA_PLAT,
345
+ "X-FE-Version": X_FE_VERSION,
346
+ "Origin": ORIGIN_BASE,
347
+ "Referer": f"{ORIGIN_BASE}/c/{referer_chat_id}",
348
+ }
349
+
350
+ debug_log(f"调用上游API: {UPSTREAM_URL}")
351
+ debug_log(f"上游请求体: {upstream_req.model_dump_json()}")
352
+
353
+ async with httpx.AsyncClient(timeout=60.0) as client:
354
+ response = await client.post(
355
+ UPSTREAM_URL,
356
+ json=upstream_req.model_dump(exclude_none=True),
357
+ headers=headers
358
+ )
359
+
360
+ debug_log(f"上游响应状态: {response.status_code}")
361
+ return response
362
+
363
+
364
+ def transform_thinking(s: str) -> str:
365
+ """转换思考内容"""
366
+ # 去 <summary>…</summary>
367
+ s = re.sub(r'(?s)<summary>.*?</summary>', '', s)
368
+ # 清理残留自定义标签
369
+ s = s.replace("</thinking>", "").replace("<Full>", "").replace("</Full>", "")
370
+ s = s.strip()
371
+
372
+ if THINK_TAGS_MODE == "think":
373
+ s = re.sub(r'<details[^>]*>', '<think>', s)
374
+ s = s.replace("</details>", "</think>")
375
+ elif THINK_TAGS_MODE == "strip":
376
+ s = re.sub(r'<details[^>]*>', '', s)
377
+ s = s.replace("</details>", "")
378
+
379
+ # 处理每行前缀 "> "
380
+ s = s.lstrip("> ")
381
+ s = s.replace("\n> ", "\n")
382
+ return s.strip()
383
+
384
+
385
+ async def handle_stream_response(upstream_req: UpstreamRequest, chat_id: str, auth_token: str) -> AsyncGenerator[str, None]:
386
+ """处理流式响应"""
387
+ debug_log(f"开始处理流式响应 (chat_id={chat_id})")
388
+
389
+ try:
390
+ response = await call_upstream_with_headers(upstream_req, chat_id, auth_token)
391
+ except Exception as e:
392
+ debug_log(f"调用上游失败: {e}")
393
+ yield "data: {\"error\": \"Failed to call upstream\"}\n\n"
394
+ return
395
+
396
+ if response.status_code != 200:
397
+ debug_log(f"上游返回错误状态: {response.status_code}")
398
+ if DEBUG_MODE:
399
+ debug_log(f"上游错误响应: {response.text}")
400
+ yield "data: {\"error\": \"Upstream error\"}\n\n"
401
+ return
402
+
403
+ # 发送第一个chunk(role)
404
+ first_chunk = OpenAIResponse(
405
+ id=f"chatcmpl-{int(time.time())}",
406
+ object="chat.completion.chunk",
407
+ created=int(time.time()),
408
+ model=DEFAULT_MODEL_NAME,
409
+ choices=[Choice(
410
+ index=0,
411
+ delta=Delta(role="assistant")
412
+ )]
413
+ )
414
+ yield f"data: {first_chunk.model_dump_json()}\n\n"
415
+
416
+ # 读取上游SSE流
417
+ debug_log("开始读取上游SSE流")
418
+ line_count = 0
419
+ sent_initial_answer = False
420
+
421
+ async for line in response.aiter_lines():
422
+ line_count += 1
423
+
424
+ if not line.startswith("data: "):
425
+ continue
426
+
427
+ data_str = line[6:] # 去掉 "data: "
428
+ if not data_str:
429
+ continue
430
+
431
+ debug_log(f"收到SSE数据 (第{line_count}行): {data_str}")
432
+
433
+ try:
434
+ upstream_data = UpstreamData.model_validate_json(data_str)
435
+ except Exception as e:
436
+ debug_log(f"SSE数据解析失败: {e}")
437
+ continue
438
+
439
+ # 错误检测
440
+ if (upstream_data.error or
441
+ upstream_data.data.error or
442
+ (upstream_data.data.inner and upstream_data.data.inner.error)):
443
+
444
+ err_obj = upstream_data.error or upstream_data.data.error
445
+ if not err_obj and upstream_data.data.inner:
446
+ err_obj = upstream_data.data.inner.error
447
+
448
+ debug_log(f"上游错误: code={err_obj.code}, detail={err_obj.detail}")
449
+
450
+ # 结束下游流
451
+ end_chunk = OpenAIResponse(
452
+ id=f"chatcmpl-{int(time.time())}",
453
+ object="chat.completion.chunk",
454
+ created=int(time.time()),
455
+ model=DEFAULT_MODEL_NAME,
456
+ choices=[Choice(
457
+ index=0,
458
+ delta=Delta(),
459
+ finish_reason="stop"
460
+ )]
461
+ )
462
+ yield f"data: {end_chunk.model_dump_json()}\n\n"
463
+ yield "data: [DONE]\n\n"
464
+ break
465
+
466
+ debug_log(f"解析成功 - 类型: {upstream_data.type}, 阶段: {upstream_data.data.phase}, "
467
+ f"内容长度: {len(upstream_data.data.delta_content)}, 完成: {upstream_data.data.done}")
468
+
469
+ # 处理EditContent在最初的answer信息(只发送一次)
470
+ if (not sent_initial_answer and
471
+ upstream_data.data.edit_content and
472
+ upstream_data.data.phase == "answer"):
473
+
474
+ out = upstream_data.data.edit_content
475
+ if out:
476
+ parts = out.split("</details>")
477
+ if len(parts) > 1:
478
+ content = parts[1]
479
+ if content:
480
+ debug_log(f"发送普通内容: {content}")
481
+ chunk = OpenAIResponse(
482
+ id=f"chatcmpl-{int(time.time())}",
483
+ object="chat.completion.chunk",
484
+ created=int(time.time()),
485
+ model=DEFAULT_MODEL_NAME,
486
+ choices=[Choice(
487
+ index=0,
488
+ delta=Delta(content=content)
489
+ )]
490
+ )
491
+ yield f"data: {chunk.model_dump_json()}\n\n"
492
+ sent_initial_answer = True
493
+
494
+ # 处理DeltaContent
495
+ if upstream_data.data.delta_content:
496
+ out = upstream_data.data.delta_content
497
+
498
+ if upstream_data.data.phase == "thinking":
499
+ out = transform_thinking(out)
500
+ # 思考内容使用 reasoning_content 字段
501
+ if out:
502
+ debug_log(f"发送思考内容: {out}")
503
+ chunk = OpenAIResponse(
504
+ id=f"chatcmpl-{int(time.time())}",
505
+ object="chat.completion.chunk",
506
+ created=int(time.time()),
507
+ model=DEFAULT_MODEL_NAME,
508
+ choices=[Choice(
509
+ index=0,
510
+ delta=Delta(reasoning_content=out)
511
+ )]
512
+ )
513
+ yield f"data: {chunk.model_dump_json()}\n\n"
514
+ else:
515
+ # 普通内容使用 content 字段
516
+ if out:
517
+ debug_log(f"发送普通内容: {out}")
518
+ chunk = OpenAIResponse(
519
+ id=f"chatcmpl-{int(time.time())}",
520
+ object="chat.completion.chunk",
521
+ created=int(time.time()),
522
+ model=DEFAULT_MODEL_NAME,
523
+ choices=[Choice(
524
+ index=0,
525
+ delta=Delta(content=out)
526
+ )]
527
+ )
528
+ yield f"data: {chunk.model_dump_json()}\n\n"
529
+
530
+ # 检查是否结束
531
+ if upstream_data.data.done or upstream_data.data.phase == "done":
532
+ debug_log("检测到流结束信号")
533
+
534
+ # 发送结束chunk
535
+ end_chunk = OpenAIResponse(
536
+ id=f"chatcmpl-{int(time.time())}",
537
+ object="chat.completion.chunk",
538
+ created=int(time.time()),
539
+ model=DEFAULT_MODEL_NAME,
540
+ choices=[Choice(
541
+ index=0,
542
+ delta=Delta(),
543
+ finish_reason="stop"
544
+ )]
545
+ )
546
+ yield f"data: {end_chunk.model_dump_json()}\n\n"
547
+ yield "data: [DONE]\n\n"
548
+ debug_log(f"流式响应完成,共处理{line_count}行")
549
+ break
550
+
551
+
552
+ async def handle_non_stream_response(upstream_req: UpstreamRequest, chat_id: str, auth_token: str) -> JSONResponse:
553
+ """处理非流式响应"""
554
+ debug_log(f"开始处理非流式响应 (chat_id={chat_id})")
555
+
556
+ try:
557
+ response = await call_upstream_with_headers(upstream_req, chat_id, auth_token)
558
+ except Exception as e:
559
+ debug_log(f"调用上游失败: {e}")
560
+ raise HTTPException(status_code=502, detail="Failed to call upstream")
561
+
562
+ if response.status_code != 200:
563
+ debug_log(f"上游返回错误状态: {response.status_code}")
564
+ if DEBUG_MODE:
565
+ debug_log(f"上游错误响应: {response.text}")
566
+ raise HTTPException(status_code=502, detail="Upstream error")
567
+
568
+ # 收集完整响应
569
+ full_content = []
570
+ debug_log("开始收集完整响应内容")
571
+
572
+ async for line in response.aiter_lines():
573
+ if not line.startswith("data: "):
574
+ continue
575
+
576
+ data_str = line[6:]
577
+ if not data_str:
578
+ continue
579
+
580
+ try:
581
+ upstream_data = UpstreamData.model_validate_json(data_str)
582
+ except Exception:
583
+ continue
584
+
585
+ if upstream_data.data.delta_content:
586
+ out = upstream_data.data.delta_content
587
+
588
+ if upstream_data.data.phase == "thinking":
589
+ out = transform_thinking(out)
590
+
591
+ if out:
592
+ full_content.append(out)
593
+
594
+ if upstream_data.data.done or upstream_data.data.phase == "done":
595
+ debug_log("检测到完成信号,停止收集")
596
+ break
597
+
598
+ final_content = "".join(full_content)
599
+ debug_log(f"内容收集完成,最终长度: {len(final_content)}")
600
+
601
+ # 构造完整响应
602
+ response_data = OpenAIResponse(
603
+ id=f"chatcmpl-{int(time.time())}",
604
+ object="chat.completion",
605
+ created=int(time.time()),
606
+ model=DEFAULT_MODEL_NAME,
607
+ choices=[Choice(
608
+ index=0,
609
+ message=Message(
610
+ role="assistant",
611
+ content=final_content
612
+ ),
613
+ finish_reason="stop"
614
+ )],
615
+ usage=Usage()
616
+ )
617
+
618
+ debug_log("非流式响应发送完成")
619
+ return JSONResponse(content=response_data.model_dump(exclude_none=True))
620
+
621
+
622
+ # 根路径处理器
623
+ @app.get("/")
624
+ async def root():
625
+ return {"message": "OpenAI Compatible API Server"}
626
+
627
+
628
+ if __name__ == "__main__":
629
+ import uvicorn
630
+ uvicorn.run("main:app", host="0.0.0.0", port=PORT, reload=True)
pyproject.toml ADDED
@@ -0,0 +1,63 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ [build-system]
2
+ requires = ["hatchling"]
3
+ build-backend = "hatchling.build"
4
+
5
+ [project]
6
+ name = "z-ai2api-python"
7
+ version = "0.1.0"
8
+ description = "一个为 Z.ai 提供 OpenAI API 兼容接口的 Python 代理服务"
9
+ readme = "README.md"
10
+ requires-python = ">=3.8"
11
+ license = {text = "MIT"}
12
+ authors = [
13
+ {name = "Contributors"}
14
+ ]
15
+ classifiers = [
16
+ "Development Status :: 4 - Beta",
17
+ "Intended Audience :: Developers",
18
+ "License :: OSI Approved :: MIT License",
19
+ "Operating System :: OS Independent",
20
+ "Programming Language :: Python :: 3",
21
+ "Programming Language :: Python :: 3.8",
22
+ "Programming Language :: Python :: 3.9",
23
+ "Programming Language :: Python :: 3.10",
24
+ "Programming Language :: Python :: 3.11",
25
+ "Programming Language :: Python :: 3.12",
26
+ "Topic :: Internet :: WWW/HTTP :: HTTP Servers",
27
+ "Topic :: Software Development :: Libraries :: Python Modules",
28
+ ]
29
+ dependencies = [
30
+ "fastapi==0.104.1",
31
+ "uvicorn[standard]==0.24.0",
32
+ "httpx==0.25.2",
33
+ "pydantic==2.5.0",
34
+ ]
35
+
36
+ [project.scripts]
37
+ z-ai2api = "main:app"
38
+
39
+ [tool.hatch.build.targets.wheel]
40
+ packages = ["."]
41
+
42
+ [tool.uv]
43
+ dev-dependencies = [
44
+ "pytest>=7.0.0",
45
+ "pytest-asyncio>=0.21.0",
46
+ "httpx>=0.25.0",
47
+ "ruff>=0.1.0",
48
+ ]
49
+
50
+ [tool.ruff]
51
+ line-length = 88
52
+ target-version = "py38"
53
+ select = ["E", "F", "I", "B"]
54
+ ignore = []
55
+
56
+ [tool.ruff.isort]
57
+ known-first-party = []
58
+
59
+ [tool.pytest.ini_options]
60
+ asyncio_mode = "auto"
61
+ testpaths = ["tests"]
62
+ python_files = ["test_*.py"]
63
+ python_functions = ["test_*"]
requirements.txt ADDED
@@ -0,0 +1,4 @@
 
 
 
 
 
1
+ fastapi==0.104.1
2
+ uvicorn[standard]==0.24.0
3
+ httpx==0.25.2
4
+ pydantic==2.5.0
uv.lock ADDED
The diff for this file is too large to render. See raw diff