icebear0828 Claude Opus 4.6 committed
Commit 44b20f4 · 1 Parent(s): 9206ebc

fix: strip service_tier from request body + add OpenAI-Beta header + update gpt-5.4 efforts


- The Codex backend rejects `service_tier` in the request body ("Unsupported service_tier: fast").
The Desktop app handles Fast mode at the app-server level, not via the API.
The proxy now strips `service_tier` before sending to the backend.
- Add OpenAI-Beta: responses_websockets=2026-02-06 header (matching Desktop binary).
- Update gpt-5.4 reasoning efforts: minimal→none, add xhigh (matching backend).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Files changed (3)
  1. CHANGELOG.md +4 -1
  2. config/models.yaml +5 -4
  3. src/proxy/codex-api.ts +8 -1
CHANGELOG.md CHANGED
@@ -10,6 +10,9 @@
 
 - README Docker quick start: add the `cp .env.example .env` step, fixing `docker compose up -d` failing for new users who lack a `.env` file (#38)
 - Recognize the `response.output_item.done`, `response.incomplete`, and `response.queued` Codex SSE events, eliminating "Unknown event" log noise
+- Strip the `service_tier` field: the Codex backend rejects `service_tier` in the request body, so the proxy now removes it before sending, fixing the "Unsupported service_tier" error triggered by the `-fast` suffix
+- Update gpt-5.4 reasoning efforts: `minimal` → `none`, add `xhigh` (aligned with the values the backend actually supports)
+- Add the `OpenAI-Beta` request header, matching Codex Desktop (`responses_websockets=2026-02-06`)
 - Streaming SSE requests no longer set a `--max-time` wall-clock timeout, fixing reasoning/thinking chains being cut off at 60 seconds; connection protection comes from the header timeout + AbortSignal, and timeouts for non-streaming requests (models, usage) are unaffected
 
 ### Added
@@ -17,7 +20,7 @@
 - `/v1/responses` endpoint: Codex Responses API passthrough with no format conversion; supports raw SSE event streams and multi-account load balancing
 
 - Model-name suffix system: embed reasoning effort and speed mode in the model name (e.g. `gpt-5.4-high-fast`), so CLI tools (Claude Code, opencode, etc.) can control reasoning effort and Fast mode without extra parameters
-- `service_tier` support: accept the `service_tier` field ("fast" / "flex") in the API request body, or set it automatically via the `-fast` model-name suffix
+- `service_tier` parsed from the `-fast`/`-flex` model-name suffixes and kept as proxy-level metadata (the Codex backend rejects `service_tier` as a request-body field; Desktop handles it at the app-server level)
 - Dashboard Speed toggle: Standard / Fast toggle buttons below the model selector
 
 - Proxy assignment management page (`#/proxy-settings`): two-column matrix layout for batch-managing proxy assignments across hundreds of accounts
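The suffix system described in the changelog (reasoning effort and speed mode embedded in the model name) could be parsed along these lines. This is a hypothetical sketch, not the project's actual implementation: the function name, return shape, and the assumption that the speed suffix always comes last are all illustrative.

```typescript
// Hypothetical model-name suffix parser (names and structure are illustrative).
const EFFORTS = ["none", "low", "medium", "high", "xhigh"] as const;
type Effort = (typeof EFFORTS)[number];

interface ParsedModel {
  baseModel: string;
  reasoningEffort?: Effort;
  serviceTier?: "fast" | "flex";
}

function parseModelSuffixes(name: string): ParsedModel {
  const parts = name.split("-");
  const result: ParsedModel = { baseModel: name };

  // Speed suffix, if present, is assumed to come last (e.g. gpt-5.4-high-fast).
  const last = parts[parts.length - 1];
  if (last === "fast" || last === "flex") {
    result.serviceTier = last;
    parts.pop();
  }

  // Reasoning-effort suffix precedes the speed suffix.
  const tail = parts[parts.length - 1];
  if ((EFFORTS as readonly string[]).includes(tail)) {
    result.reasoningEffort = tail as Effort;
    parts.pop();
  }

  result.baseModel = parts.join("-");
  return result;
}
```

With this shape, `gpt-5.4-high-fast` splits into base model `gpt-5.4`, effort `high`, and tier `fast`, while a bare `gpt-5.4` passes through untouched.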
config/models.yaml CHANGED
@@ -13,10 +13,11 @@ models:
     description: Latest Codex flagship model
     isDefault: true
     supportedReasoningEfforts:
-      - { reasoningEffort: minimal, description: "Minimal reasoning" }
-      - { reasoningEffort: low, description: "Fastest responses" }
-      - { reasoningEffort: medium, description: "Balanced speed and quality" }
-      - { reasoningEffort: high, description: "Deepest reasoning" }
+      - { reasoningEffort: none, description: "No reasoning" }
+      - { reasoningEffort: low, description: "Fastest responses" }
+      - { reasoningEffort: medium, description: "Balanced speed and quality" }
+      - { reasoningEffort: high, description: "Deepest reasoning" }
+      - { reasoningEffort: xhigh, description: "Extended deep reasoning" }
     defaultReasoningEffort: medium
     inputModalities: [text, image]
     supportsPersonality: true
src/proxy/codex-api.ts CHANGED
@@ -248,11 +248,18 @@ export class CodexApi {
       buildHeadersWithContentType(this.token, this.accountId),
     );
     headers["Accept"] = "text/event-stream";
+    // Codex Desktop sends this beta header to enable newer API features
+    headers["OpenAI-Beta"] = "responses_websockets=2026-02-06";
+
+    // Strip service_tier from body — Codex backend doesn't accept it as a body field.
+    // Desktop app handles Fast mode internally (lighter reasoning), not via the API.
+    const { service_tier: _st, ...bodyWithoutServiceTier } = request;
+    const body = JSON.stringify(bodyWithoutServiceTier);
 
     // No wall-clock timeout for streaming SSE — header timeout + AbortSignal provide protection
     let transportRes;
     try {
-      transportRes = await transport.post(url, headers, JSON.stringify(request), signal, undefined, this.proxyUrl);
+      transportRes = await transport.post(url, headers, body, signal, undefined, this.proxyUrl);
     } catch (err) {
       const msg = err instanceof Error ? err.message : String(err);
       throw new CodexApiError(0, msg);
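The rest-destructuring pattern used in the diff above can be sketched in isolation. This is a minimal standalone sketch: the `ResponsesRequest` shape and the `stripServiceTier` helper name are assumptions for illustration, not the project's actual types.

```typescript
// Minimal sketch of the service_tier strip (request shape is an assumption).
interface ResponsesRequest {
  model: string;
  service_tier?: "fast" | "flex";
  [key: string]: unknown;
}

// Rest-destructuring copies every property except service_tier into a new
// object, so the caller's request object is left unmodified.
function stripServiceTier(request: ResponsesRequest): Record<string, unknown> {
  const { service_tier: _st, ...rest } = request;
  return rest;
}
```

One property of this approach worth noting: because it builds a shallow copy rather than using `delete`, the original request (which the proxy may still need for suffix metadata) keeps its `service_tier` value.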