icebear (icebear0828) and Claude Opus 4.6 committed
Commit 5416ffb · unverified · 1 parent: 2041133

feat: model name suffix system + service_tier + Docker persistence fix (#35)


* feat: model name suffix system + service_tier (fast mode) support

Add a unified model name suffix convention so CLI tools (Claude Code,
opencode, etc.) can control reasoning effort and speed mode purely
through the model name — no extra API fields needed.

Suffix parsing (`parseModelName`):
- `-fast` / `-flex` → `service_tier`
- `-minimal` / `-low` / `-medium` / `-high` / `-xhigh` → `reasoning_effort`
- Known model IDs and aliases are never suffix-parsed
- Priority: explicit API field > suffix > model default > config default

Backend:
- `parseModelName()` in model-store.ts (core suffix resolver)
- `service_tier` field in Zod schema + CodexResponsesRequest interface
- `default_service_tier` in config schema + default.yaml
- All 3 translation layers (OpenAI/Anthropic/Gemini) use parseModelName()

Dashboard:
- Standard / Fast speed toggle in ApiConfig
- AnthropicSetup shows compound model name with suffixes
- CodeExamples builds compound model name, suppresses redundant params

Examples: `gpt-5.4-fast`, `gpt-5.4-high-fast`, `codex-fast`

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: Docker data persistence — login/cookies not saved

Root cause: VOLUME declared before chown (changes discarded by Docker),
bind-mount directories auto-created as root:root, USER node can't write.

Fix: replace static USER node with entrypoint that chowns mounted volumes
then drops to node via gosu.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: icebear0828 <icebear0828@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

.gitattributes ADDED
@@ -0,0 +1 @@
+ *.sh eol=lf
CHANGELOG.md CHANGED
@@ -12,6 +12,10 @@

  ### Added

+ - Model name suffix system: embed the reasoning level and speed mode in the model name (e.g. `gpt-5.4-high-fast`), so CLI tools (Claude Code, opencode, etc.) can control reasoning effort and Fast mode without extra parameters
+ - `service_tier` support: accepts the `service_tier` field ("fast" / "flex") in the API request body, or sets it automatically from the `-fast` model name suffix
+ - Dashboard Speed toggle: new Standard / Fast speed toggle buttons below the model selector
+
  - Proxy assignment management page (`#/proxy-settings`): two-column matrix layout for bulk-managing proxy assignments across hundreds of accounts
  - Left column, proxy group list: grouped by Global/Direct/Auto/individual proxy with count badges; click to filter
  - Right column, account table: search, status filter, pagination (50 per page), Shift+click range selection, per-row proxy dropdown
Dockerfile CHANGED
@@ -2,8 +2,9 @@ FROM node:20-slim

  # curl: needed by setup-curl.ts and full-update.ts
  # unzip: needed by full-update.ts to extract Codex.app
+ # gosu: needed by entrypoint to drop from root to node user
  RUN apt-get update && \
-     apt-get install -y --no-install-recommends curl unzip ca-certificates && \
+     apt-get install -y --no-install-recommends curl unzip ca-certificates gosu && \
      rm -rf /var/lib/apt/lists/*

  WORKDIR /app
@@ -30,14 +31,16 @@ RUN cd web && npm run build && cd .. && npx tsc
  # 5) Prune dev deps, re-add tsx (needed at runtime by update-checker fork())
  RUN npm prune --omit=dev && npm install --no-save tsx

- VOLUME /app/data
  EXPOSE 8080

- # Ensure writable directories for non-root user
- RUN mkdir -p /app/data && chown -R node:node /app/data /app/config
+ # Ensure data dir exists in the image (bind mount may override at runtime)
+ RUN mkdir -p /app/data
+
+ COPY docker-entrypoint.sh /
+ RUN chmod +x /docker-entrypoint.sh

- USER node
  HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
      CMD curl -fs http://localhost:8080/health || exit 1

+ ENTRYPOINT ["/docker-entrypoint.sh"]
  CMD ["node", "dist/index.js"]
README.md CHANGED
@@ -175,6 +175,9 @@ curl http://localhost:8080/v1/chat/completions \
  | `gpt-5.1-codex-max` | `codex-max` | low / medium / high | Deep reasoning coding model |
  | `gpt-5.1-codex-mini` | `codex-mini` | low / medium / high | Lightweight, fast coding model |

+ > **Model name suffixes**: append `-fast` to any model name to enable Fast mode, or `-high`/`-low` etc. to change the reasoning level.
+ > Examples: `gpt-5.4-fast`, `gpt-5.4-high-fast`, `codex-fast`.
+ >
  > The model list syncs automatically with Codex Desktop version updates. The backend also dynamically fetches the latest model catalog.

  ## 🔗 Client Setup
@@ -186,18 +189,22 @@ curl http://localhost:8080/v1/chat/completions \
  ```bash
  export ANTHROPIC_BASE_URL=http://localhost:8080
  export ANTHROPIC_API_KEY=your-api-key
- export ANTHROPIC_MODEL=claude-opus-4-6 # Opus → gpt-5.4 (default)
- # export ANTHROPIC_MODEL=claude-sonnet-4-6 # Sonnet → gpt-5.3-codex
+ # Default Opus 4.6 → gpt-5.4, no need to set ANTHROPIC_MODEL
+ # To switch models or enable suffixes:
+ # export ANTHROPIC_MODEL=codex-fast # → gpt-5.4 + Fast mode
+ # export ANTHROPIC_MODEL=gpt-5.4-high # → gpt-5.4 + high reasoning
+ # export ANTHROPIC_MODEL=gpt-5.4-high-fast # → gpt-5.4 + high + Fast
+ # export ANTHROPIC_MODEL=claude-sonnet-4-6 # Sonnet → gpt-5.3-codex
  # export ANTHROPIC_MODEL=claude-haiku-4-5-20251001 # Haiku → gpt-5.1-codex-mini

  claude # Launch Claude Code
  ```

- | Claude Code model | Maps to Codex model |
- |------------------|------------------|
- | Opus (`claude-opus-4-6`) | `gpt-5.4` |
- | Sonnet (`claude-sonnet-4-6`) | `gpt-5.3-codex` |
- | Haiku (`claude-haiku-4-5-20251001`) | `gpt-5.1-codex-mini` |
+ | Claude Code model | Maps to Codex model | Notes |
+ |------------------|------------------|------|
+ | Opus (default) | `gpt-5.4` | No need to set `ANTHROPIC_MODEL` |
+ | Sonnet (`claude-sonnet-4-6`) | `gpt-5.3-codex` | |
+ | Haiku (`claude-haiku-4-5-20251001`) | `gpt-5.1-codex-mini` | |

  > You can also copy the environment variables with one click from the **Anthropic SDK Setup** card in the dashboard (`http://localhost:8080`).

@@ -278,7 +285,7 @@ for await (const chunk of stream) {
  | `server` | `host`, `port`, `proxy_api_key` | Listen address and API key (see notes below) |
  | `api` | `base_url`, `timeout_seconds` | Upstream API URL and request timeout |
  | `client` | `app_version`, `build_number`, `chromium_version` | Codex Desktop version and Chromium version to impersonate |
- | `model` | `default`, `default_reasoning_effort` | Default model and reasoning effort |
+ | `model` | `default`, `default_reasoning_effort`, `default_service_tier` | Default model, reasoning effort, and speed mode |
  | `auth` | `rotation_strategy`, `rate_limit_backoff_seconds` | Rotation strategy and rate-limit backoff |
  | `tls` | `curl_binary`, `impersonate_profile`, `proxy_url` | TLS impersonation and proxy configuration |

 
README_EN.md CHANGED
@@ -175,6 +175,9 @@ curl http://localhost:8080/v1/chat/completions \
  | `gpt-5.1-codex-max` | `codex-max` | low / medium / high | Deep reasoning coding model |
  | `gpt-5.1-codex-mini` | `codex-mini` | low / medium / high | Lightweight, fast coding model |

+ > **Model name suffixes**: Append `-fast` to any model name to enable Fast mode, or `-high`/`-low` etc. to change reasoning effort.
+ > Examples: `gpt-5.4-fast`, `gpt-5.4-high-fast`, `codex-fast`.
+ >
  > Models are automatically synced when new Codex Desktop versions are released. The backend also dynamically fetches the latest model catalog.

  ## 🔗 Client Setup
@@ -186,18 +189,22 @@ Set environment variables to route Claude Code through codex-proxy:
  ```bash
  export ANTHROPIC_BASE_URL=http://localhost:8080
  export ANTHROPIC_API_KEY=your-api-key
- export ANTHROPIC_MODEL=claude-opus-4-6 # Opus → gpt-5.4 (default)
- # export ANTHROPIC_MODEL=claude-sonnet-4-6 # Sonnet gpt-5.3-codex
+ # Default Opus 4.6 → gpt-5.4, no need to set ANTHROPIC_MODEL
+ # To switch models or use suffixes:
+ # export ANTHROPIC_MODEL=codex-fast # → gpt-5.4 + Fast mode
+ # export ANTHROPIC_MODEL=gpt-5.4-high # → gpt-5.4 + high reasoning
+ # export ANTHROPIC_MODEL=gpt-5.4-high-fast # → gpt-5.4 + high + Fast
+ # export ANTHROPIC_MODEL=claude-sonnet-4-6 # Sonnet → gpt-5.3-codex
  # export ANTHROPIC_MODEL=claude-haiku-4-5-20251001 # Haiku → gpt-5.1-codex-mini

  claude # Launch Claude Code
  ```

- | Claude Code Model | Maps to Codex Model |
- |-------------------|---------------------|
- | Opus (`claude-opus-4-6`) | `gpt-5.4` |
- | Sonnet (`claude-sonnet-4-6`) | `gpt-5.3-codex` |
- | Haiku (`claude-haiku-4-5-20251001`) | `gpt-5.1-codex-mini` |
+ | Claude Code Model | Maps to Codex Model | Notes |
+ |-------------------|---------------------|-------|
+ | Opus (default) | `gpt-5.4` | No need to set `ANTHROPIC_MODEL` |
+ | Sonnet (`claude-sonnet-4-6`) | `gpt-5.3-codex` | |
+ | Haiku (`claude-haiku-4-5-20251001`) | `gpt-5.1-codex-mini` | |

  > You can also copy environment variables from the **Anthropic SDK Setup** card in the dashboard (`http://localhost:8080`).

@@ -278,7 +285,7 @@ All configuration is in `config/default.yaml`:
  | `server` | `host`, `port`, `proxy_api_key` | Listen address and API key |
  | `api` | `base_url`, `timeout_seconds` | Upstream API URL and timeout |
  | `client_identity` | `app_version`, `build_number` | Codex Desktop version to impersonate |
- | `model` | `default`, `default_reasoning_effort` | Default model and reasoning effort |
+ | `model` | `default`, `default_reasoning_effort`, `default_service_tier` | Default model, reasoning effort and speed mode |
  | `auth` | `rotation_strategy`, `rate_limit_backoff_seconds` | Rotation strategy and rate limit backoff |

  ### Environment Variable Overrides
config/default.yaml CHANGED
@@ -11,6 +11,7 @@ client:
  model:
    default: gpt-5.4
    default_reasoning_effort: medium
+   default_service_tier: null
    suppress_desktop_directives: true
  auth:
    jwt_token: null
docker-entrypoint.sh ADDED
@@ -0,0 +1,9 @@
+ #!/bin/sh
+ set -e
+
+ # Ensure mounted volumes are writable by the node user (UID 1000).
+ # When Docker auto-creates bind-mount directories on the host,
+ # they default to root:root — the node user can't write to them.
+ chown -R node:node /app/data /app/config 2>/dev/null || true
+
+ exec gosu node "$@"
shared/hooks/use-status.ts CHANGED
@@ -48,6 +48,7 @@ export function useStatus(accountCount: number) {
    const [selectedModel, setSelectedModel] = useState("gpt-5.4");
    const [modelCatalog, setModelCatalog] = useState<CatalogModel[]>([]);
    const [selectedEffort, setSelectedEffort] = useState("medium");
+   const [selectedSpeed, setSelectedSpeed] = useState<string | null>(null);

    const loadModels = useCallback(async () => {
      try {
@@ -114,6 +115,8 @@ export function useStatus(accountCount: number) {
      setSelectedModel,
      selectedEffort,
      setSelectedEffort,
+     selectedSpeed,
+     setSelectedSpeed,
      modelFamilies,
      modelCatalog,
    };
shared/i18n/translations.ts CHANGED
@@ -138,6 +138,9 @@ export const translations = {
      exportSuccess: "Export complete",
      importFile: "Select JSON file",
      downloadJson: "Download JSON",
+     speed: "Speed",
+     speedStandard: "Standard",
+     speedFast: "Fast",
    },
    zh: {
      serverOnline: "\u670d\u52a1\u8fd0\u884c\u4e2d",
@@ -281,6 +284,9 @@ export const translations = {
      exportSuccess: "\u5bfc\u51fa\u6210\u529f",
      importFile: "\u9009\u62e9 JSON \u6587\u4ef6",
      downloadJson: "\u4e0b\u8f7d JSON",
+     speed: "\u901f\u5ea6",
+     speedStandard: "\u6807\u51c6",
+     speedFast: "\u5feb\u901f",
    },
  } as const;

 
src/config.ts CHANGED
@@ -22,6 +22,7 @@ const ConfigSchema = z.object({
    model: z.object({
      default: z.string().default("gpt-5.2-codex"),
      default_reasoning_effort: z.string().default("medium"),
+     default_service_tier: z.string().nullable().default(null),
      suppress_desktop_directives: z.boolean().default(true),
    }),
    auth: z.object({
src/models/model-store.ts CHANGED
@@ -186,6 +186,63 @@ export function applyBackendModels(backendModels: BackendModelEntry[]): void {
    );
  }

+ // ── Model name suffix parsing ───────────────────────────────────────
+
+ export interface ParsedModelName {
+   modelId: string;
+   serviceTier: string | null;
+   reasoningEffort: string | null;
+ }
+
+ const SERVICE_TIER_SUFFIXES = new Set(["fast", "flex"]);
+ const EFFORT_SUFFIXES = new Set(["minimal", "low", "medium", "high", "xhigh"]);
+
+ /**
+  * Parse a model name that may contain embedded suffixes for service_tier and reasoning_effort.
+  *
+  * Resolution:
+  * 1. If full name is a known model ID or alias → use as-is
+  * 2. Otherwise, strip known suffixes from right:
+  *    - `-fast`, `-flex` → service_tier
+  *    - `-minimal`, `-low`, `-medium`, `-high`, `-xhigh` → reasoning_effort
+  * 3. Resolve remaining name as model ID/alias
+  */
+ export function parseModelName(input: string): ParsedModelName {
+   const trimmed = input.trim();
+
+   // 1. Known model or alias? Use as-is
+   if (_aliases[trimmed] || _catalog.some((m) => m.id === trimmed)) {
+     return { modelId: resolveModelId(trimmed), serviceTier: null, reasoningEffort: null };
+   }
+
+   // 2. Try stripping suffixes from right
+   let remaining = trimmed;
+   let serviceTier: string | null = null;
+   let reasoningEffort: string | null = null;
+
+   // Strip -fast/-flex (rightmost)
+   for (const tier of SERVICE_TIER_SUFFIXES) {
+     if (remaining.endsWith(`-${tier}`)) {
+       serviceTier = tier;
+       remaining = remaining.slice(0, -(tier.length + 1));
+       break;
+     }
+   }
+
+   // Strip -high/-low/etc (next from right)
+   for (const effort of EFFORT_SUFFIXES) {
+     if (remaining.endsWith(`-${effort}`)) {
+       reasoningEffort = effort;
+       remaining = remaining.slice(0, -(effort.length + 1));
+       break;
+     }
+   }
+
+   // 3. Resolve remaining as model
+   const modelId = resolveModelId(remaining);
+   return { modelId, serviceTier, reasoningEffort };
+ }
+
  // ── Getters ────────────────────────────────────────────────────────

  /**
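The resolution order described in the `parseModelName` doc comment can be traced with a small standalone sketch. It is not the project's implementation: `KNOWN_IDS`, `ALIASES`, `resolve`, `stripSuffix`, and `parseSketch` are hypothetical stand-ins, since the real catalog, alias table, and `resolveModelId` live in model-store.ts and are not shown in this hunk. The `codex` → `gpt-5.4` alias is taken from the README example in this commit.

```typescript
// Hypothetical catalog and alias table (assumptions for illustration only).
const KNOWN_IDS = new Set<string>(["gpt-5.4", "gpt-5.3-codex", "gpt-5.1-codex-mini"]);
const ALIASES = new Map<string, string>([
  ["codex", "gpt-5.4"],
  ["codex-mini", "gpt-5.1-codex-mini"],
]);

const TIERS = ["fast", "flex"];
const EFFORTS = ["minimal", "low", "medium", "high", "xhigh"];

interface ParsedSketch {
  modelId: string;
  serviceTier: string | null;
  reasoningEffort: string | null;
}

// Stand-in for resolveModelId(): map aliases, pass every other name through unchanged.
function resolve(name: string): string {
  return ALIASES.get(name) ?? name;
}

// Remove one trailing "-<suffix>" if present; return the shortened name and the match.
function stripSuffix(name: string, suffixes: string[]): [string, string | null] {
  for (const s of suffixes) {
    if (name.endsWith(`-${s}`)) {
      return [name.slice(0, -(s.length + 1)), s];
    }
  }
  return [name, null];
}

function parseSketch(input: string): ParsedSketch {
  const trimmed = input.trim();

  // Step 1: a known ID or alias is never suffix-parsed.
  if (ALIASES.has(trimmed) || KNOWN_IDS.has(trimmed)) {
    return { modelId: resolve(trimmed), serviceTier: null, reasoningEffort: null };
  }

  // Step 2: the tier suffix is stripped first (rightmost), then the effort suffix.
  const [afterTier, serviceTier] = stripSuffix(trimmed, TIERS);
  const [base, reasoningEffort] = stripSuffix(afterTier, EFFORTS);

  // Step 3: whatever remains is resolved as a model ID or alias.
  return { modelId: resolve(base), serviceTier, reasoningEffort };
}

console.log(parseSketch("gpt-5.4-high-fast"));
console.log(parseSketch("codex-fast"));
console.log(parseSketch("gpt-5.4-fast-high"));
```

Under these assumptions `gpt-5.4-high-fast` yields model `gpt-5.4`, effort `high`, and tier `fast`. Because the tier is stripped before the effort, the reversed order `gpt-5.4-fast-high` yields only effort `high` and leaves `gpt-5.4-fast` as the model name; what the real `resolveModelId` does with that leftover name is not visible in this diff.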
src/proxy/codex-api.ts CHANGED
@@ -28,6 +28,8 @@ export interface CodexResponsesRequest {
    store: false;
    /** Optional: reasoning effort + summary mode */
    reasoning?: { effort?: string; summary?: string };
+   /** Optional: service tier ("fast" / "flex") */
+   service_tier?: string | null;
    /** Optional: tools available to the model */
    tools?: unknown[];
    /** Optional: tool choice strategy */
src/translation/anthropic-to-codex.ts CHANGED
@@ -8,7 +8,7 @@ import type {
    CodexInputItem,
    CodexContentPart,
  } from "../proxy/codex-api.js";
- import { resolveModelId, getModelInfo } from "../models/model-store.js";
+ import { parseModelName, getModelInfo } from "../models/model-store.js";
  import { getConfig } from "../config.js";
  import { buildInstructions, budgetToEffort } from "./shared-utils.js";
  import { anthropicToolsToCodex, anthropicToolChoiceToCodex } from "./tool-format.js";
@@ -180,8 +180,9 @@ export function translateAnthropicToCodexRequest(
      input.push({ role: "user", content: "" });
    }

-   // Resolve model
-   const modelId = resolveModelId(req.model);
+   // Resolve model (suffix parsing extracts service_tier and reasoning_effort)
+   const parsed = parseModelName(req.model);
+   const modelId = parsed.modelId;
    const modelInfo = getModelInfo(modelId);
    const config = getConfig();

@@ -204,13 +205,23 @@ export function translateAnthropicToCodexRequest(
      request.tool_choice = codexToolChoice;
    }

-   // Always request reasoning summary (translation layer filters output on demand)
+   // Reasoning effort: thinking config > suffix > model default > config default
    const thinkingEffort = mapThinkingToEffort(req.thinking);
    const effort =
      thinkingEffort ??
+     parsed.reasoningEffort ??
      modelInfo?.defaultReasoningEffort ??
      config.model.default_reasoning_effort;
    request.reasoning = { summary: "auto", ...(effort ? { effort } : {}) };

+   // Service tier: suffix > config default
+   const serviceTier =
+     parsed.serviceTier ??
+     config.model.default_service_tier ??
+     null;
+   if (serviceTier) {
+     request.service_tier = serviceTier;
+   }
+
    return request;
  }
src/translation/gemini-to-codex.ts CHANGED
@@ -12,7 +12,7 @@ import type {
    CodexInputItem,
    CodexContentPart,
  } from "../proxy/codex-api.js";
- import { resolveModelId, getModelInfo } from "../models/model-store.js";
+ import { parseModelName, getModelInfo } from "../models/model-store.js";
  import { getConfig } from "../config.js";
  import { buildInstructions, budgetToEffort } from "./shared-utils.js";
  import { geminiToolsToCodex, geminiToolConfigToCodex } from "./tool-format.js";
@@ -187,8 +187,9 @@ export function translateGeminiToCodexRequest(
      input.push({ role: "user", content: "" });
    }

-   // Resolve model
-   const modelId = resolveModelId(geminiModel);
+   // Resolve model (suffix parsing extracts service_tier and reasoning_effort)
+   const parsed = parseModelName(geminiModel);
+   const modelId = parsed.modelId;
    const modelInfo = getModelInfo(modelId);
    const config = getConfig();

@@ -211,15 +212,25 @@ export function translateGeminiToCodexRequest(
      request.tool_choice = codexToolChoice;
    }

-   // Always request reasoning summary (translation layer filters output on demand)
+   // Reasoning effort: thinking config > suffix > model default > config default
    const thinkingEffort = budgetToEffort(
      req.generationConfig?.thinkingConfig?.thinkingBudget,
    );
    const effort =
      thinkingEffort ??
+     parsed.reasoningEffort ??
      modelInfo?.defaultReasoningEffort ??
      config.model.default_reasoning_effort;
    request.reasoning = { summary: "auto", ...(effort ? { effort } : {}) };

+   // Service tier: suffix > config default
+   const serviceTier =
+     parsed.serviceTier ??
+     config.model.default_service_tier ??
+     null;
+   if (serviceTier) {
+     request.service_tier = serviceTier;
+   }
+
    return request;
  }
src/translation/openai-to-codex.ts CHANGED
@@ -8,7 +8,7 @@ import type {
    CodexInputItem,
    CodexContentPart,
  } from "../proxy/codex-api.js";
- import { resolveModelId, getModelInfo } from "../models/model-store.js";
+ import { parseModelName, getModelInfo } from "../models/model-store.js";
  import { getConfig } from "../config.js";
  import { buildInstructions } from "./shared-utils.js";
  import {
@@ -145,8 +145,9 @@ export function translateToCodexRequest(
      input.push({ role: "user", content: "" });
    }

-   // Resolve model
-   const modelId = resolveModelId(req.model);
+   // Resolve model (suffix parsing extracts service_tier and reasoning_effort)
+   const parsed = parseModelName(req.model);
+   const modelId = parsed.modelId;
    const modelInfo = getModelInfo(modelId);
    const config = getConfig();

@@ -173,12 +174,23 @@ export function translateToCodexRequest(
      request.tool_choice = codexToolChoice;
    }

-   // Always request reasoning summary (translation layer filters output on demand)
+   // Reasoning effort: explicit API field > suffix > model default > config default
    const effort =
      req.reasoning_effort ??
+     parsed.reasoningEffort ??
      modelInfo?.defaultReasoningEffort ??
      config.model.default_reasoning_effort;
    request.reasoning = { summary: "auto", ...(effort ? { effort } : {}) };

+   // Service tier: explicit API field > suffix > config default
+   const serviceTier =
+     req.service_tier ??
+     parsed.serviceTier ??
+     config.model.default_service_tier ??
+     null;
+   if (serviceTier) {
+     request.service_tier = serviceTier;
+   }
+
    return request;
  }
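The service tier precedence in the OpenAI translation layer (explicit API field, then suffix, then config default) is a plain `??` chain, and a short sketch makes one consequence easier to see. `pickServiceTier` is a hypothetical helper written for this illustration, not a function from the commit; it only mirrors the expression shown in the hunk above.

```typescript
// Hypothetical helper mirroring the `??` chain in translateToCodexRequest.
// `??` skips both null and undefined, so each level only wins if it holds a value.
function pickServiceTier(
  explicit: string | null | undefined, // req.service_tier from the request body
  suffix: string | null,               // parsed.serviceTier from the model name
  configDefault: string | null,        // config.model.default_service_tier
): string | null {
  return explicit ?? suffix ?? configDefault ?? null;
}

// Explicit field wins over the suffix.
console.log(pickServiceTier("flex", "fast", null));
// An explicit null is treated like a missing field, so the suffix applies.
console.log(pickServiceTier(null, "fast", null));
// With no field and no suffix, the config default applies.
console.log(pickServiceTier(undefined, null, "flex"));
// Nothing set anywhere: no service_tier is forwarded.
console.log(pickServiceTier(undefined, null, null));
```

Since the request schema accepts `service_tier: null` and `??` treats `null` as absent, a client that sends an explicit `null` does not cancel a `-fast` suffix or a configured default; it falls through to them.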
src/types/openai.ts CHANGED
@@ -45,6 +45,7 @@ export const ChatCompletionRequestSchema = z.object({
    user: z.string().optional(),
    // Codex-specific extensions
    reasoning_effort: z.enum(["low", "medium", "high", "xhigh"]).optional(),
+   service_tier: z.enum(["fast", "flex"]).nullable().optional(),
    // New tool format (accepted for compatibility, not forwarded to Codex)
    tools: z.array(z.object({
      type: z.literal("function"),
web/src/App.tsx CHANGED
@@ -106,16 +106,21 @@ function Dashboard() {
            modelFamilies={status.modelFamilies}
            selectedEffort={status.selectedEffort}
            onEffortChange={status.setSelectedEffort}
+           selectedSpeed={status.selectedSpeed}
+           onSpeedChange={status.setSelectedSpeed}
          />
          <AnthropicSetup
            apiKey={status.apiKey}
            selectedModel={status.selectedModel}
+           reasoningEffort={status.selectedEffort}
+           serviceTier={status.selectedSpeed}
          />
          <CodeExamples
            baseUrl={status.baseUrl}
            apiKey={status.apiKey}
            model={status.selectedModel}
            reasoningEffort={status.selectedEffort}
+           serviceTier={status.selectedSpeed}
          />
        </div>
      </main>
web/src/components/AnthropicSetup.tsx CHANGED
@@ -5,18 +5,28 @@ import { CopyButton } from "./CopyButton";
  interface AnthropicSetupProps {
    apiKey: string;
    selectedModel: string;
+   reasoningEffort: string;
+   serviceTier: string | null;
  }

- export function AnthropicSetup({ apiKey, selectedModel }: AnthropicSetupProps) {
+ export function AnthropicSetup({ apiKey, selectedModel, reasoningEffort, serviceTier }: AnthropicSetupProps) {
    const t = useT();

    const origin = typeof window !== "undefined" ? window.location.origin : "http://localhost:8080";

+   // Build compound model name with suffixes
+   const displayModel = useMemo(() => {
+     let name = selectedModel;
+     if (reasoningEffort && reasoningEffort !== "medium") name += `-${reasoningEffort}`;
+     if (serviceTier === "fast") name += "-fast";
+     return name;
+   }, [selectedModel, reasoningEffort, serviceTier]);
+
    const envLines = useMemo(() => ({
      ANTHROPIC_BASE_URL: origin,
      ANTHROPIC_API_KEY: apiKey,
-     ANTHROPIC_MODEL: selectedModel,
-   }), [origin, apiKey, selectedModel]);
+     ANTHROPIC_MODEL: displayModel,
+   }), [origin, apiKey, displayModel]);

    const allEnvText = useMemo(
      () => Object.entries(envLines).map(([k, v]) => `${k}=${v}`).join("\n"),
web/src/components/ApiConfig.tsx CHANGED
@@ -12,6 +12,8 @@ interface ApiConfigProps {
    modelFamilies: ModelFamily[];
    selectedEffort: string;
    onEffortChange: (effort: string) => void;
+   selectedSpeed: string | null;
+   onSpeedChange: (speed: string | null) => void;
  }

  const EFFORT_LABELS: Record<string, string> = {
@@ -32,6 +34,8 @@ export function ApiConfig({
    modelFamilies,
    selectedEffort,
    onEffortChange,
+   selectedSpeed,
+   onSpeedChange,
  }: ApiConfigProps) {
    const t = useT();

@@ -151,6 +155,30 @@ export function ApiConfig({
          ))}
        </div>
      )}
+     {/* Speed toggle — Standard / Fast */}
+     <div class="flex items-center gap-1.5 mt-2">
+       <span class="text-[0.68rem] font-medium text-slate-500 dark:text-text-dim mr-1">{t("speed")}</span>
+       <button
+         onClick={() => onSpeedChange(null)}
+         class={`px-2.5 py-1 text-[0.7rem] font-semibold rounded transition-all ${
+           selectedSpeed === null
+             ? "bg-primary text-white shadow-sm"
+             : "bg-white dark:bg-[#21262d] text-slate-600 dark:text-text-dim border border-gray-200 dark:border-border-dark hover:border-primary/50"
+         }`}
+       >
+         {t("speedStandard")}
+       </button>
+       <button
+         onClick={() => onSpeedChange("fast")}
+         class={`px-2.5 py-1 text-[0.7rem] font-semibold rounded transition-all ${
+           selectedSpeed === "fast"
+             ? "bg-primary text-white shadow-sm"
+             : "bg-white dark:bg-[#21262d] text-slate-600 dark:text-text-dim border border-gray-200 dark:border-border-dark hover:border-primary/50"
+         }`}
+       >
+         {t("speedFast")}
+       </button>
+     </div>
    </div>
  ) : (
    <div class="relative">
web/src/components/CodeExamples.tsx CHANGED
@@ -148,17 +148,29 @@ interface CodeExamplesProps {
    apiKey: string;
    model: string;
    reasoningEffort: string;
+   serviceTier: string | null;
  }

- export function CodeExamples({ baseUrl, apiKey, model, reasoningEffort }: CodeExamplesProps) {
+ export function CodeExamples({ baseUrl, apiKey, model, reasoningEffort, serviceTier }: CodeExamplesProps) {
    const t = useT();
    const [protocol, setProtocol] = useState<Protocol>("openai");
    const [codeLang, setCodeLang] = useState<CodeLang>("python");

    const origin = typeof window !== "undefined" ? window.location.origin : "";
+
+   // Build compound model name with suffixes for CLI users
+   const displayModel = useMemo(() => {
+     let name = model;
+     if (reasoningEffort && reasoningEffort !== "medium") name += `-${reasoningEffort}`;
+     if (serviceTier === "fast") name += "-fast";
+     return name;
+   }, [model, reasoningEffort, serviceTier]);
+
+   // When effort/speed are embedded as suffixes, don't also show separate reasoning_effort param
+   const explicitEffort = displayModel === model ? reasoningEffort : "medium";
    const examples = useMemo(
-     () => buildExamples(baseUrl, apiKey, model, origin, reasoningEffort),
-     [baseUrl, apiKey, model, origin, reasoningEffort]
+     () => buildExamples(baseUrl, apiKey, displayModel, origin, explicitEffort),
+     [baseUrl, apiKey, displayModel, origin, explicitEffort]
    );

    const currentCode = examples[`${protocol}-${codeLang}`] || "Loading...";
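The compound-name logic that AnthropicSetup and CodeExamples each build inside `useMemo` can be read as a pure function. The sketch below extracts it for illustration only; `buildDisplayModel` and `explicitEffortFor` are hypothetical names, and the commit itself keeps the logic inline in both components.

```typescript
// Hypothetical pure-function version of the inline displayModel logic.
// "medium" is treated as the neutral effort and is never written into the name;
// only the "fast" tier is rendered as a suffix.
function buildDisplayModel(model: string, effort: string, tier: string | null): string {
  let name = model;
  if (effort && effort !== "medium") name += `-${effort}`;
  if (tier === "fast") name += "-fast";
  return name;
}

// Mirrors the explicitEffort expression in CodeExamples: once anything is embedded
// in the name, the separate reasoning_effort value passed on is reset to "medium".
function explicitEffortFor(model: string, displayModel: string, effort: string): string {
  return displayModel === model ? effort : "medium";
}

const shown = buildDisplayModel("gpt-5.4", "high", "fast");
console.log(shown); // gpt-5.4-high-fast
console.log(explicitEffortFor("gpt-5.4", shown, "high")); // medium
```

One property worth noting in this reading: the name is left unchanged only when the effort is `medium` (or empty) and the tier is not `fast`, so the generated examples carry the selection either as suffixes or as defaults, not both. A `flex` tier is never rendered, which matches the dashboard toggle offering only Standard and Fast.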