Mayo committed on
Commit b788630 · unverified · 1 Parent(s): b788834

docs: add codex

README.md CHANGED
@@ -39,6 +39,7 @@ Under the hood, Koharu uses [candle](https://github.com/huggingface/candle) and
39
  - Inpainting to remove source lettering from the page
40
  - Translation with local or remote LLM backends
41
  - Advanced text rendering with vertical CJK and RTL support
42
  - Layered PSD export with editable text
43
  - Local HTTP API and MCP server for automation
44
 
@@ -255,6 +256,14 @@ Koharu supports hosted APIs from [OpenAI](https://platform.openai.com/), [Gemini
255
 
256
  Built-in cloud defaults: OpenAI `gpt-5-mini`, Gemini `gemini-3.1-flash-lite-preview`, Claude `claude-haiku-4-5`, and DeepSeek `deepseek-chat`.
257
 
258
  #### Machine Translation Providers
259
 
260
  For pure machine-translation use cases, Koharu also supports [DeepL](https://www.deepl.com/), [Google Cloud Translation](https://cloud.google.com/translate), and [Caiyun](https://fanyi.caiyunapp.com/). These providers translate without an LLM-style chat or system prompt; you provide an API key and Koharu uses the upstream translate endpoint directly.
 
39
  - Inpainting to remove source lettering from the page
40
  - Translation with local or remote LLM backends
41
  - Advanced text rendering with vertical CJK and RTL support
42
+ - Codex image-to-image generation for end-to-end page redraws from a source image and prompt
43
  - Layered PSD export with editable text
44
  - Local HTTP API and MCP server for automation
45
 
 
256
 
257
  Built-in cloud defaults: OpenAI `gpt-5-mini`, Gemini `gemini-3.1-flash-lite-preview`, Claude `claude-haiku-4-5`, and DeepSeek `deepseek-chat`.
258
 
259
+ #### Codex Image-to-Image Generation
260
+
261
+ Koharu can use Codex for end-to-end image-to-image generation. This workflow sends the current source page image plus a user prompt to Codex, then stores the generated image as a rendered page result.
262
+
263
+ This feature requires a ChatGPT account with Codex access. Two-factor authentication must be enabled on the account before device-code login can complete successfully.
264
+
265
+ Codex image generation is useful when you want the model to translate visible text, remove the original lettering, and redraw the page in one pass. Because the image request is processed by the ChatGPT Codex backend, failures can include upstream OpenAI request IDs and may need to be retried.
266
+
267
  #### Machine Translation Providers
268
 
269
  For pure machine-translation use cases, Koharu also supports [DeepL](https://www.deepl.com/), [Google Cloud Translation](https://cloud.google.com/translate), and [Caiyun](https://fanyi.caiyunapp.com/). These providers translate without an LLM-style chat or system prompt; you provide an API key and Koharu uses the upstream translate endpoint directly.
docs/en-US/explanation/models-and-providers.md CHANGED
@@ -112,6 +112,12 @@ Remote providers are configured in **Settings > API Keys**.
112
 
113
  For a step-by-step setup guide for LM Studio, OpenRouter, and similar endpoints, see [Use OpenAI-Compatible APIs](../how-to/use-openai-compatible-api.md).
114
 
115
  ## Choosing between local and remote
116
 
117
  Use local models when you want:
 
112
 
113
  For a step-by-step setup guide for LM Studio, OpenRouter, and similar endpoints, see [Use OpenAI-Compatible APIs](../how-to/use-openai-compatible-api.md).
114
 
115
+ ### Codex image generation
116
+
117
+ Koharu can also use Codex for end-to-end image-to-image generation. Instead of translating text blocks and rendering text locally as separate steps, this workflow sends the source page image and prompt to Codex and receives a generated page image.
118
+
119
+ This is a remote image-generation workflow, not a local model. It requires a ChatGPT account with Codex access and two-factor authentication enabled before device-code login can complete. See [Use Codex Image Generation](../how-to/use-codex-image-generation.md) for usage notes and caveats.
120
+
121
  ## Choosing between local and remote
122
 
123
  Use local models when you want:
docs/en-US/how-to/index.md CHANGED
@@ -13,6 +13,7 @@ How-to guides focus on concrete tasks you may want to complete with Koharu.
13
  - [Run GUI, Headless, and MCP Modes](run-gui-headless-and-mcp.md): local deployment patterns and runtime flags
14
  - [Configure MCP Clients](configure-mcp-clients.md): connect Antigravity, Claude Desktop, or Claude Code to Koharu's local MCP endpoint
15
  - [Use OpenAI-Compatible APIs](use-openai-compatible-api.md): connect LM Studio, OpenRouter, and other OpenAI-style chat-completions endpoints
16
  - [Export Pages and Manage Projects](export-and-manage-projects.md): rendered images, PSD handoff, and page-set management
17
  - [Build From Source](build-from-source.md): local build flow with Bun, Tauri, and platform features
18
  - [Troubleshooting](troubleshooting.md): common startup, download, GPU, pipeline, and connectivity failures
 
13
  - [Run GUI, Headless, and MCP Modes](run-gui-headless-and-mcp.md): local deployment patterns and runtime flags
14
  - [Configure MCP Clients](configure-mcp-clients.md): connect Antigravity, Claude Desktop, or Claude Code to Koharu's local MCP endpoint
15
  - [Use OpenAI-Compatible APIs](use-openai-compatible-api.md): connect LM Studio, OpenRouter, and other OpenAI-style chat-completions endpoints
16
+ - [Use Codex Image Generation](use-codex-image-generation.md): use Codex for end-to-end image-to-image page redraws
17
  - [Export Pages and Manage Projects](export-and-manage-projects.md): rendered images, PSD handoff, and page-set management
18
  - [Build From Source](build-from-source.md): local build flow with Bun, Tauri, and platform features
19
  - [Troubleshooting](troubleshooting.md): common startup, download, GPU, pipeline, and connectivity failures
docs/en-US/how-to/use-codex-image-generation.md ADDED
@@ -0,0 +1,51 @@
1
+ ---
2
+ title: Use Codex Image Generation
3
+ ---
4
+
5
+ # Use Codex Image Generation
6
+
7
+ Koharu can use Codex for end-to-end image-to-image generation. The workflow sends a source page image and a prompt to Codex, then stores the generated image as a rendered page result.
8
+
9
+ ## Requirements
10
+
11
+ - a ChatGPT account with Codex access
12
+ - two-factor authentication enabled on that account
13
+ - network access to OpenAI and ChatGPT services
14
+
15
+ Two-factor authentication is required before device-code login can complete successfully.
16
+
17
+ ## What the feature does
18
+
19
+ Codex image-to-image generation is a full-page redraw workflow. It can use the source image and prompt to:
20
+
21
+ - translate visible text
22
+ - remove original lettering
23
+ - redraw edited regions
24
+ - preserve panel layout, speech bubbles, tone, and composition
25
+ - produce a generated page image in one pass
26
+
27
+ This is separate from Koharu's staged local pipeline, where detection, OCR, inpainting, translation, and rendering run as individual steps. The Codex workflow sends the page image to a remote service and receives a generated image result.
28
+
29
+ ## Prompting
30
+
31
+ Use a prompt that describes the complete page-level result you want. For example:
32
+
33
+ ```text
34
+ Translate all visible text to natural English, remove the original lettering,
35
+ and redraw the page as a clean manga image while preserving the artwork,
36
+ panel layout, speech bubbles, tone, and composition.
37
+ ```
38
+
39
+ For narrower edits, describe the target change and what must be preserved. The model receives the source page image, so the prompt should focus on transformation goals rather than restating every visual detail.
40
+
41
+ ## Privacy and reliability
42
+
43
+ This feature sends the source page image and prompt to the ChatGPT Codex backend. Use the local pipeline instead when you need offline processing or do not want to send page images to a remote provider.
44
+
45
+ Codex image generation depends on OpenAI's upstream service. If generation fails, Koharu surfaces the upstream response text and request ID when available. Retrying can succeed if the failure is transient. Persistent failures may indicate account access, service availability, or backend support limitations for image-generation tool calls.
46
+
47
+ ## When to use it
48
+
49
+ Use Codex image generation when you want a fast end-to-end redraw and are comfortable with a remote model rewriting the final image.
50
+
51
+ Use the local staged pipeline when you want more control over intermediate OCR, cleanup masks, translation text, fonts, and editable output.
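
The retry guidance in "Privacy and reliability" can be sketched as a small wrapper. This is an illustrative sketch only, not Koharu's implementation: `UpstreamError`, its `transient` flag, and `generate_with_retry` are hypothetical names, and how Koharu actually classifies a failure as transient is not described in this commit. The sketch retries transient failures with exponential backoff and re-raises persistent ones immediately, so the upstream message and request ID stay visible to the caller.

```python
import time


class UpstreamError(Exception):
    """Hypothetical error carrying the upstream response text and request ID."""

    def __init__(self, message, request_id=None, transient=True):
        super().__init__(message)
        self.message = message
        self.request_id = request_id  # may be None when upstream sent no ID
        self.transient = transient    # assumption: the caller can classify failures


def generate_with_retry(generate, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call generate() and retry only transient failures, doubling the delay each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return generate()
        except UpstreamError as err:
            # Persistent failures (for example account access) are not retried,
            # and the last attempt re-raises so the request ID reaches the user.
            if not err.transient or attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays. A caller would wrap whatever function performs the remote request, and on final failure report `err.message` together with `err.request_id` when it is present.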
docs/ja-JP/explanation/models-and-providers.md CHANGED
@@ -112,6 +112,12 @@ LLM ベースのプロバイダで現在の組み込み既定値は次の通り
112
 
113
  LM Studio、OpenRouter、類似エンドポイントの具体的な設定手順は [OpenAI 互換 API を使う](../how-to/use-openai-compatible-api.md) を参照してください。
114
 
115
  ## ローカルとリモートをどう選ぶか
116
 
117
  ローカルモデルが向くケース:
 
112
 
113
  LM Studio、OpenRouter、類似エンドポイントの具体的な設定手順は [OpenAI 互換 API を使う](../how-to/use-openai-compatible-api.md) を参照してください。
114
 
115
+ ### Codex 画像生成
116
+
117
+ Koharu は Codex を使ったエンドツーエンドの image-to-image 生成にも対応しています。テキストブロックの翻訳とローカルレンダリングを別々の手順として行う代わりに、このワークフローでは元ページ画像とプロンプトを Codex に送り、生成されたページ画像を受け取ります。
118
+
119
+ これはローカルモデルではなく、リモート画像生成ワークフローです。Codex にアクセスできる ChatGPT アカウントと、デバイスコードログインを完了するための 2 要素認証が必要です。利用上の注意と制限は [Codex 画像生成を使う](../how-to/use-codex-image-generation.md) を参照してください。
120
+
121
  ## ローカルとリモートをどう選ぶか
122
 
123
  ローカルモデルが向くケース:
docs/ja-JP/how-to/index.md CHANGED
@@ -13,6 +13,7 @@ title: ハウツーガイド
13
  - [GUI / Headless / MCP モードを使う](run-gui-headless-and-mcp.md): ローカルでの起動パターンと実行時フラグ
14
  - [MCP クライアントを設定する](configure-mcp-clients.md): Antigravity、Claude Desktop、Claude Code を Koharu のローカル MCP エンドポイントに接続する
15
  - [OpenAI 互換 API を使う](use-openai-compatible-api.md): LM Studio、OpenRouter、その他 OpenAI 形式の chat-completions エンドポイントを接続する
16
  - [ページを書き出し、プロジェクトを管理する](export-and-manage-projects.md): レンダリング済み画像、PSD 引き渡し、ページセット管理
17
  - [ソースからビルドする](build-from-source.md): Bun、Tauri、プラットフォーム機能を使ったローカルビルド手順
18
  - [トラブルシューティング](troubleshooting.md): 起動、ダウンロード、GPU、パイプライン、接続まわりの典型的な問題
 
13
  - [GUI / Headless / MCP モードを使う](run-gui-headless-and-mcp.md): ローカルでの起動パターンと実行時フラグ
14
  - [MCP クライアントを設定する](configure-mcp-clients.md): Antigravity、Claude Desktop、Claude Code を Koharu のローカル MCP エンドポイントに接続する
15
  - [OpenAI 互換 API を使う](use-openai-compatible-api.md): LM Studio、OpenRouter、その他 OpenAI 形式の chat-completions エンドポイントを接続する
16
+ - [Codex 画像生成を使う](use-codex-image-generation.md): Codex で image-to-image のページ全体再描画を行う
17
  - [ページを書き出し、プロジェクトを管理する](export-and-manage-projects.md): レンダリング済み画像、PSD 引き渡し、ページセット管理
18
  - [ソースからビルドする](build-from-source.md): Bun、Tauri、プラットフォーム機能を使ったローカルビルド手順
19
  - [トラブルシューティング](troubleshooting.md): 起動、ダウンロード、GPU、パイプライン、接続まわりの典型的な問題
docs/ja-JP/how-to/use-codex-image-generation.md ADDED
@@ -0,0 +1,51 @@
1
+ ---
2
+ title: Codex 画像生成を使う
3
+ ---
4
+
5
+ # Codex 画像生成を使う
6
+
7
+ Koharu は Codex を使ったエンドツーエンドの image-to-image 生成に対応しています。このワークフローでは、元ページ画像とプロンプトを Codex に送り、生成された画像をレンダリング済みページ結果として保存します。
8
+
9
+ ## 要件
10
+
11
+ - Codex にアクセスできる ChatGPT アカウント
12
+ - そのアカウントで有効化された 2 要素認証
13
+ - OpenAI と ChatGPT サービスへ接続できるネットワーク
14
+
15
+ デバイスコードログインを正常に完了するには、事前に 2 要素認証を有効にしておく必要があります。
16
+
17
+ ## この機能でできること
18
+
19
+ Codex の image-to-image 生成は、ページ全体を描き直すワークフローです。元画像とプロンプトを使って、次のような処理を 1 回で行えます。
20
+
21
+ - 表示されている文字を翻訳する
22
+ - 元の文字を消す
23
+ - 編集された領域を描き直す
24
+ - コマ割り、吹き出し、トーン、構図を保つ
25
+ - 生成済みのページ画像を出力する
26
+
27
+ これは Koharu の段階的なローカルパイプラインとは別の機能です。ローカルパイプラインでは、検出、OCR、インペイント、翻訳、レンダリングを個別のステップとして実行します。Codex ワークフローでは、ページ画像をリモートサービスへ送り、生成画像を受け取ります。
28
+
29
+ ## プロンプト
30
+
31
+ ページ全体としてほしい結果を説明するプロンプトを使ってください。例:
32
+
33
+ ```text
34
+ Translate all visible text to natural English, remove the original lettering,
35
+ and redraw the page as a clean manga image while preserving the artwork,
36
+ panel layout, speech bubbles, tone, and composition.
37
+ ```
38
+
39
+ より狭い編集では、変更したい内容と維持したい要素を明確に書きます。モデルには元ページ画像も渡されるため、プロンプトでは細部をすべて説明するよりも、変換の目的を中心に書くと扱いやすくなります。
40
+
41
+ ## プライバシーと信頼性
42
+
43
+ この機能は、元ページ画像とプロンプトを ChatGPT Codex バックエンドへ送信します。オフライン処理が必要な場合や、ページ画像をリモートプロバイダーへ送信したくない場合は、ローカルパイプラインを使用してください。
44
+
45
+ Codex 画像生成は OpenAI の上流サービスに依存します。生成に失敗した場合、利用可能であれば Koharu は上流の応答テキストとリクエスト ID を表示します。一時的な失敗であれば再試行で成功することがあります。失敗が続く場合は、アカウントのアクセス権、サービスの可用性、または画像生成ツール呼び出しに対するバックエンド側の対応状況が原因の可能性があります。
46
+
47
+ ## 使い分け
48
+
49
+ リモートモデルで最終画像を一括生成したい場合は、Codex 画像生成を使います。
50
+
51
+ 中間の OCR、クリーンアップマスク、翻訳テキスト、フォント、編集可能な出力を細かく制御したい場合は、ローカルの段階的パイプラインを使います。
docs/pt-BR/explanation/models-and-providers.md CHANGED
@@ -112,6 +112,12 @@ Os provedores remotos são configurados em **Configurações > Chaves de API**.
112
 
113
  Para um guia passo a passo de configuração para LM Studio, OpenRouter e endpoints similares, veja [Usar APIs Compatíveis com OpenAI](../how-to/use-openai-compatible-api.md).
114
 
115
  ## Escolhendo entre local e remoto
116
 
117
  Use modelos locais quando você quer:
 
112
 
113
  Para um guia passo a passo de configuração para LM Studio, OpenRouter e endpoints similares, veja [Usar APIs Compatíveis com OpenAI](../how-to/use-openai-compatible-api.md).
114
 
115
+ ### Geração de imagem com Codex
116
+
117
+ O Koharu também pode usar o Codex para geração image-to-image de ponta a ponta. Em vez de traduzir blocos de texto e renderizar texto localmente como etapas separadas, esse fluxo envia a imagem de página de origem e o prompt ao Codex e recebe uma imagem de página gerada.
118
+
119
+ Esse é um fluxo remoto de geração de imagem, não um modelo local. Ele exige uma conta ChatGPT com acesso ao Codex e autenticação de dois fatores habilitada para concluir o login por código de dispositivo. Consulte [Usar Geração de Imagem com Codex](../how-to/use-codex-image-generation.md) para notas de uso e limitações.
120
+
121
  ## Escolhendo entre local e remoto
122
 
123
  Use modelos locais quando você quer:
docs/pt-BR/how-to/index.md CHANGED
@@ -13,6 +13,7 @@ Os guias práticos tratam de tarefas concretas que você pode querer realizar co
13
  - [Executar nos Modos GUI, Headless e MCP](run-gui-headless-and-mcp.md): padrões de deploy local e flags de runtime
14
  - [Configurar Clientes MCP](configure-mcp-clients.md): conectar Antigravity, Claude Desktop ou Claude Code ao endpoint MCP local do Koharu
15
  - [Usar APIs Compatíveis com OpenAI](use-openai-compatible-api.md): conectar LM Studio, OpenRouter e outros endpoints de chat-completions no formato OpenAI
16
  - [Exportar Páginas e Gerenciar Projetos](export-and-manage-projects.md): imagens renderizadas, entrega em PSD e gerenciamento de conjuntos de páginas
17
  - [Build a Partir do Código-Fonte](build-from-source.md): fluxo de build local com Bun, Tauri e features de plataforma
18
  - [Troubleshooting](troubleshooting.md): falhas comuns de inicialização, download, GPU, pipeline e conectividade
 
13
  - [Executar nos Modos GUI, Headless e MCP](run-gui-headless-and-mcp.md): padrões de deploy local e flags de runtime
14
  - [Configurar Clientes MCP](configure-mcp-clients.md): conectar Antigravity, Claude Desktop ou Claude Code ao endpoint MCP local do Koharu
15
  - [Usar APIs Compatíveis com OpenAI](use-openai-compatible-api.md): conectar LM Studio, OpenRouter e outros endpoints de chat-completions no formato OpenAI
16
+ - [Usar Geração de Imagem com Codex](use-codex-image-generation.md): usar o Codex para redesenho image-to-image de páginas inteiras
17
  - [Exportar Páginas e Gerenciar Projetos](export-and-manage-projects.md): imagens renderizadas, entrega em PSD e gerenciamento de conjuntos de páginas
18
  - [Build a Partir do Código-Fonte](build-from-source.md): fluxo de build local com Bun, Tauri e features de plataforma
19
  - [Troubleshooting](troubleshooting.md): falhas comuns de inicialização, download, GPU, pipeline e conectividade
docs/pt-BR/how-to/use-codex-image-generation.md ADDED
@@ -0,0 +1,51 @@
1
+ ---
2
+ title: Usar Geração de Imagem com Codex
3
+ ---
4
+
5
+ # Usar Geração de Imagem com Codex
6
+
7
+ O Koharu pode usar o Codex para geração image-to-image de ponta a ponta. Esse fluxo envia uma imagem de página de origem e um prompt ao Codex, depois salva a imagem gerada como resultado renderizado da página.
8
+
9
+ ## Requisitos
10
+
11
+ - uma conta ChatGPT com acesso ao Codex
12
+ - autenticação de dois fatores habilitada nessa conta
13
+ - acesso de rede aos serviços da OpenAI e do ChatGPT
14
+
15
+ A autenticação de dois fatores precisa estar habilitada antes que o login por código de dispositivo possa ser concluído com sucesso.
16
+
17
+ ## O que o recurso faz
18
+
19
+ A geração image-to-image do Codex é um fluxo de redesenho de página inteira. Ela pode usar a imagem de origem e o prompt para:
20
+
21
+ - traduzir o texto visível
22
+ - remover as letras originais
23
+ - redesenhar regiões editadas
24
+ - preservar layout dos painéis, balões, retículas e composição
25
+ - produzir uma imagem de página gerada em uma única passagem
26
+
27
+ Isso é separado do pipeline local em etapas do Koharu, no qual detecção, OCR, inpainting, tradução e renderização rodam como passos individuais. O fluxo do Codex envia a imagem da página para um serviço remoto e recebe uma imagem gerada como resultado.
28
+
29
+ ## Prompt
30
+
31
+ Use um prompt que descreva o resultado final desejado para a página inteira. Por exemplo:
32
+
33
+ ```text
34
+ Translate all visible text to natural English, remove the original lettering,
35
+ and redraw the page as a clean manga image while preserving the artwork,
36
+ panel layout, speech bubbles, tone, and composition.
37
+ ```
38
+
39
+ Para edições mais estreitas, descreva a alteração desejada e o que precisa ser preservado. Como o modelo recebe a imagem da página de origem, o prompt deve focar nos objetivos de transformação em vez de repetir todos os detalhes visuais.
40
+
41
+ ## Privacidade e confiabilidade
42
+
43
+ Esse recurso envia a imagem da página de origem e o prompt ao backend do ChatGPT Codex. Use o pipeline local quando precisar de processamento offline ou não quiser enviar imagens de páginas para um provedor remoto.
44
+
45
+ A geração de imagem do Codex depende do serviço upstream da OpenAI. Se a geração falhar, o Koharu mostra o texto de resposta upstream e o ID da requisição quando disponíveis. Tentar novamente pode resolver falhas transitórias. Falhas persistentes podem indicar limitações de acesso da conta, disponibilidade do serviço ou suporte do backend para chamadas da ferramenta de geração de imagem.
46
+
47
+ ## Quando usar
48
+
49
+ Use a geração de imagem do Codex quando quiser um redesenho de ponta a ponta rápido e aceitar que um modelo remoto reescreva a imagem final.
50
+
51
+ Use o pipeline local em etapas quando quiser mais controle sobre OCR intermediário, máscaras de limpeza, texto traduzido, fontes e saída editável.
docs/zensical.ja-JP.toml CHANGED
@@ -21,6 +21,7 @@ nav = [
21
  "how-to/run-gui-headless-and-mcp.md",
22
  "how-to/configure-mcp-clients.md",
23
  "how-to/use-openai-compatible-api.md",
24
  "how-to/export-and-manage-projects.md",
25
  "how-to/build-from-source.md",
26
  "how-to/troubleshooting.md",
 
21
  "how-to/run-gui-headless-and-mcp.md",
22
  "how-to/configure-mcp-clients.md",
23
  "how-to/use-openai-compatible-api.md",
24
+ "how-to/use-codex-image-generation.md",
25
  "how-to/export-and-manage-projects.md",
26
  "how-to/build-from-source.md",
27
  "how-to/troubleshooting.md",
docs/zensical.pt-BR.toml CHANGED
@@ -21,6 +21,7 @@ nav = [
21
  "how-to/run-gui-headless-and-mcp.md",
22
  "how-to/configure-mcp-clients.md",
23
  "how-to/use-openai-compatible-api.md",
24
  "how-to/export-and-manage-projects.md",
25
  "how-to/build-from-source.md",
26
  "how-to/troubleshooting.md",
 
21
  "how-to/run-gui-headless-and-mcp.md",
22
  "how-to/configure-mcp-clients.md",
23
  "how-to/use-openai-compatible-api.md",
24
+ "how-to/use-codex-image-generation.md",
25
  "how-to/export-and-manage-projects.md",
26
  "how-to/build-from-source.md",
27
  "how-to/troubleshooting.md",
docs/zensical.toml CHANGED
@@ -21,6 +21,7 @@ nav = [
21
  "how-to/run-gui-headless-and-mcp.md",
22
  "how-to/configure-mcp-clients.md",
23
  "how-to/use-openai-compatible-api.md",
24
  "how-to/export-and-manage-projects.md",
25
  "how-to/build-from-source.md",
26
  "how-to/troubleshooting.md",
 
21
  "how-to/run-gui-headless-and-mcp.md",
22
  "how-to/configure-mcp-clients.md",
23
  "how-to/use-openai-compatible-api.md",
24
+ "how-to/use-codex-image-generation.md",
25
  "how-to/export-and-manage-projects.md",
26
  "how-to/build-from-source.md",
27
  "how-to/troubleshooting.md",
docs/zensical.zh-CN.toml CHANGED
@@ -21,6 +21,7 @@ nav = [
21
  "how-to/run-gui-headless-and-mcp.md",
22
  "how-to/configure-mcp-clients.md",
23
  "how-to/use-openai-compatible-api.md",
24
  "how-to/export-and-manage-projects.md",
25
  "how-to/build-from-source.md",
26
  "how-to/troubleshooting.md",
 
21
  "how-to/run-gui-headless-and-mcp.md",
22
  "how-to/configure-mcp-clients.md",
23
  "how-to/use-openai-compatible-api.md",
24
+ "how-to/use-codex-image-generation.md",
25
  "how-to/export-and-manage-projects.md",
26
  "how-to/build-from-source.md",
27
  "how-to/troubleshooting.md",
docs/zh-CN/explanation/models-and-providers.md CHANGED
@@ -112,6 +112,12 @@ LLM 驱动提供商当前内置的默认模型如下:
112
 
113
  如果你需要 LM Studio、OpenRouter 或类似端点的逐步配置说明,请参见 [使用 OpenAI 兼容 API](../how-to/use-openai-compatible-api.md)。
114
 
115
  ## 如何在本地与远程之间选择
116
 
117
  以下情况更适合本地模型:
 
112
 
113
  如果你需要 LM Studio、OpenRouter 或类似端点的逐步配置说明,请参见 [使用 OpenAI 兼容 API](../how-to/use-openai-compatible-api.md)。
114
 
115
+ ### Codex 图像生成
116
+
117
+ Koharu 也可以使用 Codex 进行端到端 image-to-image 生成。它不会把文本块翻译和本地文字渲染作为独立步骤处理,而是把源页面图像和提示词发送给 Codex,并接收生成后的页面图像。
118
+
119
+ 这是远程图像生成流程,不是本地模型。它需要拥有 Codex 访问权限的 ChatGPT 账号,并且必须启用双重身份验证才能完成设备码登录。使用说明和注意事项见 [使用 Codex 图像生成](../how-to/use-codex-image-generation.md)。
120
+
121
  ## 如何在本地与远程之间选择
122
 
123
  以下情况更适合本地模型:
docs/zh-CN/how-to/index.md CHANGED
@@ -13,6 +13,7 @@ title: 操作指南
13
  - [以 GUI、Headless 与 MCP 模式运行](run-gui-headless-and-mcp.md):本地运行模式与运行时参数
14
  - [配置 MCP 客户端](configure-mcp-clients.md):把 Antigravity、Claude Desktop 或 Claude Code 接到本地 MCP 端点
15
  - [使用 OpenAI 兼容 API](use-openai-compatible-api.md):连接 LM Studio、OpenRouter 与其他 OpenAI 风格的接口
16
  - [导出页面与管理项目](export-and-manage-projects.md):渲染图、PSD 交接与页面集管理
17
  - [从源码构建](build-from-source.md):使用 Bun、Tauri 与平台特性的本地构建流程
18
  - [故障排查](troubleshooting.md):启动、下载、GPU、流水线与连接问题的常见排查方法
 
13
  - [以 GUI、Headless 与 MCP 模式运行](run-gui-headless-and-mcp.md):本地运行模式与运行时参数
14
  - [配置 MCP 客户端](configure-mcp-clients.md):把 Antigravity、Claude Desktop 或 Claude Code 接到本地 MCP 端点
15
  - [使用 OpenAI 兼容 API](use-openai-compatible-api.md):连接 LM Studio、OpenRouter 与其他 OpenAI 风格的接口
16
+ - [使用 Codex 图像生成](use-codex-image-generation.md):使用 Codex 进行端到端 image-to-image 页面重绘
17
  - [导出页面与管理项目](export-and-manage-projects.md):渲染图、PSD 交接与页面集管理
18
  - [从源码构建](build-from-source.md):使用 Bun、Tauri 与平台特性的本地构建流程
19
  - [故障排查](troubleshooting.md):启动、下载、GPU、流水线与连接问题的常见排查方法
docs/zh-CN/how-to/use-codex-image-generation.md ADDED
@@ -0,0 +1,51 @@
1
+ ---
2
+ title: 使用 Codex 图像生成
3
+ ---
4
+
5
+ # 使用 Codex 图像生成
6
+
7
+ Koharu 可以使用 Codex 进行端到端的 image-to-image 生成。这个流程会把源页面图像和提示词发送给 Codex,然后把生成出的图像保存为渲染后的页面结果。
8
+
9
+ ## 要求
10
+
11
+ - 拥有 Codex 访问权限的 ChatGPT 账号
12
+ - 已为该账号启用双重身份验证
13
+ - 能够访问 OpenAI 和 ChatGPT 服务的网络连接
14
+
15
+ 设备码登录要成功完成,必须先在账号上启用双重身份验证。
16
+
17
+ ## 这个功能会做什么
18
+
19
+ Codex image-to-image 生成是一个整页重绘流程。它可以根据源图像和提示词完成:
20
+
21
+ - 翻译可见文字
22
+ - 移除原始字稿
23
+ - 重绘被编辑的区域
24
+ - 保留分镜、气泡、网点和构图
25
+ - 一次生成完整页面图像
26
+
27
+ 这不同于 Koharu 的本地分阶段流水线。本地流水线会把检测、OCR、修复、翻译和渲染拆成独立步骤执行;Codex 流程会把页面图像发送到远程服务,并接收生成后的图像结果。
28
+
29
+ ## 提示词
30
+
31
+ 请用提示词描述你希望得到的整页结果。例如:
32
+
33
+ ```text
34
+ Translate all visible text to natural English, remove the original lettering,
35
+ and redraw the page as a clean manga image while preserving the artwork,
36
+ panel layout, speech bubbles, tone, and composition.
37
+ ```
38
+
39
+ 如果只想做更窄范围的编辑,请说明目标修改以及必须保留的内容。模型会收到源页面图像,所以提示词应重点描述转换目标,而不是重新列出每个视觉细节。
40
+
41
+ ## 隐私与可靠性
42
+
43
+ 这个功能会把源页面图像和提示词发送到 ChatGPT Codex 后端。如果你需要离线处理,或不希望把页面图像发送给远程提供商,请使用本地流水线。
44
+
45
+ Codex 图像生成依赖 OpenAI 的上游服务。生成失败时,如果上游返回了响应文本和请求 ID,Koharu 会将其显示出来。临时故障有时可以通过重试解决。持续失败可能与账号访问权限、服务可用性,或后端对图像生成工具调用的支持状态有关。
46
+
47
+ ## 何时使用
48
+
49
+ 当你希望用远程模型快速完成整页重绘,并接受模型改写最终图像时,可以使用 Codex 图像生成。
50
+
51
+ 当你需要更细致地控制中间 OCR、清理遮罩、翻译文本、字体和可编辑输出时,请使用本地分阶段流水线。