Mayo committed on
docs: add codex
- README.md +9 -0
- docs/en-US/explanation/models-and-providers.md +6 -0
- docs/en-US/how-to/index.md +1 -0
- docs/en-US/how-to/use-codex-image-generation.md +51 -0
- docs/ja-JP/explanation/models-and-providers.md +6 -0
- docs/ja-JP/how-to/index.md +1 -0
- docs/ja-JP/how-to/use-codex-image-generation.md +51 -0
- docs/pt-BR/explanation/models-and-providers.md +6 -0
- docs/pt-BR/how-to/index.md +1 -0
- docs/pt-BR/how-to/use-codex-image-generation.md +51 -0
- docs/zensical.ja-JP.toml +1 -0
- docs/zensical.pt-BR.toml +1 -0
- docs/zensical.toml +1 -0
- docs/zensical.zh-CN.toml +1 -0
- docs/zh-CN/explanation/models-and-providers.md +6 -0
- docs/zh-CN/how-to/index.md +1 -0
- docs/zh-CN/how-to/use-codex-image-generation.md +51 -0
README.md
CHANGED
|
@@ -39,6 +39,7 @@ Under the hood, Koharu uses [candle](https://github.com/huggingface/candle) and
|
|
| 39 |
- Inpainting to remove source lettering from the page
|
| 40 |
- Translation with local or remote LLM backends
|
| 41 |
- Advanced text rendering with vertical CJK and RTL support
|
|
|
|
| 42 |
- Layered PSD export with editable text
|
| 43 |
- Local HTTP API and MCP server for automation
|
| 44 |
|
|
@@ -255,6 +256,14 @@ Koharu supports hosted APIs from [OpenAI](https://platform.openai.com/), [Gemini
|
|
| 255 |
|
| 256 |
Built-in cloud defaults: OpenAI `gpt-5-mini`, Gemini `gemini-3.1-flash-lite-preview`, Claude `claude-haiku-4-5`, and DeepSeek `deepseek-chat`.
|
| 257 |
|
| 258 |
#### Machine Translation Providers
|
| 259 |
|
| 260 |
For pure machine-translation use cases, Koharu also supports [DeepL](https://www.deepl.com/), [Google Cloud Translation](https://cloud.google.com/translate), and [Caiyun](https://fanyi.caiyunapp.com/). These providers translate without an LLM-style chat or system prompt; you provide an API key and Koharu uses the upstream translate endpoint directly.
|
|
|
|
| 39 |
- Inpainting to remove source lettering from the page
|
| 40 |
- Translation with local or remote LLM backends
|
| 41 |
- Advanced text rendering with vertical CJK and RTL support
|
| 42 |
+
- Codex image-to-image generation for end-to-end page redraws from a source image and prompt
|
| 43 |
- Layered PSD export with editable text
|
| 44 |
- Local HTTP API and MCP server for automation
|
| 45 |
|
|
|
|
| 256 |
|
| 257 |
Built-in cloud defaults: OpenAI `gpt-5-mini`, Gemini `gemini-3.1-flash-lite-preview`, Claude `claude-haiku-4-5`, and DeepSeek `deepseek-chat`.
|
| 258 |
|
| 259 |
+
#### Codex Image-to-Image Generation
|
| 260 |
+
|
| 261 |
+
Koharu can use Codex for end-to-end image-to-image generation. This workflow sends the current source page image plus a user prompt to Codex, then stores the generated image as a rendered page result.
|
| 262 |
+
|
| 263 |
+
This feature requires a ChatGPT account with Codex access. Two-factor authentication must be enabled on the account before device-code login can complete successfully.
|
| 264 |
+
|
| 265 |
+
Codex image generation is useful when you want the model to translate visible text, remove the original lettering, and redraw the page in one pass. Because the ChatGPT Codex backend processes the image request, error messages can include upstream OpenAI request IDs, and a failed request may need to be retried.
|
| 266 |
+
|
| 267 |
#### Machine Translation Providers
|
| 268 |
|
| 269 |
For pure machine-translation use cases, Koharu also supports [DeepL](https://www.deepl.com/), [Google Cloud Translation](https://cloud.google.com/translate), and [Caiyun](https://fanyi.caiyunapp.com/). These providers translate without an LLM-style chat or system prompt; you provide an API key and Koharu uses the upstream translate endpoint directly.
|
docs/en-US/explanation/models-and-providers.md
CHANGED
|
@@ -112,6 +112,12 @@ Remote providers are configured in **Settings > API Keys**.
|
|
| 112 |
|
| 113 |
For a step-by-step setup guide for LM Studio, OpenRouter, and similar endpoints, see [Use OpenAI-Compatible APIs](../how-to/use-openai-compatible-api.md).
|
| 114 |
|
| 115 |
## Choosing between local and remote
|
| 116 |
|
| 117 |
Use local models when you want:
|
|
|
|
| 112 |
|
| 113 |
For a step-by-step setup guide for LM Studio, OpenRouter, and similar endpoints, see [Use OpenAI-Compatible APIs](../how-to/use-openai-compatible-api.md).
|
| 114 |
|
| 115 |
+
### Codex image generation
|
| 116 |
+
|
| 117 |
+
Koharu can also use Codex for end-to-end image-to-image generation. Instead of translating text blocks and rendering text locally as separate steps, this workflow sends the source page image and prompt to Codex and receives a generated page image.
|
| 118 |
+
|
| 119 |
+
This is a remote image-generation workflow, not a local model. It requires a ChatGPT account with Codex access and two-factor authentication enabled before device-code login can complete. See [Use Codex Image Generation](../how-to/use-codex-image-generation.md) for usage notes and caveats.
|
| 120 |
+
|
| 121 |
## Choosing between local and remote
|
| 122 |
|
| 123 |
Use local models when you want:
|
docs/en-US/how-to/index.md
CHANGED
|
@@ -13,6 +13,7 @@ How-to guides focus on concrete tasks you may want to complete with Koharu.
|
|
| 13 |
- [Run GUI, Headless, and MCP Modes](run-gui-headless-and-mcp.md): local deployment patterns and runtime flags
|
| 14 |
- [Configure MCP Clients](configure-mcp-clients.md): connect Antigravity, Claude Desktop, or Claude Code to Koharu's local MCP endpoint
|
| 15 |
- [Use OpenAI-Compatible APIs](use-openai-compatible-api.md): connect LM Studio, OpenRouter, and other OpenAI-style chat-completions endpoints
|
|
|
|
| 16 |
- [Export Pages and Manage Projects](export-and-manage-projects.md): rendered images, PSD handoff, and page-set management
|
| 17 |
- [Build From Source](build-from-source.md): local build flow with Bun, Tauri, and platform features
|
| 18 |
- [Troubleshooting](troubleshooting.md): common startup, download, GPU, pipeline, and connectivity failures
|
|
|
|
| 13 |
- [Run GUI, Headless, and MCP Modes](run-gui-headless-and-mcp.md): local deployment patterns and runtime flags
|
| 14 |
- [Configure MCP Clients](configure-mcp-clients.md): connect Antigravity, Claude Desktop, or Claude Code to Koharu's local MCP endpoint
|
| 15 |
- [Use OpenAI-Compatible APIs](use-openai-compatible-api.md): connect LM Studio, OpenRouter, and other OpenAI-style chat-completions endpoints
|
| 16 |
+
- [Use Codex Image Generation](use-codex-image-generation.md): use Codex for end-to-end image-to-image page redraws
|
| 17 |
- [Export Pages and Manage Projects](export-and-manage-projects.md): rendered images, PSD handoff, and page-set management
|
| 18 |
- [Build From Source](build-from-source.md): local build flow with Bun, Tauri, and platform features
|
| 19 |
- [Troubleshooting](troubleshooting.md): common startup, download, GPU, pipeline, and connectivity failures
|
docs/en-US/how-to/use-codex-image-generation.md
ADDED
|
@@ -0,0 +1,51 @@
|
|
| 1 |
+
---
|
| 2 |
+
title: Use Codex Image Generation
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
# Use Codex Image Generation
|
| 6 |
+
|
| 7 |
+
Koharu can use Codex for end-to-end image-to-image generation. The workflow sends a source page image and a prompt to Codex, then stores the generated image as a rendered page result.
|
| 8 |
+
|
| 9 |
+
## Requirements
|
| 10 |
+
|
| 11 |
+
- a ChatGPT account with Codex access
|
| 12 |
+
- two-factor authentication enabled on that account
|
| 13 |
+
- network access to OpenAI and ChatGPT services
|
| 14 |
+
|
| 15 |
+
Two-factor authentication is required before device-code login can complete successfully.
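For readers unfamiliar with device-code login, the polling side of the generic OAuth 2.0 device authorization grant (RFC 8628) can be sketched as below. This is an illustration only: it assumes the login follows the standard pattern, and `poll_for_token`, the `poll` callable, and the simulated replies are hypothetical names, not Koharu's or OpenAI's actual API.

```python
import time
from typing import Callable


def poll_for_token(
    poll: Callable[[], dict],
    interval: float = 5.0,
    expires_in: float = 900.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll a token endpoint until the user approves the device code.

    `poll` is a hypothetical callable that performs one token request and
    returns the decoded JSON reply.
    """
    waited = 0.0
    while waited < expires_in:
        reply = poll()
        if "access_token" in reply:
            return reply["access_token"]
        error = reply.get("error")
        if error == "slow_down":
            # RFC 8628: increase the polling interval by 5 seconds.
            interval += 5.0
        elif error != "authorization_pending":
            # Any other error (denied, expired, ...) ends the login attempt.
            raise RuntimeError(f"device login failed: {error}")
        sleep(interval)
        waited += interval
    raise TimeoutError("device code expired before login completed")


# Simulated replies stand in for a real token endpoint.
replies = iter([
    {"error": "authorization_pending"},
    {"error": "slow_down"},
    {"access_token": "example-token"},
])
token = poll_for_token(lambda: next(replies), sleep=lambda seconds: None)
print(token)
```

The user completes the sign-in, including the second factor, in a browser while the application polls; the loop only ends once the account-side steps succeed or the code expires.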
|
| 16 |
+
|
| 17 |
+
## What the feature does
|
| 18 |
+
|
| 19 |
+
Codex image-to-image generation is a full-page redraw workflow. It can use the source image and prompt to:
|
| 20 |
+
|
| 21 |
+
- translate visible text
|
| 22 |
+
- remove original lettering
|
| 23 |
+
- redraw edited regions
|
| 24 |
+
- preserve panel layout, speech bubbles, tone, and composition
|
| 25 |
+
- produce a generated page image in one pass
|
| 26 |
+
|
| 27 |
+
This is separate from Koharu's staged local pipeline, where detection, OCR, inpainting, translation, and rendering run as individual steps. The Codex workflow sends the page image to a remote service and receives a generated image result.
|
| 28 |
+
|
| 29 |
+
## Prompting
|
| 30 |
+
|
| 31 |
+
Use a prompt that describes the complete page-level result you want. For example:
|
| 32 |
+
|
| 33 |
+
```text
|
| 34 |
+
Translate all visible text to natural English, remove the original lettering,
|
| 35 |
+
and redraw the page as a clean manga image while preserving the artwork,
|
| 36 |
+
panel layout, speech bubbles, tone, and composition.
|
| 37 |
+
```
|
| 38 |
+
|
| 39 |
+
For narrower edits, describe the target change and what must be preserved. The model receives the source page image, so the prompt should focus on transformation goals rather than restating every visual detail.
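The advice above (state the target change, then list what must be preserved) can be sketched as a small prompt builder. This is a minimal sketch for illustration; `build_redraw_prompt` is a hypothetical helper, not part of Koharu.

```python
def build_redraw_prompt(change: str, preserve: list[str]) -> str:
    """Compose a page-level prompt from a target change and elements to keep."""
    change = change.strip().rstrip(".")
    if not change:
        raise ValueError("describe the change you want")
    if not preserve:
        return change + "."
    if len(preserve) <= 2:
        kept = " and ".join(preserve)
    else:
        kept = ", ".join(preserve[:-1]) + ", and " + preserve[-1]
    return f"{change}, while preserving the {kept}."


prompt = build_redraw_prompt(
    "Translate only the sound effects to English",
    ["artwork", "panel layout", "speech bubbles"],
)
print(prompt)
```

Keeping the change and the preserve list as separate inputs makes it harder to forget the second half of the prompt when making a narrow edit.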
|
| 40 |
+
|
| 41 |
+
## Privacy and reliability
|
| 42 |
+
|
| 43 |
+
This feature sends the source page image and prompt to the ChatGPT Codex backend. Use the local pipeline instead when you need offline processing or do not want to send page images to a remote provider.
|
| 44 |
+
|
| 45 |
+
Codex image generation depends on OpenAI's upstream service. If generation fails, Koharu surfaces the upstream response text and request ID when available. Retrying can succeed if the failure is transient. Persistent failures may indicate a problem with account access or service availability, or limited backend support for image-generation tool calls.
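One way to handle transient upstream failures is to retry with exponential backoff while keeping the request ID visible. The sketch below is an assumption-laden illustration, not Koharu's implementation: `UpstreamError`, `generate_with_retry`, and the `flaky` stand-in are hypothetical names.

```python
import time
from typing import Callable, Optional


class UpstreamError(Exception):
    """Hypothetical error carrying the upstream response text and request ID."""

    def __init__(self, message: str, request_id: Optional[str] = None,
                 transient: bool = True):
        super().__init__(message)
        self.request_id = request_id
        self.transient = transient


def generate_with_retry(
    generate: Callable[[], bytes],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bytes:
    """Call generate(), retrying transient failures with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return generate()
        except UpstreamError as err:
            # Log the request ID so it can be quoted in a bug report.
            request_id = err.request_id or "n/a"
            print(f"attempt {attempt} failed: {err} (request ID: {request_id})")
            if not err.transient or attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))
    raise AssertionError("unreachable")


# A stand-in generator that fails twice, then succeeds.
calls = {"n": 0}


def flaky() -> bytes:
    calls["n"] += 1
    if calls["n"] < 3:
        raise UpstreamError("upstream timeout", request_id=f"req_{calls['n']}")
    return b"image-bytes"


result = generate_with_retry(flaky, sleep=lambda seconds: None)
```

Non-transient errors, such as missing account access, are re-raised immediately because retrying them would not help.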
|
| 46 |
+
|
| 47 |
+
## When to use it
|
| 48 |
+
|
| 49 |
+
Use Codex image generation when you want a fast end-to-end redraw and are comfortable with a remote model rewriting the final image.
|
| 50 |
+
|
| 51 |
+
Use the local staged pipeline when you want more control over intermediate OCR, cleanup masks, translation text, fonts, and editable output.
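The guidance in this page can be condensed into a small decision helper. This is only a sketch of the trade-offs described above; `choose_workflow` and its parameter names are hypothetical and not part of Koharu.

```python
def choose_workflow(
    *,
    needs_offline: bool = False,
    keep_images_local: bool = False,
    needs_intermediate_control: bool = False,
    needs_editable_output: bool = False,
) -> str:
    """Return the workflow the guidance above points to for the given needs."""
    if needs_offline or keep_images_local:
        # Page images never leave the machine.
        return "local staged pipeline"
    if needs_intermediate_control or needs_editable_output:
        # OCR, cleanup masks, translation text, and fonts stay adjustable.
        return "local staged pipeline"
    # Otherwise a one-pass remote redraw is acceptable.
    return "Codex image generation"


print(choose_workflow())
print(choose_workflow(needs_editable_output=True))
```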
|
docs/ja-JP/explanation/models-and-providers.md
CHANGED
|
@@ -112,6 +112,12 @@ LLM ベースのプロバイダで現在の組み込み既定値は次の通り
|
|
| 112 |
|
| 113 |
LM Studio、OpenRouter、類似エンドポイントの具体的な設定手順は [OpenAI 互換 API を使う](../how-to/use-openai-compatible-api.md) を参照してください。
|
| 114 |
|
| 115 |
## ローカルとリモートをどう選ぶか
|
| 116 |
|
| 117 |
ローカルモデルが向くケース:
|
|
|
|
| 112 |
|
| 113 |
LM Studio、OpenRouter、類似エンドポイントの具体的な設定手順は [OpenAI 互換 API を使う](../how-to/use-openai-compatible-api.md) を参照してください。
|
| 114 |
|
| 115 |
+
### Codex 画像生成
|
| 116 |
+
|
| 117 |
+
Koharu は Codex を使ったエンドツーエンドの image-to-image 生成にも対応しています。テキストブロックの翻訳とローカルレンダリングを別々の手順として行う代わりに、このワークフローでは元ページ画像とプロンプトを Codex に送り、生成されたページ画像を受け取ります。
|
| 118 |
+
|
| 119 |
+
これはローカルモデルではなく、リモート画像生成ワークフローです。Codex にアクセスできる ChatGPT アカウントと、デバイスコードログインを完了するための 2 要素認証が必要です。利用上の注意と制限は [Codex 画像生成を使う](../how-to/use-codex-image-generation.md) を参照してください。
|
| 120 |
+
|
| 121 |
## ローカルとリモートをどう選ぶか
|
| 122 |
|
| 123 |
ローカルモデルが向くケース:
|
docs/ja-JP/how-to/index.md
CHANGED
|
@@ -13,6 +13,7 @@ title: ハウツーガイド
|
|
| 13 |
- [GUI / Headless / MCP モードを使う](run-gui-headless-and-mcp.md): ローカルでの起動パターンと実行時フラグ
|
| 14 |
- [MCP クライアントを設定する](configure-mcp-clients.md): Antigravity、Claude Desktop、Claude Code を Koharu のローカル MCP エンドポイントに接続する
|
| 15 |
- [OpenAI 互換 API を使う](use-openai-compatible-api.md): LM Studio、OpenRouter、その他 OpenAI 形式の chat-completions エンドポイントを接続する
|
|
|
|
| 16 |
- [ページを書き出し、プロジェクトを管理する](export-and-manage-projects.md): レンダリング済み画像、PSD 引き渡し、ページセット管理
|
| 17 |
- [ソースからビルドする](build-from-source.md): Bun、Tauri、プラットフォーム機能を使ったローカルビルド手順
|
| 18 |
- [トラブルシューティング](troubleshooting.md): 起動、ダウンロード、GPU、パイプライン、接続まわりの典型的な問題
|
|
|
|
| 13 |
- [GUI / Headless / MCP モードを使う](run-gui-headless-and-mcp.md): ローカルでの起動パターンと実行時フラグ
|
| 14 |
- [MCP クライアントを設定する](configure-mcp-clients.md): Antigravity、Claude Desktop、Claude Code を Koharu のローカル MCP エンドポイントに接続する
|
| 15 |
- [OpenAI 互換 API を使う](use-openai-compatible-api.md): LM Studio、OpenRouter、その他 OpenAI 形式の chat-completions エンドポイントを接続する
|
| 16 |
+
- [Codex 画像生成を使う](use-codex-image-generation.md): Codex で image-to-image のページ全体再描画を行う
|
| 17 |
- [ページを書き出し、プロジェクトを管理する](export-and-manage-projects.md): レンダリング済み画像、PSD 引き渡し、ページセット管理
|
| 18 |
- [ソースからビルドする](build-from-source.md): Bun、Tauri、プラットフォーム機能を使ったローカルビルド手順
|
| 19 |
- [トラブルシューティング](troubleshooting.md): 起動、ダウンロード、GPU、パイプライン、接続まわりの典型的な問題
|
docs/ja-JP/how-to/use-codex-image-generation.md
ADDED
|
@@ -0,0 +1,51 @@
|
|
| 1 |
+
---
|
| 2 |
+
title: Codex 画像生成を使う
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
# Codex 画像生成を使う
|
| 6 |
+
|
| 7 |
+
Koharu は Codex を使ったエンドツーエンドの image-to-image 生成に対応しています。このワークフローでは、元ページ画像とプロンプトを Codex に送り、生成された画像をレンダリング済みページ結果として保存します。
|
| 8 |
+
|
| 9 |
+
## 要件
|
| 10 |
+
|
| 11 |
+
- Codex にアクセスできる ChatGPT アカウント
|
| 12 |
+
- そのアカウントで有効化された 2 要素認証
|
| 13 |
+
- OpenAI と ChatGPT サービスへ接続できるネットワーク
|
| 14 |
+
|
| 15 |
+
デバイスコードログインを正常に完了するには、事前に 2 要素認証を有効にしておく必要があります。
|
| 16 |
+
|
| 17 |
+
## この機能でできること
|
| 18 |
+
|
| 19 |
+
Codex の image-to-image 生成は、ページ全体を描き直すワークフローです。元画像とプロンプトを使って、次のような処理を 1 回で行えます。
|
| 20 |
+
|
| 21 |
+
- 表示されている文字を翻訳する
|
| 22 |
+
- 元の文字を消す
|
| 23 |
+
- 編集された領域を描き直す
|
| 24 |
+
- コマ割り、吹き出し、トーン、構図を保つ
|
| 25 |
+
- 生成済みのページ画像を出力する
|
| 26 |
+
|
| 27 |
+
これは Koharu の段階的なローカルパイプラインとは別の機能です。ローカルパイプラインでは、検出、OCR、インペイント、翻訳、レンダリングを個別のステップとして実行します。Codex ワークフローでは、ページ画像をリモートサービスへ送り、生成画像を受け取ります。
|
| 28 |
+
|
| 29 |
+
## プロンプト
|
| 30 |
+
|
| 31 |
+
ページ全体としてほしい結果を説明するプロンプトを使ってください。例:
|
| 32 |
+
|
| 33 |
+
```text
|
| 34 |
+
Translate all visible text to natural English, remove the original lettering,
|
| 35 |
+
and redraw the page as a clean manga image while preserving the artwork,
|
| 36 |
+
panel layout, speech bubbles, tone, and composition.
|
| 37 |
+
```
|
| 38 |
+
|
| 39 |
+
より狭い編集では、変更したい内容と維持したい要素を明確に書きます。モデルには元ページ画像も渡されるため、プロンプトでは細部をすべて説明するよりも、変換の目的を中心に書くと扱いやすくなります。
|
| 40 |
+
|
| 41 |
+
## プライバシーと信頼性
|
| 42 |
+
|
| 43 |
+
この機能は、元ページ画像とプロンプトを ChatGPT Codex バックエンドへ送信します。オフライン処理が必要な場合や、ページ画像をリモートプロバイダーへ送信したくない場合は、ローカルパイプラインを使用してください。
|
| 44 |
+
|
| 45 |
+
Codex 画像生成は OpenAI の上流サービスに依存します。生成に失敗した場合、利用可能であれば Koharu は上流の応答テキストとリクエスト ID を表示します。一時的な失敗であれば再試行で成功することがあります。失敗が続く場合は、アカウントのアクセス権、サービスの可用性、または画像生成ツール呼び出しに対するバックエンド側の対応状況が原因の可能性があります。
|
| 46 |
+
|
| 47 |
+
## 使い分け
|
| 48 |
+
|
| 49 |
+
リモートモデルで最終画像を一括生成したい場合は、Codex 画像生成を使います。
|
| 50 |
+
|
| 51 |
+
中間の OCR、クリーンアップマスク、翻訳テキスト、フォント、編集可能な出力を細かく制御したい場合は、ローカルの段階的パイプラインを使います。
|
docs/pt-BR/explanation/models-and-providers.md
CHANGED
|
@@ -112,6 +112,12 @@ Os provedores remotos são configurados em **Configurações > Chaves de API**.
|
|
| 112 |
|
| 113 |
Para um guia passo a passo de configuração para LM Studio, OpenRouter e endpoints similares, veja [Usar APIs Compatíveis com OpenAI](../how-to/use-openai-compatible-api.md).
|
| 114 |
|
| 115 |
## Escolhendo entre local e remoto
|
| 116 |
|
| 117 |
Use modelos locais quando você quer:
|
|
|
|
| 112 |
|
| 113 |
Para um guia passo a passo de configuração para LM Studio, OpenRouter e endpoints similares, veja [Usar APIs Compatíveis com OpenAI](../how-to/use-openai-compatible-api.md).
|
| 114 |
|
| 115 |
+
### Geração de imagem com Codex
|
| 116 |
+
|
| 117 |
+
O Koharu também pode usar o Codex para geração image-to-image de ponta a ponta. Em vez de traduzir blocos de texto e renderizar texto localmente como etapas separadas, esse fluxo envia a imagem de página de origem e o prompt ao Codex e recebe uma imagem de página gerada.
|
| 118 |
+
|
| 119 |
+
Esse é um fluxo remoto de geração de imagem, não um modelo local. Ele exige uma conta ChatGPT com acesso ao Codex e autenticação de dois fatores habilitada para concluir o login por código de dispositivo. Consulte [Usar Geração de Imagem com Codex](../how-to/use-codex-image-generation.md) para notas de uso e limitações.
|
| 120 |
+
|
| 121 |
## Escolhendo entre local e remoto
|
| 122 |
|
| 123 |
Use modelos locais quando você quer:
|
docs/pt-BR/how-to/index.md
CHANGED
|
@@ -13,6 +13,7 @@ Os guias práticos tratam de tarefas concretas que você pode querer realizar co
|
|
| 13 |
- [Executar nos Modos GUI, Headless e MCP](run-gui-headless-and-mcp.md): padrões de deploy local e flags de runtime
|
| 14 |
- [Configurar Clientes MCP](configure-mcp-clients.md): conectar Antigravity, Claude Desktop ou Claude Code ao endpoint MCP local do Koharu
|
| 15 |
- [Usar APIs Compatíveis com OpenAI](use-openai-compatible-api.md): conectar LM Studio, OpenRouter e outros endpoints de chat-completions no formato OpenAI
|
|
|
|
| 16 |
- [Exportar Páginas e Gerenciar Projetos](export-and-manage-projects.md): imagens renderizadas, entrega em PSD e gerenciamento de conjuntos de páginas
|
| 17 |
- [Build a Partir do Código-Fonte](build-from-source.md): fluxo de build local com Bun, Tauri e features de plataforma
|
| 18 |
- [Troubleshooting](troubleshooting.md): falhas comuns de inicialização, download, GPU, pipeline e conectividade
|
|
|
|
| 13 |
- [Executar nos Modos GUI, Headless e MCP](run-gui-headless-and-mcp.md): padrões de deploy local e flags de runtime
|
| 14 |
- [Configurar Clientes MCP](configure-mcp-clients.md): conectar Antigravity, Claude Desktop ou Claude Code ao endpoint MCP local do Koharu
|
| 15 |
- [Usar APIs Compatíveis com OpenAI](use-openai-compatible-api.md): conectar LM Studio, OpenRouter e outros endpoints de chat-completions no formato OpenAI
|
| 16 |
+
- [Usar Geração de Imagem com Codex](use-codex-image-generation.md): usar o Codex para redesenho image-to-image de páginas inteiras
|
| 17 |
- [Exportar Páginas e Gerenciar Projetos](export-and-manage-projects.md): imagens renderizadas, entrega em PSD e gerenciamento de conjuntos de páginas
|
| 18 |
- [Build a Partir do Código-Fonte](build-from-source.md): fluxo de build local com Bun, Tauri e features de plataforma
|
| 19 |
- [Troubleshooting](troubleshooting.md): falhas comuns de inicialização, download, GPU, pipeline e conectividade
|
docs/pt-BR/how-to/use-codex-image-generation.md
ADDED
|
@@ -0,0 +1,51 @@
|
|
| 1 |
+
---
|
| 2 |
+
title: Usar Geração de Imagem com Codex
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
# Usar Geração de Imagem com Codex
|
| 6 |
+
|
| 7 |
+
O Koharu pode usar o Codex para geração image-to-image de ponta a ponta. Esse fluxo envia uma imagem de página de origem e um prompt ao Codex, depois salva a imagem gerada como resultado renderizado da página.
|
| 8 |
+
|
| 9 |
+
## Requisitos
|
| 10 |
+
|
| 11 |
+
- uma conta ChatGPT com acesso ao Codex
|
| 12 |
+
- autenticação de dois fatores habilitada nessa conta
|
| 13 |
+
- acesso de rede aos serviços da OpenAI e do ChatGPT
|
| 14 |
+
|
| 15 |
+
A autenticação de dois fatores precisa estar habilitada antes que o login por código de dispositivo possa ser concluído com sucesso.
|
| 16 |
+
|
| 17 |
+
## O que o recurso faz
|
| 18 |
+
|
| 19 |
+
A geração image-to-image do Codex é um fluxo de redesenho de página inteira. Ela pode usar a imagem de origem e o prompt para:
|
| 20 |
+
|
| 21 |
+
- traduzir o texto visível
|
| 22 |
+
- remover as letras originais
|
| 23 |
+
- redesenhar regiões editadas
|
| 24 |
+
- preservar layout dos painéis, balões, retículas e composição
|
| 25 |
+
- produzir uma imagem de página gerada em uma única passagem
|
| 26 |
+
|
| 27 |
+
Isso é separado do pipeline local em etapas do Koharu, no qual detecção, OCR, inpainting, tradução e renderização rodam como passos individuais. O fluxo do Codex envia a imagem da página para um serviço remoto e recebe uma imagem gerada como resultado.
|
| 28 |
+
|
| 29 |
+
## Prompt
|
| 30 |
+
|
| 31 |
+
Use um prompt que descreva o resultado final desejado para a página inteira. Por exemplo:
|
| 32 |
+
|
| 33 |
+
```text
|
| 34 |
+
Translate all visible text to natural English, remove the original lettering,
|
| 35 |
+
and redraw the page as a clean manga image while preserving the artwork,
|
| 36 |
+
panel layout, speech bubbles, tone, and composition.
|
| 37 |
+
```
|
| 38 |
+
|
| 39 |
+
Para edições mais estreitas, descreva a alteração desejada e o que precisa ser preservado. Como o modelo recebe a imagem da página de origem, o prompt deve focar nos objetivos de transformação em vez de repetir todos os detalhes visuais.
|
| 40 |
+
|
| 41 |
+
## Privacidade e confiabilidade
|
| 42 |
+
|
| 43 |
+
Esse recurso envia a imagem da página de origem e o prompt ao backend do ChatGPT Codex. Use o pipeline local quando precisar de processamento offline ou não quiser enviar imagens de páginas para um provedor remoto.
|
| 44 |
+
|
| 45 |
+
A geração de imagem do Codex depende do serviço upstream da OpenAI. Se a geração falhar, o Koharu mostra o texto de resposta upstream e o ID da requisição quando disponíveis. Tentar novamente pode resolver falhas transitórias. Falhas persistentes podem indicar limitações de acesso da conta, disponibilidade do serviço ou suporte do backend para chamadas da ferramenta de geração de imagem.
|
| 46 |
+
|
| 47 |
+
## Quando usar
|
| 48 |
+
|
| 49 |
+
Use a geração de imagem do Codex quando quiser um redesenho de ponta a ponta rápido e aceitar que um modelo remoto reescreva a imagem final.
|
| 50 |
+
|
| 51 |
+
Use o pipeline local em etapas quando quiser mais controle sobre OCR intermediário, máscaras de limpeza, texto traduzido, fontes e saída editável.
|
docs/zensical.ja-JP.toml
CHANGED
|
@@ -21,6 +21,7 @@ nav = [
|
|
| 21 |
"how-to/run-gui-headless-and-mcp.md",
|
| 22 |
"how-to/configure-mcp-clients.md",
|
| 23 |
"how-to/use-openai-compatible-api.md",
|
|
|
|
| 24 |
"how-to/export-and-manage-projects.md",
|
| 25 |
"how-to/build-from-source.md",
|
| 26 |
"how-to/troubleshooting.md",
|
|
|
|
| 21 |
"how-to/run-gui-headless-and-mcp.md",
|
| 22 |
"how-to/configure-mcp-clients.md",
|
| 23 |
"how-to/use-openai-compatible-api.md",
|
| 24 |
+
"how-to/use-codex-image-generation.md",
|
| 25 |
"how-to/export-and-manage-projects.md",
|
| 26 |
"how-to/build-from-source.md",
|
| 27 |
"how-to/troubleshooting.md",
|
docs/zensical.pt-BR.toml
CHANGED
|
@@ -21,6 +21,7 @@ nav = [
|
|
| 21 |
"how-to/run-gui-headless-and-mcp.md",
|
| 22 |
"how-to/configure-mcp-clients.md",
|
| 23 |
"how-to/use-openai-compatible-api.md",
|
|
|
|
| 24 |
"how-to/export-and-manage-projects.md",
|
| 25 |
"how-to/build-from-source.md",
|
| 26 |
"how-to/troubleshooting.md",
|
|
|
|
| 21 |
"how-to/run-gui-headless-and-mcp.md",
|
| 22 |
"how-to/configure-mcp-clients.md",
|
| 23 |
"how-to/use-openai-compatible-api.md",
|
| 24 |
+
"how-to/use-codex-image-generation.md",
|
| 25 |
"how-to/export-and-manage-projects.md",
|
| 26 |
"how-to/build-from-source.md",
|
| 27 |
"how-to/troubleshooting.md",
|
docs/zensical.toml
CHANGED
|
@@ -21,6 +21,7 @@ nav = [
|
|
| 21 |
"how-to/run-gui-headless-and-mcp.md",
|
| 22 |
"how-to/configure-mcp-clients.md",
|
| 23 |
"how-to/use-openai-compatible-api.md",
|
|
|
|
| 24 |
"how-to/export-and-manage-projects.md",
|
| 25 |
"how-to/build-from-source.md",
|
| 26 |
"how-to/troubleshooting.md",
|
|
|
|
| 21 |
"how-to/run-gui-headless-and-mcp.md",
|
| 22 |
"how-to/configure-mcp-clients.md",
|
| 23 |
"how-to/use-openai-compatible-api.md",
|
| 24 |
+
"how-to/use-codex-image-generation.md",
|
| 25 |
"how-to/export-and-manage-projects.md",
|
| 26 |
"how-to/build-from-source.md",
|
| 27 |
"how-to/troubleshooting.md",
|
docs/zensical.zh-CN.toml
CHANGED
|
@@ -21,6 +21,7 @@ nav = [
|
|
| 21 |
"how-to/run-gui-headless-and-mcp.md",
|
| 22 |
"how-to/configure-mcp-clients.md",
|
| 23 |
"how-to/use-openai-compatible-api.md",
|
|
|
|
| 24 |
"how-to/export-and-manage-projects.md",
|
| 25 |
"how-to/build-from-source.md",
|
| 26 |
"how-to/troubleshooting.md",
|
|
|
|
| 21 |
"how-to/run-gui-headless-and-mcp.md",
|
| 22 |
"how-to/configure-mcp-clients.md",
|
| 23 |
"how-to/use-openai-compatible-api.md",
|
| 24 |
+
"how-to/use-codex-image-generation.md",
|
| 25 |
"how-to/export-and-manage-projects.md",
|
| 26 |
"how-to/build-from-source.md",
|
| 27 |
"how-to/troubleshooting.md",
|
docs/zh-CN/explanation/models-and-providers.md
CHANGED
|
@@ -112,6 +112,12 @@ LLM 驱动提供商当前内置的默认模型如下:
|
|
| 112 |
|
| 113 |
如果你需要 LM Studio、OpenRouter 或类似端点的逐步配置说明,请参见 [使用 OpenAI 兼容 API](../how-to/use-openai-compatible-api.md)。
|
| 114 |
|
| 115 |
## 如何在本地与远程之间选择
|
| 116 |
|
| 117 |
以下情况更适合本地模型:
|
|
|
|
| 112 |
|
| 113 |
如果你需要 LM Studio、OpenRouter 或类似端点的逐步配置说明,请参见 [使用 OpenAI 兼容 API](../how-to/use-openai-compatible-api.md)。
|
| 114 |
|
| 115 |
+
### Codex 图像生成
|
| 116 |
+
|
| 117 |
+
Koharu 也可以使用 Codex 进行端到端 image-to-image 生成。它不会把文本块翻译和本地文字渲染作为独立步骤处理,而是把源页面图像和提示词发送给 Codex,并接收生成后的页面图像。
|
| 118 |
+
|
| 119 |
+
这是远程图像生成流程,不是本地模型。它需要拥有 Codex 访问权限的 ChatGPT 账号,并且必须启用双重身份验证才能完成设备码登录。使用说明和注意事项见 [使用 Codex 图像生成](../how-to/use-codex-image-generation.md)。
|
| 120 |
+
|
| 121 |
## 如何在本地与远程之间选择
|
| 122 |
|
| 123 |
以下情况更适合本地模型:
|
docs/zh-CN/how-to/index.md
CHANGED
|
@@ -13,6 +13,7 @@ title: 操作指南
|
|
| 13 |
- [以 GUI、Headless 与 MCP 模式运行](run-gui-headless-and-mcp.md):本地运行模式与运行时参数
|
| 14 |
- [配置 MCP 客户端](configure-mcp-clients.md):把 Antigravity、Claude Desktop 或 Claude Code 接到本地 MCP 端点
|
| 15 |
- [使用 OpenAI 兼容 API](use-openai-compatible-api.md):连接 LM Studio、OpenRouter 与其他 OpenAI 风格的接口
|
|
|
|
| 16 |
- [导出页面与管理项目](export-and-manage-projects.md):渲染图、PSD 交接与页面集管理
|
| 17 |
- [从源码构建](build-from-source.md):使用 Bun、Tauri 与平台特性的本地构建流程
|
| 18 |
- [故障排查](troubleshooting.md):启动、下载、GPU、流水线与连接问题的常见排查方法
|
|
|
|
| 13 |
- [以 GUI、Headless 与 MCP 模式运行](run-gui-headless-and-mcp.md):本地运行模式与运行时参数
|
| 14 |
- [配置 MCP 客户端](configure-mcp-clients.md):把 Antigravity、Claude Desktop 或 Claude Code 接到本地 MCP 端点
|
| 15 |
- [使用 OpenAI 兼容 API](use-openai-compatible-api.md):连接 LM Studio、OpenRouter 与其他 OpenAI 风格的接口
|
| 16 |
+
- [使用 Codex 图像生成](use-codex-image-generation.md):使用 Codex 进行端到端 image-to-image 页面重绘
|
| 17 |
- [导出页面与管理项目](export-and-manage-projects.md):渲染图、PSD 交接与页面集管理
|
| 18 |
- [从源码构建](build-from-source.md):使用 Bun、Tauri 与平台特性的本地构建流程
|
| 19 |
- [故障排查](troubleshooting.md):启动、下载、GPU、流水线与连接问题的常见排查方法
|
docs/zh-CN/how-to/use-codex-image-generation.md
ADDED
|
@@ -0,0 +1,51 @@
|
|
| 1 |
+
---
|
| 2 |
+
title: 使用 Codex 图像生成
|
| 3 |
+
---
|
| 4 |
+
|
| 5 |
+
# 使用 Codex 图像生成
|
| 6 |
+
|
| 7 |
+
Koharu 可以使用 Codex 进行端到端的 image-to-image 生成。这个流程会把源页面图像和提示词发送给 Codex,然后把生成出的图像保存为渲染后的页面结果。
|
| 8 |
+
|
| 9 |
+
## 要求
|
| 10 |
+
|
| 11 |
+
- 拥有 Codex 访问权限的 ChatGPT 账号
|
| 12 |
+
- 已为该账号启用双重身份验证
|
| 13 |
+
- 能够访问 OpenAI 和 ChatGPT 服务的网络连接
|
| 14 |
+
|
| 15 |
+
设备码登录要成功完成,必须先在账号上启用双重身份验证。
|
| 16 |
+
|
| 17 |
+
## 这个功能会做什么
|
| 18 |
+
|
| 19 |
+
Codex image-to-image 生成是一个整页重绘流程。它可以根据源图像和提示词完成:
|
| 20 |
+
|
| 21 |
+
- 翻译可见文字
|
| 22 |
+
- 移除原始字稿
|
| 23 |
+
- 重绘被编辑的区域
|
| 24 |
+
- 保留分镜、气泡、网点和构图
|
| 25 |
+
- 一次生成完整页面图像
|
| 26 |
+
|
| 27 |
+
这不同于 Koharu 的本地分阶段流水线。本地流水线会把检测、OCR、修复、翻译和渲染拆成独立步骤执行;Codex 流程会把页面图像发送到远程服务,并接收生成后的图像结果。
|
| 28 |
+
|
| 29 |
+
## 提示词
|
| 30 |
+
|
| 31 |
+
请用提示词描述你希望得到的整页结果。例如:
|
| 32 |
+
|
| 33 |
+
```text
|
| 34 |
+
Translate all visible text to natural English, remove the original lettering,
|
| 35 |
+
and redraw the page as a clean manga image while preserving the artwork,
|
| 36 |
+
panel layout, speech bubbles, tone, and composition.
|
| 37 |
+
```
|
| 38 |
+
|
| 39 |
+
如果只想做更窄范围的编辑,请说明目标修改以及必须保留的内容。模型会收到源页面图像,所以提示词应重点描述转换目标,而不是重新列出每个视觉细节。
|
| 40 |
+
|
| 41 |
+
## 隐私与可靠性
|
| 42 |
+
|
| 43 |
+
这个功能会把源页面图像和提示词发送到 ChatGPT Codex 后端。如果你需要离线处理,或不希望把页面图像发送给远程提供商,请使用本地流水线。
|
| 44 |
+
|
| 45 |
+
Codex 图像生成依赖 OpenAI 的上游服务。生成失败时,如果上游返回了响应文本和请求 ID,Koharu 会将其显示出来。临时故障有时可以通过重试解决。持续失败可能与账号访问权限、服务可用性,或后端对图像生成工具调用的支持状态有关。
|
| 46 |
+
|
| 47 |
+
## 何时使用
|
| 48 |
+
|
| 49 |
+
当你希望用远程模型快速完成整页重绘,并接受模型改写最终图像时,可以使用 Codex 图像生成。
|
| 50 |
+
|
| 51 |
+
当你需要更细致地控制中间 OCR、清理遮罩、翻译文本、字体和可编辑输出时,请使用本地分阶段流水线。
|