| # Create new file | |
| <div align="center"> | |
| [English](../README.md) | [简体中文](README_zh-CN.md) | [繁體中文](README_zh-TW.md) | [日本語](README_ja-JP.md) | 한국어 | |
| <img src="./images/banner.png" width="320px" alt="PDF2ZH"/> | |
| <h2 id="title">PDFMathTranslate</h2> | |
| <p> | |
| <!-- PyPI --> | |
| <a href="https://pypi.org/project/pdf2zh/"> | |
| <img src="https://img.shields.io/pypi/v/pdf2zh"/></a> | |
| <a href="https://pepy.tech/projects/pdf2zh"> | |
| <img src="https://static.pepy.tech/badge/pdf2zh"></a> | |
| <a href="https://hub.docker.com/repository/docker/byaidu/pdf2zh"> | |
| <img src="https://img.shields.io/docker/pulls/byaidu/pdf2zh"></a> | |
| <!-- License --> | |
| <a href="./LICENSE"> | |
| <img src="https://img.shields.io/github/license/Byaidu/PDFMathTranslate"/></a> | |
| <a href="https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker"> | |
| <img src="https://img.shields.io/badge/%F0%9F%A4%97-Online%20Demo-FF9E0D"/></a> | |
| <a href="https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate"> | |
| <img src="https://img.shields.io/badge/ModelScope-Demo-blue"></a> | |
| <a href="https://github.com/Byaidu/PDFMathTranslate/pulls"> | |
| <img src="https://img.shields.io/badge/contributions-welcome-green"/></a> | |
| <a href="https://gitcode.com/Byaidu/PDFMathTranslate/overview"> | |
| <img src="https://gitcode.com/Byaidu/PDFMathTranslate/star/badge.svg"></a> | |
| <a href="https://t.me/+Z9_SgnxmsmA5NzBl"> | |
| <img src="https://img.shields.io/badge/Telegram-2CA5E0?style=flat-squeare&logo=telegram&logoColor=white"/></a> | |
| </p> | |
| <a href="https://trendshift.io/repositories/12424" target="_blank"><img src="https://trendshift.io/api/badge/repositories/12424" alt="Byaidu%2FPDFMathTranslate | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> | |
| </div> | |
| A tool for translating scientific PDF documents and comparing the original and translation side by side. | |
| - 📊 Preserves formulas, charts, tables of contents, and annotations _([preview](#preview))_ | |
| - 🌐 Supports [multiple languages](#language) and [a range of translation services](#services) | |
| - 🤖 Provides a [command-line tool](#usage), an [interactive user interface](#gui), and [Docker](#docker) support | |
| Please send feedback through [GitHub Issues](https://github.com/Byaidu/PDFMathTranslate/issues) or the [Telegram group](https://t.me/+Z9_SgnxmsmA5NzBl). | |
| <h2 id="updates">Recent Updates</h2> | |
| - [Dec. 24, 2024] Added support for local LLMs running on [Xinference](https://github.com/xorbitsai/inference) _(by [@imClumsyPanda](https://github.com/imClumsyPanda))_ | |
| - [Nov. 26, 2024] The CLI now supports online files _(by [@reycn](https://github.com/reycn))_ | |
| - [Nov. 24, 2024] Added [ONNX](https://github.com/onnx/onnx) support to reduce dependency size _(by [@Wybxc](https://github.com/Wybxc))_ | |
| - [Nov. 23, 2024] 🌟 The [free public service](#demo) is online! _(by [@Byaidu](https://github.com/Byaidu))_ | |
| - [Nov. 23, 2024] Added a firewall to block web bots _(by [@Byaidu](https://github.com/Byaidu))_ | |
| - [Nov. 22, 2024] The GUI now supports Italian and has been improved _(by [@Byaidu](https://github.com/Byaidu), [@reycn](https://github.com/reycn))_ | |
| - [Nov. 22, 2024] Deployed services can now be shared with others _(by [@Zxis233](https://github.com/Zxis233))_ | |
| - [Nov. 22, 2024] Added support for Tencent Translation _(by [@hellofinch](https://github.com/hellofinch))_ | |
| - [Nov. 21, 2024] The GUI now supports downloading bilingual documents _(by [@reycn](https://github.com/reycn))_ | |
| - [Nov. 20, 2024] 🌟 The [demo](#demo) is now online! _(by [@reycn](https://github.com/reycn))_ | |
| <h2 id="preview">Preview</h2> | |
| <div align="center"> | |
| <img src="./images/preview.gif" width="80%"/> | |
| </div> | |
| <h2 id="demo">Public Service 🌟</h2> | |
| ### Free Service (<https://pdf2zh.com/>) | |
| You can try the [free public service](https://pdf2zh.com/) online without installing anything. | |
| ### Demos | |
| You can also try the [demo on HuggingFace](https://huggingface.co/spaces/reycn/PDFMathTranslate-Docker) and the [demo on ModelScope](https://www.modelscope.cn/studios/AI-ModelScope/PDFMathTranslate) without installing anything. | |
| The demos run on limited computing resources, so please do not abuse them. | |
| <h2 id="install">Installation and Usage</h2> | |
| There are four ways to use this project: the [command-line tool](#cmd), the [portable version](#portable), the [GUI](#gui), and [Docker](#docker). | |
| Running pdf2zh requires an additional model (`wybxc/DocLayout-YOLO-DocStructBench-onnx`). The model is also available on ModelScope. If downloading it fails at startup, set the following environment variable: | |
| ```shell | |
| set HF_ENDPOINT=https://hf-mirror.com | |
| ``` | |
| For PowerShell users: | |
| ```shell | |
| $env:HF_ENDPOINT = "https://hf-mirror.com" | |
| ``` | |
| <h3 id="cmd">Method 1. Command-line tool</h3> | |
| 1. Python must be installed (3.10 <= version <= 3.12) | |
| 2. Install the package: | |
| ```bash | |
| pip install pdf2zh | |
| ``` | |
| 3. Run the translation; the output files are written to the [current working directory](https://chatgpt.com/share/6745ed36-9acc-800e-8a90-59204bd13444): | |
| ```bash | |
| pdf2zh document.pdf | |
| ``` | |
| <h3 id="portable">Method 2. Portable</h3> | |
| No pre-installed Python environment is required. | |
| Download [setup.bat](https://raw.githubusercontent.com/Byaidu/PDFMathTranslate/refs/heads/main/script/setup.bat) and double-click it to run. | |
| <h3 id="gui">Method 3. GUI</h3> | |
| 1. Python must be installed (3.10 <= version <= 3.12) | |
| 2. Install the package: | |
| ```bash | |
| pip install pdf2zh | |
| ``` | |
| 3. Start the GUI in your browser: | |
| ```bash | |
| pdf2zh -i | |
| ``` | |
| 4. If your browser does not open automatically, open the following URL: | |
| ```bash | |
| http://localhost:7860/ | |
| ``` | |
| <img src="./images/gui.gif" width="500"/> | |
| See the [GUI documentation](./README_GUI.md) for details. | |
| <h3 id="docker">Method 4. Docker</h3> | |
| 1. Pull the image and run it: | |
| ```bash | |
| docker pull byaidu/pdf2zh | |
| docker run -d -p 7860:7860 byaidu/pdf2zh | |
| ``` | |
| 2. Open it in your browser: | |
| ``` | |
| http://localhost:7860/ | |
| ``` | |
| To deploy the Docker image on a cloud service: | |
| <div> | |
| <a href="https://www.heroku.com/deploy?template=https://github.com/Byaidu/PDFMathTranslate"> | |
| <img src="https://www.herokucdn.com/deploy/button.svg" alt="Deploy" height="26"></a> | |
| <a href="https://render.com/deploy"> | |
| <img src="https://render.com/images/deploy-to-render-button.svg" alt="Deploy to Render" height="26"></a> | |
| <a href="https://zeabur.com/templates/5FQIGX?referralCode=reycn"> | |
| <img src="https://zeabur.com/button.svg" alt="Deploy on Zeabur" height="26"></a> | |
| <a href="https://app.koyeb.com/deploy?type=git&builder=buildpack&repository=github.com/Byaidu/PDFMathTranslate&branch=main&name=pdf-math-translate"> | |
| <img src="https://www.koyeb.com/static/images/deploy/button.svg" alt="Deploy to Koyeb" height="26"></a> | |
| </div> | |
| <h2 id="usage">Advanced Options</h2> | |
| Run the translation command from the command line to generate the translated document `example-mono.pdf` and the bilingual document `example-dual.pdf` in the current working directory. The Google translation service is used by default. More supported translation services are listed [here](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#services). | |
| <img src="./images/cmd.explained.png" width="580px" alt="cmd"/> | |
| The following table lists all advanced options for reference: | |
| Option         | Function                                                     | Example                                        | |
| -------------- | ------------------------------------------------------------ | ---------------------------------------------- | |
| files          | Local files                                                  | `pdf2zh ~/local.pdf`                           | |
| links          | Online files                                                 | `pdf2zh http://arxiv.org/paper.pdf`            | |
| `-i`           | [Enter the GUI](#gui)                                        | `pdf2zh -i`                                    | |
| `-p`           | [Partial document translation](#partial)                     | `pdf2zh example.pdf -p 1`                      | |
| `-li`          | [Source language](#language)                                 | `pdf2zh example.pdf -li en`                    | |
| `-lo`          | [Target language](#language)                                 | `pdf2zh example.pdf -lo zh`                    | |
| `-s`           | [Translation service](#services)                             | `pdf2zh example.pdf -s deepl`                  | |
| `-t`           | [Multi-threading](#threads)                                  | `pdf2zh example.pdf -t 1`                      | |
| `-o`           | Output directory                                             | `pdf2zh example.pdf -o output`                 | |
| `-f`, `-c`     | [Exceptions](#exceptions)                                    | `pdf2zh example.pdf -f "(MS.*)"`               | |
| `--share`      | [Get a gradio public link]                                   | `pdf2zh -i --share`                            | |
| `--authorized` | [[Add web authentication and a custom authentication page](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.)] | `pdf2zh -i --authorized users.txt [auth.html]` | |
| `--prompt`     | [Use a custom LLM prompt]                                    | `pdf2zh --prompt [prompt.txt]`                 | |
| `--onnx`       | [Use a custom DocLayout-YOLO ONNX model]                     | `pdf2zh --onnx [onnx/model/path]`              | |
| `--serverport` | [Use a custom WebUI port]                                    | `pdf2zh --serverport 7860`                     | |
| `--dir`        | [Batch translation]                                          | `pdf2zh --dir /path/to/translate/`             | |
| `--config`     | [Configuration file](https://github.com/Byaidu/PDFMathTranslate/blob/main/docs/ADVANCED.md#cofig) | `pdf2zh --config /path/to/config/config.json`  | |
| <h3 id="partial">Full or partial document translation</h3> | |
| - **Full translation** | |
| ```bash | |
| pdf2zh example.pdf | |
| ``` | |
| - **Partial translation** | |
| ```bash | |
| pdf2zh example.pdf -p 1-3,5 | |
| ``` | |
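To make the `-p 1-3,5` syntax concrete, the sketch below expands such a page specification into an explicit page list. `expand_pages` is a hypothetical helper written for illustration, not pdf2zh's own parser; it assumes 1-based page numbers and inclusive, closed ranges, as the example above suggests.

```python
def expand_pages(spec: str) -> list[int]:
    """Expand a page spec such as "1-3,5" into an explicit list of pages.

    Illustrative only: handles single pages and closed ranges "a-b".
    """
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            # "a-b" is treated as an inclusive range.
            start, end = (int(x) for x in part.split("-", 1))
            pages.extend(range(start, end + 1))
        else:
            pages.append(int(part))
    return pages


print(expand_pages("1-3,5"))  # [1, 2, 3, 5]
```

Under these assumptions, `-p 1-3,5` selects pages 1, 2, 3, and 5.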
| <h3 id="language">Specify source and target languages</h3> | |
| See [Google Languages Codes](https://developers.google.com/admin-sdk/directory/v1/languages) and [DeepL Languages Codes](https://developers.deepl.com/docs/resources/supported-languages). | |
| ```bash | |
| pdf2zh example.pdf -li en -lo ko | |
| ``` | |
| <h3 id="services">Translate with different services</h3> | |
| The following table shows the [environment variables](https://chatgpt.com/share/6734a83d-9d48-800e-8a46-f57ca6e8bcb4) each translation service requires. Set these variables before using the corresponding service. | |
| **Translator**      | **Service**    | **Environment variables**                                             | **Default values**                                       | **Notes** | |
| ------------------- | -------------- | --------------------------------------------------------------------- | -------------------------------------------------------- | --------- | |
| **Google (default)** | `google`      | None                                                                  | N/A                                                      | None | |
| **Bing**            | `bing`         | None                                                                  | N/A                                                      | None | |
| **DeepL**           | `deepl`        | `DEEPL_AUTH_KEY`                                                      | `[Your Key]`                                             | See [DeepL](https://support.deepl.com/hc/en-us/articles/360020695820-API-Key-for-DeepL-s-API) | |
| **DeepLX**          | `deeplx`       | `DEEPLX_ENDPOINT`                                                     | `https://api.deepl.com/translate`                        | See [DeepLX](https://github.com/OwO-Network/DeepLX) | |
| **Ollama**          | `ollama`       | `OLLAMA_HOST`, `OLLAMA_MODEL`                                         | `http://127.0.0.1:11434`, `gemma2`                       | See [Ollama](https://github.com/ollama/ollama) | |
| **OpenAI**          | `openai`       | `OPENAI_BASE_URL`, `OPENAI_API_KEY`, `OPENAI_MODEL`                   | `https://api.openai.com/v1`, `[Your Key]`, `gpt-4o-mini` | See [OpenAI](https://platform.openai.com/docs/overview) | |
| **AzureOpenAI**     | `azure-openai` | `AZURE_OPENAI_BASE_URL`, `AZURE_OPENAI_API_KEY`, `AZURE_OPENAI_MODEL` | `[Your Endpoint]`, `[Your Key]`, `gpt-4o-mini`           | See [Azure OpenAI](https://learn.microsoft.com/zh-cn/azure/ai-services/openai/chatgpt-quickstart?tabs=command-line%2Cjavascript-keyless%2Ctypescript-keyless%2Cpython&pivots=programming-language-python) | |
| **Zhipu**           | `zhipu`        | `ZHIPU_API_KEY`, `ZHIPU_MODEL`                                        | `[Your Key]`, `glm-4-flash`                              | See [Zhipu](https://open.bigmodel.cn/dev/api/thirdparty-frame/openai-sdk) | |
| **ModelScope**      | `modelscope`   | `MODELSCOPE_API_KEY`, `MODELSCOPE_MODEL`                              | `[Your Key]`, `Qwen/Qwen2.5-Coder-32B-Instruct`          | See [ModelScope](https://www.modelscope.cn/docs/model-service/API-Inference/intro) | |
| **Silicon**         | `silicon`      | `SILICON_API_KEY`, `SILICON_MODEL`                                    | `[Your Key]`, `Qwen/Qwen2.5-7B-Instruct`                 | See [SiliconCloud](https://docs.siliconflow.cn/quickstart) | |
| **Gemini**          | `gemini`       | `GEMINI_API_KEY`, `GEMINI_MODEL`                                      | `[Your Key]`, `gemini-1.5-flash`                         | See [Gemini](https://ai.google.dev/gemini-api/docs/openai) | |
| **Azure**           | `azure`        | `AZURE_ENDPOINT`, `AZURE_API_KEY`                                     | `https://api.translator.azure.cn`, `[Your Key]`          | See [Azure](https://docs.azure.cn/en-us/ai-services/translator/text-translation-overview) | |
| **Tencent**         | `tencent`      | `TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY`                   | `[Your ID]`, `[Your Key]`                                | See [Tencent](https://www.tencentcloud.com/products/tmt?from_qcintl=122110104) | |
| **Dify**            | `dify`         | `DIFY_API_URL`, `DIFY_API_KEY`                                        | `[Your DIFY URL]`, `[Your Key]`                          | See [Dify](https://github.com/langgenius/dify). Three variables, lang_out, lang_in, and text, must be defined in the Dify workflow input. | |
| **AnythingLLM**     | `anythingllm`  | `AnythingLLM_URL`, `AnythingLLM_APIKEY`                               | `[Your AnythingLLM URL]`, `[Your Key]`                   | See [anything-llm](https://github.com/Mintplex-Labs/anything-llm) | |
| **Argos Translate** | `argos`        |                                                                       |                                                          | See [argos-translate](https://github.com/argosopentech/argos-translate) | |
| **Grok**            | `grok`         | `GORK_API_KEY`, `GORK_MODEL`                                          | `[Your GORK_API_KEY]`, `grok-2-1212`                     | See [Grok](https://docs.x.ai/docs/overview) | |
| **DeepSeek**        | `deepseek`     | `DEEPSEEK_API_KEY`, `DEEPSEEK_MODEL`                                  | `[Your DEEPSEEK_API_KEY]`, `deepseek-chat`               | See [DeepSeek](https://www.deepseek.com/) | |
| **OpenAI-Liked**    | `openailiked`  | `OPENAILIKED_BASE_URL`, `OPENAILIKED_API_KEY`, `OPENAILIKED_MODEL`    | `url`, `[Your Key]`, `model name`                        | None | |
| For large language models that are compatible with the OpenAI API but not listed in the table above, set the environment variables the same way as for OpenAI in the table. | |
| Use `-s service` or `-s service:model` to specify the translation service: | |
| ```bash | |
| pdf2zh example.pdf -s openai:gpt-4o-mini | |
| ``` | |
| Or specify the model with an environment variable: | |
| ```bash | |
| set OPENAI_MODEL=gpt-4o-mini | |
| pdf2zh example.pdf -s openai | |
| ``` | |
| For PowerShell users: | |
| ```shell | |
| $env:OPENAI_MODEL = "gpt-4o-mini" | |
| pdf2zh example.pdf -s openai | |
| ``` | |
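When the CLI is driven from a script, the same environment variable can be set for a single call rather than for the whole shell session. The following is a minimal sketch using only the Python standard library; `env_with` and `translate_with_openai` are hypothetical helper names, and the command line simply mirrors the `pdf2zh example.pdf -s openai` example above.

```python
import os
import subprocess


def env_with(overrides: dict[str, str]) -> dict[str, str]:
    """Return a copy of the current environment with `overrides` applied."""
    env = dict(os.environ)
    env.update(overrides)
    return env


def translate_with_openai(pdf_path: str, model: str = "gpt-4o-mini") -> None:
    """Run `pdf2zh <pdf> -s openai` with OPENAI_MODEL set for this call only."""
    subprocess.run(
        ["pdf2zh", pdf_path, "-s", "openai"],
        env=env_with({"OPENAI_MODEL": model}),
        check=True,  # raise CalledProcessError if pdf2zh exits with an error
    )
```

Calling `translate_with_openai("example.pdf")` assumes that `pdf2zh` is on `PATH` and that the other variables from the table (such as `OPENAI_API_KEY`) are already set.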
| <h3 id="exceptions">Specify exceptions</h3> | |
| Use regular expressions to specify the formula fonts and characters that should be preserved: | |
| ```bash | |
| pdf2zh example.pdf -f "(CM[^RT].*|MS.*|.*Ital)" -c "(\(|\||\)|\+|=|\d|[\u0080-\ufaff])" | |
| ``` | |
| By default, `Latex`, `Mono`, `Code`, `Italic`, `Symbol`, and `Math` fonts are preserved: | |
| ```bash | |
| pdf2zh example.pdf -f "(CM[^R]|MS.M|XY|MT|BL|RM|EU|LA|RS|LINE|LCIRCLE|TeX-|rsfs|txsy|wasy|stmary|.*Mono|.*Code|.*Ital|.*Sym|.*Math)" | |
| ``` | |
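To see which fonts a `-f` pattern would select, it can be tried against a few font names with Python's `re` module. This is a sketch under assumptions: it assumes the pattern is applied from the start of the font name (`re.match` semantics), and the font names are sample values chosen for illustration (`CMR10` is Computer Modern Roman, `CMMI10` is Computer Modern Math Italic).

```python
import re

# The font pattern from the first example above.
FONT_PATTERN = re.compile(r"(CM[^RT].*|MS.*|.*Ital)")


def is_preserved(font_name: str) -> bool:
    """True if the pattern matches starting at the first character of the name."""
    return FONT_PATTERN.match(font_name) is not None


for name in ["CMMI10", "CMR10", "MSBM10", "Times-Italic", "Helvetica"]:
    print(f"{name}: {is_preserved(name)}")
```

With this pattern, `CMMI10`, `MSBM10`, and `Times-Italic` match, while `CMR10` (excluded by `[^RT]`) and `Helvetica` do not.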
| <h3 id="threads">Specify the number of threads</h3> | |
| Use `-t` to specify the number of threads used for translation: | |
| ```bash | |
| pdf2zh example.pdf -t 1 | |
| ``` | |
| <h3 id="prompt">Custom prompt</h3> | |
| Use `--prompt` to specify the prompt passed to the LLM: | |
| ```bash | |
| pdf2zh example.pdf --prompt prompt.txt | |
| ``` | |
| Example `prompt.txt`: | |
| ```txt | |
| [ | |
| { | |
| "role": "system", | |
| "content": "You are a professional,authentic machine translation engine.", | |
| }, | |
| { | |
| "role": "user", | |
| "content": "Translate the following markdown source text to ${lang_out}. Keep the formula notation {{v*}} unchanged. Output translation directly without any additional text.\nSource Text: ${text}\nTranslated Text:", | |
| }, | |
| ] | |
| ``` | |
| The following three variables can be used in a custom prompt file: | |
| **Variable** | **Content**       | |
| ------------ | ----------------- | |
| `lang_in`    | Source language   | |
| `lang_out`   | Target language   | |
| `text`       | Text to translate | |
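The `${name}` placeholders use the same syntax as Python's `string.Template`, so the effect of the three variables can be previewed with the standard library. This is only a sketch of the substitution; how pdf2zh fills the template internally is not specified here, and the prompt text below is a shortened stand-in for the example file.

```python
from string import Template

# Shortened stand-in for the "user" message in prompt.txt.
prompt = Template(
    "Translate the following text from ${lang_in} to ${lang_out}. "
    "Keep the formula notation {{v*}} unchanged.\n"
    "Source Text: ${text}"
)

# safe_substitute leaves unknown placeholders untouched instead of raising.
filled = prompt.safe_substitute(lang_in="en", lang_out="ko", text="Hello")
print(filled)
```

The `{{v*}}` marker contains no `$`, so it passes through the substitution unchanged.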
| <h2 id="todo">API</h2> | |
| ### Python | |
| ```python | |
| from pdf2zh import translate, translate_stream | |
| params = {"lang_in": "en", "lang_out": "ko", "service": "google", "thread": 4} | |
| file_mono, file_dual = translate(files=["example.pdf"], **params)[0] | |
| with open("example.pdf", "rb") as f: | |
| stream_mono, stream_dual = translate_stream(stream=f.read(), **params) | |
| ``` | |
| ### HTTP | |
| ```bash | |
| pip install pdf2zh[backend] | |
| pdf2zh --flask | |
| pdf2zh --celery worker | |
| ``` | |
| ```bash | |
| curl http://localhost:11008/v1/translate -F "file=@example.pdf" -F "data={\"lang_in\":\"en\",\"lang_out\":\"ko\",\"service\":\"google\",\"thread\":4}" | |
| {"id":"d9894125-2f4e-45ea-9d93-1a9068d2045a"} | |
| curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a | |
| {"info":{"n":13,"total":506},"state":"PROGRESS"} | |
| curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a | |
| {"state":"SUCCESS"} | |
| curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/mono --output example-mono.pdf | |
| curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a/dual --output example-dual.pdf | |
| curl http://localhost:11008/v1/translate/d9894125-2f4e-45ea-9d93-1a9068d2045a -X DELETE | |
| ``` | |
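The curl session above follows a submit, poll, download pattern. The polling and download steps can be sketched in Python with the standard library as shown below. The base URL and the `PROGRESS`/`SUCCESS` states come from the curl examples; the function names are hypothetical, and treating `FAILURE` as the error state is an assumption that is not confirmed by this document.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:11008/v1/translate"  # from the curl examples above


def fetch_status(task_id: str) -> dict:
    """GET the task status and return the decoded JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/{task_id}") as resp:
        return json.load(resp)


def wait_for_success(task_id, fetch=fetch_status, interval=2.0, max_polls=600):
    """Poll until the task reports SUCCESS, then return the last status."""
    for _ in range(max_polls):
        status = fetch(task_id)
        state = status.get("state")
        if state == "SUCCESS":
            return status
        if state == "FAILURE":  # assumption: error state name, not shown above
            raise RuntimeError(f"translation failed: {status}")
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")


def download(task_id: str, kind: str, out_path: str) -> None:
    """Save the "mono" or "dual" result PDF to out_path."""
    urllib.request.urlretrieve(f"{BASE_URL}/{task_id}/{kind}", out_path)
```

After submitting a file as in the first curl command, `wait_for_success(task_id)` followed by `download(task_id, "mono", "example-mono.pdf")` mirrors the remaining steps. The `fetch` parameter exists so the polling loop can be exercised without a running server.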
| <h2 id="acknowledgement">Acknowledgements</h2> | |
| - Document merging: [PyMuPDF](https://github.com/pymupdf/PyMuPDF) | |
| - Document parsing: [Pdfminer.six](https://github.com/pdfminer/pdfminer.six) | |
| - Document extraction: [MinerU](https://github.com/opendatalab/MinerU) | |
| - Document preview: [Gradio PDF](https://github.com/freddyaboulton/gradio-pdf) | |
| - Multi-threaded translation: [MathTranslate](https://github.com/SUSYUSTC/MathTranslate) | |
| - Layout parsing: [DocLayout-YOLO](https://github.com/opendatalab/DocLayout-YOLO) | |
| - Document standard: [PDF Explained](https://zxyle.github.io/PDF-Explained/), [PDF Cheat Sheets](https://pdfa.org/resource/pdf-cheat-sheets/) | |
| - Multilingual font: [Go Noto Universal](https://github.com/satbyy/go-noto-universal) | |
| <h2 id="contrib">Contributors</h2> | |
| <a href="https://github.com/Byaidu/PDFMathTranslate/graphs/contributors"> | |
| <img src="https://opencollective.com/PDFMathTranslate/contributors.svg?width=890&button=false" /> | |
| </a> | |
|  | |
| <h2 id="star_hist">Star History</h2> | |
| <a href="https://star-history.com/#Byaidu/PDFMathTranslate&Date"> | |
| <picture> | |
| <source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date&theme=dark" /> | |
| <source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date" /> | |
| <img alt="Star History Chart" src="https://api.star-history.com/svg?repos=Byaidu/PDFMathTranslate&type=Date"/> | |
| </picture> | |
| </a> | |