Spaces:
Runtime error
Runtime error
File size: 1,917 Bytes
a7db4c8 e9eef0f |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 |
---
title: My API Space
emoji: 🔧
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: "5.0.0"
app_file: app.py
pinned: false
---
## Typhoon OCR
Typhoon OCR is a model for extracting structured markdown from images or PDFs. It supports document layout analysis and table extraction, returning results in markdown or HTML. This package is a simple Gradio website to demonstrate the performance of Typhoon OCR.
### Features
- Upload a PDF or image (single page)
- Extracts and reconstructs document content as markdown
- Supports different prompt modes for layout or structure
- Language: English, Thai
- Uses a local or remote OpenAI-compatible API (e.g., vllm, opentyphoon.ai)
- See blog for more detail https://opentyphoon.ai/blog/en/typhoon-ocr-release
### Requirements
- Linux / Mac with python (window not supported at the moment)
### Install
```bash
pip install typhoon-ocr
```
or to run the gradio app.
```bash
pip install -r requirements.txt
# edit .env
# pip install vllm # optional for hosting a local server
```
### Mac specific
```
brew install poppler
# The following binaries are required and provided by poppler:
# - pdfinfo
# - pdftoppm
```
### Linux specific
```
sudo apt-get update
sudo apt-get install poppler-utils
# The following binaries are required and provided by poppler-utils:
# - pdfinfo
# - pdftoppm
```
### Start vllm
```bash
vllm serve scb10x/typhoon-ocr-7b --served-model-name typhoon-ocr --dtype bfloat16 --port 8101
```
### Run Gradio demo
```bash
python app.py
```
### Dependencies
- openai
- python-dotenv
- ftfy
- pypdf
- gradio
- vllm (for hosting an inference server)
- pillow
### Debug
- If `Error processing document` occur. Make sure you have install `brew install poppler` or `apt-get install poppler-utils`.
### License
This project is licensed under the Apache 2.0 License. See individual datasets and checkpoints for their respective licenses.
|