Update README.md
README.md (CHANGED)
```diff
@@ -18,9 +18,9 @@ tags:
 
 **Try our demo available on [Demo](https://ocr.opentyphoon.ai/)**
 
-**
+**Code / Examples available on [Github](https://github.com/scb-10x/typhoon-ocr)**
 
-**Blog available on Blog**
+**Release Blog available on [OpenTyphoon Blog](https://opentyphoon.ai/blog/en/typhoon-ocr-release)**
 
 
 ## **Real-World Document Support**
```
````diff
@@ -61,11 +61,29 @@ For this version, our primary focus has been on achieving high-quality OCR for b
 ## Usage Example
 **(Recommended): Full inference code available on [Colab](https://colab.research.google.com/drive/1z4Fm2BZnKcFIoWuyxzzIIIn8oI2GKl3r?usp=sharing)**
 
+
+**(Recommended): Using Typhoon-OCR Package**
+```bash
+pip install typhoon-ocr
+```
+
+```python
+from typhoon_ocr import ocr_document
+
+# please set env TYPHOON_OCR_API_KEY or OPENAI_API_KEY to use this function
+markdown = ocr_document("test.png")
+print(markdown)
+```
+**Run Manually**
+
 Below is a partial snippet. You can run inference using either the API or a local model.
 
-
+*API*:
 ```python
 from typing import Callable
+from openai import OpenAI
+from PIL import Image
+from typhoon_ocr.ocr_utils import render_pdf_to_base64png, get_anchor_text
 
 PROMPTS_SYS = {
     "default": lambda base_text: (f"Below is an image of a document page along with its dimensions. "
````
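For context on the hunk above: the new package section reduces usage to a single `ocr_document(path)` call that returns Markdown, with the API key read from the environment. Below is a minimal sketch of composing that call into a batch job; the `ocr_folder` helper and the `./scans` folder are hypothetical, and only `ocr_document` plus the two environment variable names come from the README itself.

```python
import os
from pathlib import Path

from typhoon_ocr import ocr_document

# The package reads the key from the environment, as the README's comment notes;
# set TYPHOON_OCR_API_KEY (or OPENAI_API_KEY) before calling ocr_document.
os.environ.setdefault("TYPHOON_OCR_API_KEY", "<your-api-key>")

def ocr_folder(folder: str) -> None:
    # Hypothetical batch helper: OCR every PNG in a folder and write the
    # returned Markdown next to each source image.
    for path in sorted(Path(folder).glob("*.png")):
        markdown = ocr_document(str(path))
        path.with_suffix(".md").write_text(markdown, encoding="utf-8")

ocr_folder("./scans")
```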
````diff
@@ -127,7 +145,7 @@ response = openai.chat.completions.create(
 text_output = response.choices[0].message.content
 print(text_output)
 ```
-
+*Local Model (GPU Required)*:
 ```python
 # Initialize the model
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained("scb10x/typhoon-ocr-7b", torch_dtype=torch.bfloat16).eval()
````
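Only the head and tail of the API snippet appear in these hunks (`PROMPTS_SYS` at the top, `response = openai.chat.completions.create(` in the hunk header, and the `text_output` lines above). The sketch below fills the elided middle under explicit assumptions: it reuses `PROMPTS_SYS` from the snippet above, `render_pdf_to_base64png` and `get_anchor_text` are assumed to keep the signatures of the olmocr utilities they are named after, and the base URL and model id are assumptions rather than values confirmed by this diff; the Colab notebook has the authoritative version.

```python
from openai import OpenAI
from typhoon_ocr.ocr_utils import render_pdf_to_base64png, get_anchor_text

filename, page_num = "test.pdf", 1

# Render the page to a base64 PNG and extract layout-aware anchor text
# (signatures assumed to match olmocr's utilities of the same names).
image_base64 = render_pdf_to_base64png(filename, page_num, target_longest_image_dim=1800)
anchor_text = get_anchor_text(filename, page_num, pdf_engine="pdfreport", target_length=8000)
prompt = PROMPTS_SYS["default"](anchor_text)  # PROMPTS_SYS as defined in the API snippet above

# `openai` here is a client instance, matching the hunk header's
# `response = openai.chat.completions.create(` context line.
# The base_url and model id below are assumptions, not confirmed by this diff.
openai = OpenAI(base_url="https://api.opentyphoon.ai/v1", api_key="<TYPHOON_OCR_API_KEY>")
response = openai.chat.completions.create(
    model="typhoon-ocr-preview",  # assumed model id
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{image_base64}"}},
        ],
    }],
    temperature=0.1,
)
text_output = response.choices[0].message.content
print(text_output)
```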
```diff
@@ -167,7 +185,7 @@ print(text_output[0])
 
 ## **Intended Uses & Limitations**
 
-This
+This is a task-specific model intended to be used only with the provided prompts. It does not include any guardrails or VQA capability. Due to the nature of large language models (LLMs), a certain level of hallucination may occur. We recommend that developers carefully assess these risks in the context of their specific use case.
 
 ## **Follow us**
 
```
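The local-model snippet is likewise split: it starts at the `Qwen2_5_VLForConditionalGeneration` initialization in the third hunk and, per the final hunk's header, ends with `print(text_output[0])`. Below is a hedged reconstruction of the span in between, using the standard Hugging Face chat-template flow for Qwen2.5-VL checkpoints; the prompt construction and generation parameters are assumptions, so defer to the Colab notebook for the exact code.

```python
import torch
from PIL import Image
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration

model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "scb10x/typhoon-ocr-7b", torch_dtype=torch.bfloat16, device_map="auto"
).eval()
processor = AutoProcessor.from_pretrained("scb10x/typhoon-ocr-7b")

image = Image.open("test.png")
prompt = PROMPTS_SYS["default"]("")  # empty anchor text for a plain image; an assumption

# Standard chat-template preprocessing for Qwen2.5-VL checkpoints.
messages = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "text", "text": prompt},
    ],
}]
text = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = processor(text=[text], images=[image], return_tensors="pt").to(model.device)

with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=16384, do_sample=True, temperature=0.1)

# Strip the prompt tokens so only the generated Markdown is decoded.
text_output = processor.batch_decode(
    output_ids[:, inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(text_output[0])
```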