kunato committed
Commit b8b808b · verified · 1 Parent(s): 38e08db

Update README.md

Files changed (1):
  1. README.md +23 -5
README.md CHANGED
@@ -18,9 +18,9 @@ tags:
 
 **Try our demo available on [Demo](https://ocr.opentyphoon.ai/)**
 
-**Github available on [Github](https://github.com/scb-10x/typhoon-ocr)**
+**Code / Examples available on [Github](https://github.com/scb-10x/typhoon-ocr)**
 
-**Blog available on Blog**
+**Release Blog available on [OpenTyphoon Blog](https://opentyphoon.ai/blog/en/typhoon-ocr-release)**
 
 
 ## **Real-World Document Support**
@@ -61,11 +61,29 @@ For this version, our primary focus has been on achieving high-quality OCR for b
 ## Usage Example
 **(Recommended): Full inference code available on [Colab](https://colab.research.google.com/drive/1z4Fm2BZnKcFIoWuyxzzIIIn8oI2GKl3r?usp=sharing)**
 
+
+**(Recommended): Using Typhoon-OCR Package**
+```bash
+pip install typhoon-ocr
+```
+
+```python
+from typhoon_ocr import ocr_document
+
+# please set env TYPHOON_OCR_API_KEY or OPENAI_API_KEY to use this function
+markdown = ocr_document("test.png")
+print(markdown)
+```
+**Run Manually**
+
 Below is a partial snippet. You can run inference using either the API or a local model.
 
-**API**:
+*API*:
 ```python
 from typing import Callable
+from openai import OpenAI
+from PIL import Image
+from typhoon_ocr.ocr_utils import render_pdf_to_base64png, get_anchor_text
 
 PROMPTS_SYS = {
 "default": lambda base_text: (f"Below is an image of a document page along with its dimensions. "
@@ -127,7 +145,7 @@ response = openai.chat.completions.create(
 text_output = response.choices[0].message.content
 print(text_output)
 ```
-**Local Model (GPU Required)**:
+*Local Model (GPU Required)*:
 ```python
 # Initialize the model
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained("scb10x/typhoon-ocr-7b", torch_dtype=torch.bfloat16 ).eval()
@@ -167,7 +185,7 @@ print(text_output[0])
 
 ## **Intended Uses & Limitations**
 
-This model is an instructional model. However, it’s still undergoing development. It incorporates some level of guardrails, but it still may produce answers that are inaccurate, biased, or otherwise objectionable in response to user prompts. We recommend that developers assess these risks in the context of their use case.
+This is a task-specific model intended to be used only with the provided prompts. It does not include any guardrails or VQA capability. Due to the nature of large language models (LLMs), a certain level of hallucination may occur. We recommend that developers carefully assess these risks in the context of their specific use case.
 
 ## **Follow us**
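Reviewer note: the manual *API* path this commit documents renders a document page to a base64 PNG and sends it, together with the anchor-text prompt, to an OpenAI-compatible chat endpoint. The snippet in the diff is partial, so as a minimal sketch of the message payload shape only (the helper name `build_ocr_messages` and the prompt string are illustrative assumptions, not part of the `typhoon-ocr` package):

```python
import base64


def build_ocr_messages(image_bytes: bytes, prompt_text: str) -> list:
    """Pack a text prompt plus a base64-encoded page image into the
    chat-message shape accepted by OpenAI-compatible vision endpoints.

    Illustrative helper: the real pipeline would obtain image_bytes from
    something like render_pdf_to_base64png and the prompt from PROMPTS_SYS.
    """
    image_b64 = base64.b64encode(image_bytes).decode("utf-8")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt_text},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{image_b64}"},
                },
            ],
        }
    ]


# Usage: these messages would then be passed to an OpenAI-compatible client,
# e.g. client.chat.completions.create(model=..., messages=messages).
messages = build_ocr_messages(b"\x89PNG\r\n", "Return the markdown of this page.")
print(messages[0]["content"][0]["type"])  # -> text
```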