Update README.md

README.md CHANGED

@@ -3,13 +3,13 @@ license: apache-2.0
 ---
 
 
-# syntheticbot/
+# syntheticbot/ocr-qwen
 
 
 
 ## Introduction
 
-syntheticbot/
+syntheticbot/ocr-qwen is a fine-tuned model for Optical Character Recognition (OCR) tasks, derived from the base model [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct). This model is engineered for high accuracy in extracting text from images, including documents and scenes containing text.
 
 #### Key Enhancements for OCR:
 
@@ -41,7 +41,7 @@ pip install git+https://github.com/huggingface/transformers accelerate
 
 ## Quickstart
 
-The following examples illustrate the use of syntheticbot/
+The following examples illustrate the use of syntheticbot/ocr-qwen with 🤗 Transformers and `qwen_vl_utils` for OCR applications.
 
 ```
 pip install git+https://github.com/huggingface/transformers accelerate
@@ -61,12 +61,12 @@ from qwen_vl_utils import process_vision_info
 import torch
 
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
-    "syntheticbot/
+    "syntheticbot/ocr-qwen",
     torch_dtype="auto",
     device_map="auto"
 )
 
-processor = AutoProcessor.from_pretrained("syntheticbot/
+processor = AutoProcessor.from_pretrained("syntheticbot/ocr-qwen")
 
 messages = [
     {
@@ -114,11 +114,11 @@ import torch
 import json
 
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
-    "syntheticbot/
+    "syntheticbot/ocr-qwen",
     torch_dtype="auto",
     device_map="auto"
 )
-processor = AutoProcessor.from_pretrained("syntheticbot/
+processor = AutoProcessor.from_pretrained("syntheticbot/ocr-qwen")
 
 
 messages = [
@@ -215,7 +215,7 @@ print("Extracted Texts (Batch):\n", output_texts)
 
 
 ### 🤖 ModelScope
-For users in mainland China, ModelScope is recommended. Use `snapshot_download` for checkpoint management. Adapt model names to `syntheticbot/
+For users in mainland China, ModelScope is recommended. Use `snapshot_download` for checkpoint management. Adapt model names to `syntheticbot/ocr-qwen` in ModelScope implementations.
 
 
 ### More Usage Tips for OCR
@@ -223,7 +223,21 @@ For users in mainland China, ModelScope is recommended. Use `snapshot_download`
 Input images support local files, URLs, and base64 encoding.
 
 ```python
-messages = [
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {
+                "type": "image",
+                "image": "http://path/to/your/document_image.jpg"
+            },
+            {
+                "type": "text",
+                "text": "Extract the text from this image URL."
+            },
+        ],
+    }
+]
 ```
 #### Image Resolution for OCR Accuracy
 
@@ -233,7 +247,7 @@ Higher resolution images typically improve OCR accuracy, especially for small text.
 min_pixels = 512 * 28 * 28
 max_pixels = 2048 * 28 * 28
 processor = AutoProcessor.from_pretrained(
-    "syntheticbot/
+    "syntheticbot/ocr-qwen",
     min_pixels=min_pixels, max_pixels=max_pixels
 )
 ```
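The new README text says input images may be local files, URLs, or base64 data, but only shows the URL case. A minimal sketch of building a base64 entry for the same `messages` format (the helper name and the data-URI convention are illustrative assumptions, not part of the README):

```python
import base64

def image_to_data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    # Encode raw image bytes as a data URI; hypothetical helper for illustration.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# Fake JPEG header bytes stand in for a real file read from disk.
uri = image_to_data_uri(b"\xff\xd8\xff\xe0fake-jpeg-bytes")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": uri},
            {"type": "text", "text": "Extract the text from this image."},
        ],
    }
]
```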
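The `min_pixels`/`max_pixels` arguments in the last hunk bound how the processor resizes images before OCR. A sketch of that resizing logic, modeled on the `smart_resize` helper shipped with `qwen_vl_utils` (the packaged implementation may differ in detail; this only illustrates the pixel-budget idea):

```python
import math

PATCH = 28  # vision patch size assumed by Qwen2.5-VL processors

def smart_resize(height: int, width: int,
                 min_pixels: int = 512 * PATCH * PATCH,
                 max_pixels: int = 2048 * PATCH * PATCH) -> tuple[int, int]:
    """Round each side to a multiple of PATCH, then rescale (keeping the
    aspect ratio) so the total pixel count lands in [min_pixels, max_pixels]."""
    h = max(PATCH, round(height / PATCH) * PATCH)
    w = max(PATCH, round(width / PATCH) * PATCH)
    if h * w > max_pixels:
        scale = math.sqrt(height * width / max_pixels)
        h = math.floor(height / scale / PATCH) * PATCH
        w = math.floor(width / scale / PATCH) * PATCH
    elif h * w < min_pixels:
        scale = math.sqrt(min_pixels / (height * width))
        h = math.ceil(height * scale / PATCH) * PATCH
        w = math.ceil(width * scale / PATCH) * PATCH
    return h, w

# A 12-megapixel photo exceeds the 2048*28*28 budget and is scaled down.
h, w = smart_resize(3000, 4000)
```

Raising `max_pixels` preserves more detail for small text at the cost of more vision tokens per image.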