Update README.md

README.md CHANGED

@@ -3,13 +3,13 @@ license: apache-2.0
 ---
 
 
-# syntheticbot/
+# syntheticbot/ocr-qwen
 
 
 
 ## Introduction
 
-syntheticbot/
+syntheticbot/ocr-qwen is a fine-tuned model for Optical Character Recognition (OCR) tasks, derived from the base model [Qwen/Qwen2.5-VL-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct). This model is engineered for high accuracy in extracting text from images, including documents and scenes containing text.
 
 #### Key Enhancements for OCR:
 
@@ -41,7 +41,7 @@ pip install git+https://github.com/huggingface/transformers accelerate
 
 ## Quickstart
 
-The following examples illustrate the use of syntheticbot/
+The following examples illustrate the use of syntheticbot/ocr-qwen with 🤗 Transformers and `qwen_vl_utils` for OCR applications.
 
 ```
 pip install git+https://github.com/huggingface/transformers accelerate
@@ -61,12 +61,12 @@ from qwen_vl_utils import process_vision_info
 import torch
 
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
-    "syntheticbot/
+    "syntheticbot/ocr-qwen",
     torch_dtype="auto",
     device_map="auto"
 )
 
-processor = AutoProcessor.from_pretrained("syntheticbot/
+processor = AutoProcessor.from_pretrained("syntheticbot/ocr-qwen")
 
 messages = [
     {
@@ -114,11 +114,11 @@ import torch
 import json
 
 model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
-    "syntheticbot/
+    "syntheticbot/ocr-qwen",
     torch_dtype="auto",
     device_map="auto"
 )
-processor = AutoProcessor.from_pretrained("syntheticbot/
+processor = AutoProcessor.from_pretrained("syntheticbot/ocr-qwen")
 
 
 messages = [
@@ -215,7 +215,7 @@ print("Extracted Texts (Batch):\n", output_texts)
 
 
 ### 🤖 ModelScope
-For users in mainland China, ModelScope is recommended. Use `snapshot_download` for checkpoint management. Adapt model names to `syntheticbot/
+For users in mainland China, ModelScope is recommended. Use `snapshot_download` for checkpoint management. Adapt model names to `syntheticbot/ocr-qwen` in ModelScope implementations.
 
 
 ### More Usage Tips for OCR
@@ -223,7 +223,21 @@ For users in mainland China, ModelScope is recommended. Use `snapshot_download`
 Input images support local files, URLs, and base64 encoding.
 
 ```python
-messages = [
+messages = [
+    {
+        "role": "user",
+        "content": [
+            {
+                "type": "image",
+                "image": "http://path/to/your/document_image.jpg"
+            },
+            {
+                "type": "text",
+                "text": "Extract the text from this image URL."
+            },
+        ],
+    }
+]
 ```
 #### Image Resolution for OCR Accuracy
 
@@ -233,7 +247,7 @@ Higher resolution images typically improve OCR accuracy, especially for small text.
 min_pixels = 512 * 28 * 28
 max_pixels = 2048 * 28 * 28
 processor = AutoProcessor.from_pretrained(
-    "syntheticbot/
+    "syntheticbot/ocr-qwen",
     min_pixels=min_pixels, max_pixels=max_pixels
 )
 ```
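The new README text says input images may be local files, URLs, or base64 data, but only shows the URL case. A minimal sketch of building a base64 entry for the same `messages` format (the helper name and the data-URI convention are illustrative assumptions, not part of the README):

```python
import base64

def image_to_data_uri(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    # Encode raw image bytes as a data URI; hypothetical helper for illustration.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

# Fake JPEG header bytes stand in for a real file read from disk.
uri = image_to_data_uri(b"\xff\xd8\xff\xe0fake-jpeg-bytes")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": uri},
            {"type": "text", "text": "Extract the text from this image."},
        ],
    }
]
```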
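The `min_pixels`/`max_pixels` arguments in the last hunk bound how the processor resizes images before OCR. A sketch of that resizing logic, modeled on the `smart_resize` helper shipped with `qwen_vl_utils` (the packaged implementation may differ in detail; this only illustrates the pixel-budget idea):

```python
import math

PATCH = 28  # vision patch size assumed by Qwen2.5-VL processors

def smart_resize(height: int, width: int,
                 min_pixels: int = 512 * PATCH * PATCH,
                 max_pixels: int = 2048 * PATCH * PATCH) -> tuple[int, int]:
    """Round each side to a multiple of PATCH, then rescale (keeping the
    aspect ratio) so the total pixel count lands in [min_pixels, max_pixels]."""
    h = max(PATCH, round(height / PATCH) * PATCH)
    w = max(PATCH, round(width / PATCH) * PATCH)
    if h * w > max_pixels:
        scale = math.sqrt(height * width / max_pixels)
        h = math.floor(height / scale / PATCH) * PATCH
        w = math.floor(width / scale / PATCH) * PATCH
    elif h * w < min_pixels:
        scale = math.sqrt(min_pixels / (height * width))
        h = math.ceil(height * scale / PATCH) * PATCH
        w = math.ceil(width * scale / PATCH) * PATCH
    return h, w

# A 12-megapixel photo exceeds the 2048*28*28 budget and is scaled down.
h, w = smart_resize(3000, 4000)
```

Raising `max_pixels` preserves more detail for small text at the cost of more vision tokens per image.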