Emeritus-21 committed on
Commit 69eb4cf · verified · 1 Parent(s): e65f65b

Update README.md

Files changed (1)
  1. README.md +22 -22
README.md CHANGED
@@ -2,47 +2,47 @@
  language: en
  tags:
  - handwriting-recognition
- - vision
- - text-recognition
- - pytorch
+ - vision2seq
+ - qwen
  - image-to-text
- - document-processing
+ - htr
+ - tensorflow
  license: mit
  pipeline_tag: image-to-text
  library_name: transformers
  ---

- # 🖋️ Finetuned Full HTR Model
+ # 🖋️ Finetuned Full HTR Model (Qwen-based)

- This is a finetuned **Handwritten Text Recognition (HTR)** model trained to accurately recognize handwritten English text from scanned images or documents.
+ This is a **Qwen Vision2Seq** model fine-tuned for **Handwritten Text Recognition (HTR)**. It reads handwritten text from images and generates clean, editable output using advanced transformer-based image-to-text techniques.

- ## Features
+ ## 🔍 Model Summary

- - 📸 Input: Handwritten image
- - 🔤 Output: Recognized text
- - 🧠 Model: VisionEncoderDecoder (TrOCR-style architecture)
- - 🔧 Framework: Hugging Face Transformers
+ - **Model Architecture**: Qwen-Vision2Seq (Image encoder + Language decoder)
+ - **Framework**: TensorFlow (via Hugging Face Transformers)
+ - **Input**: Handwritten text image
+ - **Output**: Recognized plain text

- ## 🧪 Usage
+ ## 🧠 How to Use (with Hugging Face Transformers)

  ```python
- from transformers import VisionEncoderDecoderModel, AutoProcessor
+ from transformers import AutoProcessor, AutoModelForVision2Seq
  from PIL import Image
  import torch

- # Load model and processor
- model = VisionEncoderDecoderModel.from_pretrained("Emeritus-21/Finetuned-full-HTR-model")
- processor = AutoProcessor.from_pretrained("Emeritus-21/Finetuned-full-HTR-model")
+ # Load processor and model
+ processor = AutoProcessor.from_pretrained("Emeritus-21/Finetuned-full-HTR-model", trust_remote_code=True)
+ model = AutoModelForVision2Seq.from_pretrained("Emeritus-21/Finetuned-full-HTR-model", trust_remote_code=True)

  device = "cuda" if torch.cuda.is_available() else "cpu"
  model = model.to(device)

- # Load and preprocess image
+ # Load and process image
  image = Image.open("your_image.jpg").convert("RGB")
- inputs = processor(images=image, return_tensors="pt").pixel_values.to(device)
+ inputs = processor(images=image, return_tensors="pt").to(device)

- # Generate text
- generated_ids = model.generate(inputs)
- text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
+ # Generate prediction
+ generated_ids = model.generate(**inputs)
+ recognized_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

- print("Recognized Text:", text)
+ print("📝 Recognized Text:", recognized_text)
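
The preprocess, generate, and decode steps in the updated usage snippet could be folded into one reusable function. The following is a minimal sketch and is not part of the committed README: the function name `recognize_handwriting` and the `max_new_tokens` default of 128 are hypothetical, and it assumes, as the snippet does, that the processor accepts an image without an accompanying text prompt.

```python
def recognize_handwriting(image, processor, model, device="cpu", max_new_tokens=128):
    """Run one image through a processor/model pair and return the decoded text.

    `processor` and `model` are assumed to behave like the objects returned by
    AutoProcessor.from_pretrained(...) and AutoModelForVision2Seq.from_pretrained(...).
    """
    # Preprocess the image and move the resulting tensors to the target device.
    inputs = processor(images=image, return_tensors="pt").to(device)
    # Cap the output length; 128 is an illustrative value, not taken from the README.
    generated_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode the batch, take the first sequence, and trim surrounding whitespace.
    texts = processor.batch_decode(generated_ids, skip_special_tokens=True)
    return texts[0].strip()
```

With the `image`, `processor`, `model`, and `device` objects from the snippet, a call would look like `recognize_handwriting(image, processor, model, device=device)`. Passing the processor and model in as arguments keeps the function independent of any one checkpoint.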