Tags: Text Generation, Transformers, Safetensors, DIVEdoc, docvqa, distillation, VLM, document-understanding, OCR-free, custom_code
JayRay5 committed cb71bf4 (verified) · 1 parent: a5223ff

Update README.md

Files changed (1)
  1. README.md +29 -2
README.md CHANGED
@@ -6,6 +6,12 @@ datasets:
  - pixparse/docvqa-single-page-questions
  spaces:
  - JayRay5/DIVE-Doc-docvqa
+ tags:
+ - docvqa
+ - distillation
+ - VLM
+ - document-understanding
+ - OCR-free
  ---

  ## 1 Introduction
@@ -37,9 +43,30 @@ Trained on the [DocVQA dataset](https://openaccess.thecvf.com/content/WACV2021/h

  #### From the Transformers library
  ```bash
- from transformers import AutoModelForCausalLM
+ from transformers import AutoProcessor, AutoModelForCausalLM
+ from PIL import Image
+ import torch

- AutoModelForCausalLM.from_pretrained("JayRay5/DIVE-Doc-FRD",trust_remote_code=True)
+ processor = AutoProcessor.from_pretrained("JayRay5/DIVE-Doc-FRD", trust_remote_code=True)
+ model = AutoModelForCausalLM.from_pretrained("JayRay5/DIVE-Doc-FRD", trust_remote_code=True)
+
+ image = Image.open("your_image_document_path/image_document.png").convert("RGB")
+ question_example = "What is the name of the author?"
+
+ inputs = (
+     processor(text=question_example, images=image, return_tensors="pt", padding=True)
+     .to(model.device)
+     .to(model.dtype)
+ )
+ input_length = inputs["input_ids"].shape[-1]
+
+ with torch.inference_mode():
+     output_ids = model.generate(**inputs, max_new_tokens=100, do_sample=False)
+
+ generated_ids = output_ids[0][input_length:]
+ answer = processor.decode(generated_ids, skip_special_tokens=True)
+
+ print(answer)
  ```

  #### From the GitHub repository
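
The snippet added by this commit slices the generated sequence at `input_length` before decoding. That slicing suggests `generate` returns the prompt tokens followed by the newly generated ones, as is typical for decoder-only models in Transformers. Whether DIVE-Doc's custom code behaves exactly this way is an assumption here. The sketch below illustrates the slicing step on plain Python lists. The helper name `strip_prompt` and the token ids are hypothetical and do not come from the repository.

```python
def strip_prompt(output_ids, input_length):
    """Return only the tokens that follow the first `input_length` prompt tokens."""
    return output_ids[input_length:]


# Hypothetical token ids: the first four stand in for the encoded prompt,
# the remaining three for tokens produced during generation.
prompt_ids = [101, 7, 8, 9]
output_ids = prompt_ids + [42, 43, 2]

new_ids = strip_prompt(output_ids, len(prompt_ids))
print(new_ids)
```

If the slice were skipped, decoding the full sequence would echo the question before the answer, which is presumably why the README decodes only `generated_ids`.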