BBo09 committed on
Commit a5a13f0 · verified · 1 Parent(s): 9666161

Upload 11 files

Dockerfile ADDED
@@ -0,0 +1,17 @@
+ FROM python:3.9
+
+ WORKDIR /code
+
+ COPY --link --chown=1000 . .
+
+ RUN pip install --no-cache-dir -r requirements.txt
+
+ RUN mkdir -p /tmp/cache/
+ RUN chmod a+rwx -R /tmp/cache/
+
+ RUN apt-get update && apt-get install -y poppler-utils tesseract-ocr chromium
+
+ ENV TRANSFORMERS_CACHE=/tmp/cache/
+ ENV PYTHONUNBUFFERED=1 GRADIO_ALLOW_FLAGGING=never GRADIO_NUM_PORTS=1 GRADIO_SERVER_NAME=0.0.0.0 GRADIO_SERVER_PORT=7860 SYSTEM=spaces
+
+ CMD ["python", "app.py"]
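The Dockerfile sets GRADIO_SERVER_NAME and GRADIO_SERVER_PORT, which is presumably why app.py can call demo.launch() without arguments: Gradio reads both variables itself. The sketch below only illustrates that lookup. The helper name server_binding is hypothetical and not part of this commit, and the fallback values are the defaults Gradio documents for launch().

```python
import os
from typing import Mapping, Tuple


def server_binding(env: Mapping[str, str] = os.environ) -> Tuple[str, int]:
    """Hypothetical helper: resolve the host and port a Gradio app would bind to.

    Mirrors the environment-variable lookup that Gradio performs on its own;
    the app in this commit does not need to call anything like this.
    """
    host = env.get("GRADIO_SERVER_NAME", "127.0.0.1")  # local-only when unset
    port = int(env.get("GRADIO_SERVER_PORT", "7860"))  # default port when unset
    return host, port


# Values taken from the ENV line in the Dockerfile above.
container_env = {"GRADIO_SERVER_NAME": "0.0.0.0", "GRADIO_SERVER_PORT": "7860"}
print(server_binding(container_env))
```

Binding to 0.0.0.0 rather than the local-only default is what makes the server reachable from outside the container.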
README.md CHANGED
@@ -1,12 +1,12 @@
  ---
- title: ASK Pdf Test
- emoji: 📚
- colorFrom: indigo
- colorTo: red
- sdk: gradio
- sdk_version: 5.1.0
- app_file: app.py
+ tags:
+ - gradio-custom-component
+ - gradio-template-Fallback
+ title: Just Ask PDF
+ colorFrom: green
+ colorTo: blue
+ sdk: docker
  pinned: false
- ---
-
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
+ license: apache-2.0
+ emoji: ❓
+ ---
__init__.py ADDED
File without changes
app.py ADDED
@@ -0,0 +1,29 @@
+
+ import gradio as gr
+ from gradio_pdf import PDF
+ from pdf2image import convert_from_path
+ from transformers import pipeline
+ from pathlib import Path
+
+ dir_ = Path(__file__).parent  # directory holding the example PDFs
+
+ p = pipeline(
+     "document-question-answering",
+     model="impira/layoutlm-document-qa",
+ )
+
+ def qa(question: str, doc: str) -> str:
+     img = convert_from_path(doc)[0]  # render only the first page of the PDF
+     output = p(img, question)
+     return sorted(output, key=lambda x: x["score"], reverse=True)[0]['answer']  # highest-scoring answer
+
+
+ demo = gr.Interface(
+     qa,
+     [gr.Textbox(label="Question"), PDF(label="Document")],
+     gr.Textbox(),
+     examples=[["What is the total gross worth?", str(dir_ / "invoice_2.pdf")],
+               ["Who is being invoiced?", str(dir_ / "sample_invoice.pdf")]]
+ )
+
+ demo.launch()
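The qa function in app.py sorts the pipeline output by score and takes the first entry, which raises an IndexError if the list is ever empty. Below is a minimal sketch of the same selection step with an explicit empty-list guard. The helper name best_answer and the sample candidates are made up for illustration, and the shape of the pipeline output (a list of dicts with "score" and "answer" keys) is assumed from how app.py indexes it.

```python
from typing import Any, Dict, List, Optional


def best_answer(candidates: List[Dict[str, Any]]) -> Optional[str]:
    """Hypothetical helper: return the highest-scoring answer, or None if there is none."""
    if not candidates:
        return None
    # max() picks the same entry as sorting by score in descending order and taking [0].
    return max(candidates, key=lambda c: c["score"])["answer"]


# Made-up candidates in the shape the pipeline is assumed to return.
sample = [
    {"score": 0.12, "answer": "first guess"},
    {"score": 0.87, "answer": "second guess"},
    {"score": 0.40, "answer": "third guess"},
]
print(best_answer(sample))
print(best_answer([]))
```

Returning None for an empty result lets the caller decide what to show; the Gradio output Textbox would need a string, so a caller might substitute a short "no answer found" message.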
contract.pdf ADDED
Binary file (128 kB).
 
gradio_pdf-0.0.2-py3-none-any.whl ADDED
Binary file (304 kB).
 
gradio_pdf-0.0.3-py3-none-any.whl ADDED
Binary file (305 kB).
 
invoice_2.pdf ADDED
Binary file (372 kB).
 
requirements.txt ADDED
@@ -0,0 +1,5 @@
+ torch
+ transformers
+ pdf2image
+ pytesseract
+ gradio_pdf-0.0.3-py3-none-any.whl
sample_invoice.pdf ADDED
Binary file (34.7 kB).