Spaces: maclenn77/pdf-explainer

Juan Paulo Pérez-Tejada committed on Dec 8, 2023

Commit 7a95605 (unverified) · 1 Parent(s): f681f38

Add extract pdf content function (#3)

Files changed (4)

README.md

@@ -14,9 +14,15 @@ An Intelligent Assistant that explains you the content of a PDF file
 ## Deployment
-Deploy in HF with Streamlit
+Deploy in HF with Streamlit-
+## Local
+Run streamlit run app.py
 ## Stack
 - Streamlit
 - HuggingFace
+- Tika: For extracting pdf text
+- Java Runtime
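
Review note: the Stack list now names a Java Runtime because the `tika` Python package drives a Java-based Tika server. The following is a minimal sketch, not part of this commit, of how the app might check for Java before attempting to parse. `java_available` is a hypothetical helper, and its `which` parameter is an assumption added only so the check can be exercised without a real Java install.

```python
import shutil
from typing import Callable, Optional


def java_available(
    which: Callable[[str], Optional[str]] = shutil.which,
) -> bool:
    """Return True if a `java` executable can be found on PATH.

    Hypothetical helper: `which` defaults to shutil.which and is
    injectable so the lookup can be stubbed in tests.
    """
    return which("java") is not None
```

In `app.py`, a call such as `if not java_available(): st.error(...)` could run before the parsing step, so a missing runtime produces a readable message rather than a failure inside Tika.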

app.py

@@ -1,5 +1,12 @@
 """ A simple example of Streamlit. """
 import streamlit as st
-x = st.slider("Select a value")
-st.write(x, "squared is", x * x)
+from tika import parser
+pdf = st.file_uploader("Upload a file", type="pdf")
+if st.button("Extract text"):
+    if pdf is not None:
+        extracted_text = parser.from_file(pdf)
+        st.write(extracted_text["content"])
+    else:
+        st.write("Please upload a file of type: pdf")
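
Review note: the new code writes `extracted_text["content"]` directly. Tika's parser returns a dict, and its `"content"` value can be `None` when no text is recovered (for example, an image-only PDF), and often carries leading blank lines otherwise. A minimal sketch, not part of this commit, of a guard for that case; `extracted_text_or_notice` and the notice wording are hypothetical.

```python
from typing import Any, Mapping

# Hypothetical fallback message; the wording is an assumption.
DEFAULT_NOTICE = "No text could be extracted from this PDF."


def extracted_text_or_notice(
    parsed: Mapping[str, Any], notice: str = DEFAULT_NOTICE
) -> str:
    """Return the stripped text from a Tika-style result dict.

    Falls back to `notice` when "content" is missing, None, or
    contains only whitespace.
    """
    content = parsed.get("content")
    if content is None or not content.strip():
        return notice
    return content.strip()
```

With this helper, the display line could become `st.write(extracted_text_or_notice(extracted_text))`, leaving the rest of the button handler unchanged.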

requirements.txt

@@ -1,6 +1,6 @@
 openai
 langchain
-pdfminer
+tika
 chromadb
 sentence_transformers
 streamlit

wk_flow_requirements.txt

@@ -1,2 +1,3 @@
 streamlit
+tika
 pylint