gradio
pytesseract
pdf2image
PyPDF2
Pillow
python-docx