openai chardet pdfminer2 backoff pandas openpyxl PyPDF2 spire.doc