openai
streamlit
PyPDF2
pandas
tqdm
docling
anthropic
langchain-text-splitters
langchain-community
chromadb
tiktoken