📚 PDF Q&A with Hybrid Search + LLM

This project is a question-answering (QA) system for PDF documents. The workflow is:

- The user uploads a PDF document.
- The text is automatically processed and split into chunks.
- Embeddings are stored in the Qdrant vector database, and a hybrid retriever (BM25 + Qdrant) is built.
- The user asks natural-language questions; the system retrieves the relevant context from the PDF and generates an answer with a Large Language Model (LLM).

It combines dense semantic search with BM25 keyword search to improve retrieval accuracy.

The system is built from the following components:

- LangChain → Orchestration of retrievers and chains.
- Hugging Face + Together API → LLM endpoint (Qwen3-235B-A22B-Instruct-2507).
- Qdrant → Vector database for storing embeddings.
- BM25 → Keyword-based retriever.
- Docling → Loader that extracts text from the PDF into Markdown.
- Transformers → Tokenizer used for chunking text.
- Gradio → Web interface.
- dotenv → Loads API keys from a .env file so they stay out of the source code.
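
As a rough illustration of the key-handling step, the sketch below reads an API key from the environment and fails early if it is missing. The variable name TOGETHER_API_KEY and the helper get_api_key are assumptions made for this example; the README does not name them. In the real app, python-dotenv's load_dotenv() would presumably be called first to copy the values from the .env file into the environment.

```python
import os
from typing import Mapping


def get_api_key(
    name: str = "TOGETHER_API_KEY",  # hypothetical variable name, not from the README
    env: Mapping[str, str] = os.environ,
) -> str:
    """Return the API key stored under `name`, or raise a clear error if it is absent."""
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"{name} is not set; add it to your .env file")
    return value


# Demo with an explicit mapping so the example does not depend on the real environment.
demo_env = {"TOGETHER_API_KEY": "  sk-example  "}
print(get_api_key(env=demo_env))
```

Passing the environment in as a parameter keeps the helper easy to test and avoids printing or hard-coding a real key.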

Upload PDF
- The file is loaded with DoclingLoader.
- Text is split into chunks using a Hugging Face tokenizer.
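
Token-based chunking can be sketched as follows. This is a minimal illustration, not the project's code: the function chunk_by_tokens, the chunk size, and the overlap are all assumptions, and a plain whitespace split stands in for the Hugging Face tokenizer so the example runs without extra dependencies.

```python
from typing import Callable, List


def chunk_by_tokens(
    text: str,
    tokenize: Callable[[str], List[str]],
    max_tokens: int = 200,   # assumed chunk size; the README does not state one
    overlap: int = 20,       # assumed overlap between neighbouring chunks
) -> List[str]:
    """Split text into chunks of at most max_tokens tokens, sharing `overlap` tokens."""
    if max_tokens <= 0 or not 0 <= overlap < max_tokens:
        raise ValueError("need max_tokens > 0 and 0 <= overlap < max_tokens")
    tokens = tokenize(text)
    step = max_tokens - overlap
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + max_tokens]
        chunks.append(" ".join(window))
        if start + max_tokens >= len(tokens):
            break  # this window already reached the end of the text
    return chunks


# Stand-in tokenizer (assumption): whitespace split instead of a model tokenizer.
sample = "one two three four five six seven eight nine ten"
chunks = chunk_by_tokens(sample, str.split, max_tokens=4, overlap=1)
print(chunks)
```

Counting tokens rather than characters keeps each chunk within the token limit of whichever model the tokenizer belongs to; the overlap reduces the chance that a sentence is cut off from its context at a chunk boundary.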
Build Hybrid Search
- Embeddings are created using sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2.
- Chunks are stored in Qdrant.
- The dense retriever (embeddings) and the BM25 retriever (keywords) are combined with weights 0.6 (dense) and 0.4 (BM25).
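
The README does not say how the two result lists are merged. One common approach is weighted reciprocal rank fusion, which LangChain's ensemble retriever is understood to use; the sketch below assumes that scheme. The function weighted_rrf, the constant c = 60, and the chunk IDs are illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def weighted_rrf(
    rankings: Sequence[List[str]],
    weights: Sequence[float],
    c: int = 60,  # conventional smoothing constant; an assumption here
) -> List[str]:
    """Fuse ranked lists of IDs: score(d) = sum over lists of weight / (c + rank of d)."""
    if len(rankings) != len(weights):
        raise ValueError("one weight per ranking is required")
    scores: Dict[str, float] = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += weight / (c + rank)
    # Highest fused score first; ties are broken by ID so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


dense_hits = ["chunk_a", "chunk_b", "chunk_c"]  # hypothetical dense results, best first
bm25_hits = ["chunk_c", "chunk_a", "chunk_d"]   # hypothetical BM25 results, best first
fused = weighted_rrf([dense_hits, bm25_hits], weights=[0.6, 0.4])
print(fused)
```

Because the fusion works on ranks rather than raw scores, the dense similarity scores and the BM25 scores never need to be put on a common scale; the weights only control how much each ranking contributes.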
Ask Questions
- The user enters a question.
- Relevant chunks are retrieved.
- A prompt is built with context + question.
- The LLM generates the answer (at most three sentences).
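
The prompt-building step might look roughly like this. The template wording and the helper build_prompt are assumptions for illustration; only the "context + question" structure and the three-sentence limit come from the description above.

```python
from typing import List

# Hypothetical template; the project's actual wording is not shown in the README.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below. "
    "Use at most three sentences. If the context does not contain "
    "the answer, say that you don't know.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_prompt(chunks: List[str], question: str) -> str:
    """Join the retrieved chunks into one context block and fill in the template."""
    context = "\n\n".join(chunk.strip() for chunk in chunks if chunk.strip())
    return PROMPT_TEMPLATE.format(context=context, question=question.strip())


prompt = build_prompt(
    ["Qdrant stores the embeddings.", "BM25 ranks chunks by keyword overlap."],
    "Where are the embeddings stored?",
)
print(prompt)
```

The resulting string is what would be sent to the LLM endpoint; stating the length limit inside the prompt is one simple way to keep answers short.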

Key features:

- Upload any PDF document.
- Hybrid search, intended to give more accurate retrieval than embeddings or BM25 alone.
- Context-aware answers grounded in the retrieved chunks.
- Cached retriever: each PDF is uploaded and processed once, not again for every question.
- Simple Gradio UI with a file upload and a question box.