rag-document-qa / README.md
Nav772's picture
Upload README.md with huggingface_hub
e80959b verified

A newer version of the Gradio SDK is available: 6.3.0

Upgrade
metadata
title: RAG Document Q&A System
emoji: πŸ“š
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: mit

πŸ“š RAG Document Q&A System

A Retrieval-Augmented Generation (RAG) system that answers questions about uploaded PDF documents.

🎯 What This Does

  1. Upload a PDF document
  2. Process the document (chunks it and creates embeddings)
  3. Ask questions about the document
  4. Get accurate answers with source citations

πŸ—οΈ Architecture

User Question β†’ Embedding β†’ Vector Search β†’ Retrieved Chunks β†’ LLM β†’ Answer
Component Technology
Embeddings sentence-transformers/all-MiniLM-L6-v2 (384 dimensions)
Vector Store FAISS (Facebook AI Similarity Search)
Text Splitter RecursiveCharacterTextSplitter (1000 chars, 200 overlap)
LLM HuggingFaceH4/zephyr-7b-beta via Inference API
Framework LangChain + Gradio

πŸ› οΈ Development Challenges

This project encountered several technical challenges during development:

Challenge 1: LangChain API Changes

Problem: Import errors due to LangChain's package restructuring.

# Old (broken)
from langchain.document_loaders import PyPDFLoader
from langchain.chains import RetrievalQA

# New (working)
from langchain_community.document_loaders import PyPDFLoader
# RetrievalQA deprecated β†’ use LCEL chains instead

Lesson: Fast-evolving libraries require checking current documentation.

Challenge 2: PDF Download Issues

Problem: PdfStreamError: Stream has ended unexpectedly Cause: Incomplete download due to missing User-Agent header. Solution: Added proper headers to HTTP request.

Challenge 3: LLM Response Quality

Problem: FLAN-T5-Large produced fragment-like responses instead of complete answers. Attempted Solutions:

  1. Adjusted generation parameters β€” minimal improvement
  2. Modified prompt format β€” slight improvement
  3. Switched to FLAN-T5-XL β€” OOM error

Final Solution: Switched to Zephyr-7B-beta, which produces comprehensive answers.

πŸ“ Limitations

  • Only processes PDF documents
  • English language only
  • Free Inference API has rate limits

πŸ‘€ Author

Nav772 - Built as part of AI Engineering portfolio

πŸ“š Related Projects