title: RAG From Scratch
emoji: 📚
colorFrom: yellow
colorTo: blue
sdk: docker
pinned: false
license: mit
RAG from Scratch
Question
What actually happens inside a retrieval-augmented generation system?
System Boundary
This Space keeps every stage of the pipeline visible: PDF parsing, chunking, vectorization, vector search, context assembly, and answer generation. The goal is to expose the mechanics, not to wrap RAG in an agent framework.
Method
Uploaded PDFs are split into overlapping text chunks. Each chunk is converted into a lightweight lexical vector of term counts, and a user question retrieves the closest passages by cosine similarity between its own term counts and those of each chunk. The language model receives only the retrieved context and is asked to answer from it while staying aware of which source each passage came from.
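A minimal sketch of chunking and retrieval, assuming whitespace word splitting for chunks and lowercase alphanumeric tokens for term counts. The function names (`chunk_text`, `term_counts`, `cosine`, `retrieve`) and the default sizes are illustrative assumptions and are not taken from `app.py`.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` words; consecutive windows share `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


def term_counts(text: str) -> Counter:
    """Lexical vector: how often each lowercase alphanumeric token occurs."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[str], k: int = 3) -> list[tuple[float, str]]:
    """Return the k chunks most similar to the question as (score, chunk) pairs."""
    q = term_counts(question)
    scored = [(cosine(q, term_counts(c)), c) for c in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:k]
```

Calling `retrieve("where is the cat", ["the cat sat on the mat", "stocks fell sharply today"], k=1)` returns the first passage, because it is the only one that shares terms with the question.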
Technique
Retrieval-augmented generation separates memory from generation. Instead of asking the model to answer from its parameters alone, the system first searches an external corpus and then conditions the model on the retrieved evidence.
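The "retrieve, then condition" step can be sketched as plain prompt assembly. This is an illustration under assumptions: the function name `build_prompt`, the `(source_name, chunk_text)` input shape, and the instruction wording are hypothetical, not the prompt this Space actually sends.

```python
def build_prompt(question: str, retrieved: list[tuple[str, str]]) -> str:
    """Assemble a prompt from (source_name, chunk_text) pairs and a question."""
    # Number each passage so the model can cite it by its bracketed index.
    context = "\n\n".join(
        f"[{i}] ({source}) {text}"
        for i, (source, text) in enumerate(retrieved, start=1)
    )
    return (
        "Answer the question using only the context below. "
        "Cite passages by their bracketed number. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

The model never sees the corpus itself, only the passages placed in `Context`, so whatever retrieval leaves out is invisible to generation.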
The important design choices are chunk size, overlap, retrieval representation, distance metric, number of retrieved chunks, and prompt format. Each one changes the final answer quality.
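One concrete way chunk size and overlap interact is the number of chunks they produce, which sets how many vectors must be scored per question. The sketch below assumes a sliding window whose stride is `size - overlap` words; the helper name `num_chunks` is hypothetical, and the app's own chunker may count differently.

```python
import math


def num_chunks(n_words: int, size: int, overlap: int) -> int:
    """Number of sliding windows needed to cover n_words with the given size and overlap."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    if n_words <= 0:
        return 0
    if n_words <= size:
        return 1
    # After the first window, each further window adds (size - overlap) new words.
    return math.ceil((n_words - overlap) / (size - overlap))
```

Under this assumption, a 1000-word document with 200-word chunks yields 5 chunks at zero overlap, 7 at an overlap of 50, and 9 at an overlap of 100: more overlap protects sentences that straddle a boundary, at the cost of a larger index.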
Output
The app returns an answer, the retrieved chunks, similarity scores, and source names.
Why It Matters
Most RAG failures are retrieval failures disguised as generation failures. This demo makes retrieval inspectable.
What To Notice
If the retrieved chunks are weak, the generated answer will be weak even if the language model is strong. The retrieved evidence is therefore the first object to debug.
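A small diagnostic along these lines can make weak retrieval explicit before the answer is read. It is a sketch under assumptions: the name `diagnose_retrieval` and the `min_top_score` threshold of 0.2 are hypothetical and would need tuning against real queries.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens of a text, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def diagnose_retrieval(
    question: str,
    chunks: list[str],
    scores: list[float],
    min_top_score: float = 0.2,  # hypothetical cutoff, not a value from the app
) -> dict:
    """Flag weak retrieval and list question terms that no retrieved chunk contains."""
    covered = set().union(*(tokens(c) for c in chunks))
    return {
        "weak": not scores or max(scores) < min_top_score,
        "missing_terms": sorted(tokens(question) - covered),
    }
```

If a content word from the question, such as a product name, shows up under `missing_terms`, the problem lies in the corpus or the chunking rather than in the language model.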
Effect In Practice
RAG lets teams build assistants over private or changing documents without fine-tuning the model every time the knowledge base changes.
Hugging Face Extension
This Space can grow into a full retrieval benchmark by publishing example documents, queries, expected citations, and answer-quality labels as a Hugging Face Dataset.
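As a sketch of what such a benchmark could measure, the function below computes recall@k for one query from its expected citations. The record fields (`query`, `expected_citations`) and the chunk identifier format are assumptions for illustration, not an existing dataset schema.

```python
def recall_at_k(retrieved_ids: list[str], expected_ids: list[str], k: int) -> float:
    """Fraction of expected citations that appear among the top-k retrieved chunk ids."""
    expected = set(expected_ids)
    if not expected:
        raise ValueError("expected_ids must not be empty")
    return len(set(retrieved_ids[:k]) & expected) / len(expected)


# Hypothetical benchmark record: one query with the chunks a correct answer should cite.
record = {
    "query": "What is the refund window?",
    "expected_citations": ["policy.pdf#2", "faq.pdf#7"],
}

# Ranked chunk ids as a retriever might return them for this query (made-up values).
ranked = ["policy.pdf#2", "intro.pdf#0", "faq.pdf#7"]
score = recall_at_k(ranked, record["expected_citations"], k=2)
```

Averaging this value over all queries gives a retrieval score that can be tracked separately from answer-quality labels.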
Limitations
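The app uses lexical retrieval (cosine similarity over term counts) so that it runs reliably on small CPU Spaces. Because it matches only on shared terms, it can miss passages that paraphrase the question in different words. Production systems should add dense embeddings, document metadata, reranking, evaluation sets, and hallucination checks.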
Run Locally
pip install -r requirements.txt
python app.py