title: Full RAG Assistant
emoji: 💻
colorFrom: purple
colorTo: gray
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false

📚 RAG AI Assistant: Document-Aware Chatbot Powered by Retrieval-Augmented Generation

Welcome to the RAG AI Assistant, a lightweight, open-source document question-answering system that lets you chat with your own knowledge base.

Upload documents → Rebuild the index → Ask questions → Get grounded, explainable answers sourced from your files.

Built with:

FAISS for fast vector search

Sentence Transformers for embeddings

Transformers QA pipeline for extraction

Gradio Blocks UI for clean chat + KB management

🚀 Features

🧠 Retrieval-Augmented Generation (RAG)

The assistant retrieves the document chunks most relevant to your question and uses an extractive QA model to pull the answer directly from that text, so every answer is grounded in your files.

📂 Knowledge Base Uploads

You can upload your own documents directly inside the Space:

Supported formats:

txt, md, pdf, docx, doc

⚙️ Rebuildable FAISS Index

After uploading files, click Rebuild index to update the vector store. There is no need to restart the Space.

💬 Interactive Chat

Ask free-form questions about your uploaded documents.

The model will:

Retrieve relevant context

Extract answers

Show confidence scores

Cite the source document

🔍 Full Transparency

If no answer is found, you'll receive helpful suggestions and context previews.

🛠️ How It Works

  1. Chunking

Documents are split into overlapping chunks:

Size: 500 characters

Overlap: 50 characters
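The chunking step can be sketched as a sliding character window. This is a minimal illustration under the stated size and overlap, not the code in app.py; the function name `chunk_text` and its parameters are hypothetical.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # with the defaults, each window starts 450 characters later
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks


# Example: a 1000-character document yields three chunks.
sample = "".join(chr(97 + i % 26) for i in range(1000))
pieces = chunk_text(sample)
```

The overlap means the last 50 characters of one chunk reappear at the start of the next, so a sentence that straddles a boundary is still seen whole by at least one chunk in most cases.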

  2. Embedding

Each chunk is embedded with:

sentence-transformers/all-MiniLM-L6-v2

  3. Vector Search

FAISS (IndexFlatIP) ranks chunks by inner product with the query embedding, which is equivalent to cosine similarity when the embeddings are L2-normalized.
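An inner-product index only behaves like cosine similarity if vectors are normalized to unit length first. The sketch below is a pure-Python stand-in for the exhaustive search a flat inner-product index performs, written to show the arithmetic; it does not use FAISS, and the helper names `l2_normalize` and `search` are hypothetical. Whether app.py normalizes its embeddings is an assumption here, not something this README states.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def search(query, vectors, k=3):
    """Return (index, score) pairs for the k highest inner products.

    Both sides are normalized, so each score is a cosine similarity.
    """
    q = l2_normalize(query)
    scored = []
    for i, vec in enumerate(vectors):
        v = l2_normalize(vec)
        scored.append((i, sum(a * b for a, b in zip(q, v))))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 2-D "embeddings"; real all-MiniLM-L6-v2 vectors have 384 dimensions.
hits = search([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]], k=2)
```

With FAISS itself, the usual equivalent is to normalize the embedding matrix (for example with `faiss.normalize_L2`) before adding it to the index and before each query.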

  4. Answer Extraction

A QA model extracts precise answers:

deepset/roberta-base-squad2
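Because the QA model runs on each retrieved chunk, the app needs some rule for choosing one answer and for deciding when to report that nothing was found. The sketch below shows one plausible rule. It assumes each result is a dict with `answer` and `score` keys, which is the shape the Transformers question-answering pipeline returns; the function name `pick_best_answer` and the `min_score` threshold of 0.2 are hypothetical, not values taken from app.py.

```python
def pick_best_answer(candidates, min_score=0.2):
    """Choose the highest-scoring non-empty answer across retrieved chunks.

    candidates: list of (source_name, result) pairs, where result is a dict
    with at least "answer" and "score" keys.
    Returns a dict with answer, score and source, or None if nothing clears
    the (assumed) confidence threshold.
    """
    best = None
    for source, result in candidates:
        answer = result["answer"].strip()
        if not answer:
            continue  # skip empty spans, i.e. no answer found in this chunk
        if best is None or result["score"] > best["score"]:
            best = {"answer": answer, "score": result["score"], "source": source}
    if best is None or best["score"] < min_score:
        return None
    return best


# Hand-written results standing in for real pipeline output.
results = [
    ("report.pdf", {"answer": "42", "score": 0.9, "start": 0, "end": 2}),
    ("notes.md", {"answer": "  ", "score": 0.95, "start": 0, "end": 2}),
    ("draft.txt", {"answer": "7", "score": 0.3, "start": 5, "end": 6}),
]
best = pick_best_answer(results)
```

Returning `None` gives the chat layer a clear signal to fall back to suggestions and context previews instead of showing a low-confidence span.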

🧪 Usage Instructions

  1. Upload Your Documents

Go to the Knowledge Base tab and upload as many files as you want.

  2. Rebuild the Index

Click Rebuild index to process the files and generate embeddings.

  3. Start Asking Questions

Switch to the Chat tab and ask questions like:

What is the main topic of the report?

Summarize the key findings.

What does section 3 say about metrics?

📁 Project Structure

├── app.py            # Main application logic
├── config.yaml       # Optional configuration file
├── knowledge_base/   # User-uploaded documents
├── index/            # Saved FAISS index + metadata
└── README.md         # This file

🧩 Tech Stack

Python 3.10

Gradio

FAISS

Sentence Transformers

Transformers (Hugging Face)

PyPDF2 / python-docx

🧱 Roadmap (Upcoming Enhancements)

🔄 Streaming responses (LLM-style typing)

📊 Document preview inside the chat

📝 Source highlighting of extracted spans

🎨 Theming + cleaner chat UI

⚡ Optional lightweight QA model for faster inference

Feel free to suggest improvements or contribute!