Commit 10c7860 (verified) by sofzcc · Parent: 456f6e2 · Update README.md

Files changed (1): README.md (+127, −12)
---
title: Full RAG Assistant
emoji: 💻
colorFrom: purple
colorTo: gray
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
---

Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
# 📚 RAG AI Assistant: Document-Aware Chatbot Powered by Retrieval-Augmented Generation

Welcome to the RAG AI Assistant, a lightweight, open-source document question-answering system that lets you chat with your own knowledge base.

Upload documents → Rebuild the index → Ask questions → Get grounded, explainable answers sourced from your files.

Built with:

- FAISS for fast vector search
- Sentence Transformers for embeddings
- A Transformers QA pipeline for answer extraction
- A Gradio Blocks UI for clean chat and knowledge-base management

## 🚀 Features

### 🧠 Retrieval-Augmented Generation (RAG)

The assistant retrieves the most relevant document chunks and runs an extractive QA model over them to produce accurate, grounded answers.

### 📂 Knowledge Base Uploads

You can upload your own documents directly inside the Space.

Supported formats: `txt`, `md`, `pdf`, `docx`, `doc`
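As a rough sketch, loading these formats might look like the following. The helper name `extract_text` is hypothetical (the actual loader lives in `app.py`); it leans on PyPDF2 and python-docx, both listed in the tech stack below.

```python
from pathlib import Path

def extract_text(path: str) -> str:
    """Hypothetical loader: return plain text from one uploaded file."""
    suffix = Path(path).suffix.lower()
    if suffix in {".txt", ".md"}:
        return Path(path).read_text(encoding="utf-8", errors="ignore")
    if suffix == ".pdf":
        from PyPDF2 import PdfReader  # lazy import: only needed for PDFs
        return "\n".join(page.extract_text() or "" for page in PdfReader(path).pages)
    if suffix in {".docx", ".doc"}:
        from docx import Document  # python-docx; legacy .doc may need extra handling
        return "\n".join(p.text for p in Document(path).paragraphs)
    raise ValueError(f"Unsupported format: {suffix}")
```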
### ⚙️ Rebuildable FAISS Index

After uploading files, click **Rebuild index** to update the vector store instantly; there is no need to restart the Space.

### 💬 Interactive Chat

Ask free-form questions about your uploaded documents. The model will:

- Retrieve relevant context
- Extract answers
- Show confidence scores
- Cite the source document

### 🔍 Full Transparency

If no answer is found, you'll receive helpful suggestions and context previews.

## 🛠️ How It Works

### 1. Chunking

Documents are split into overlapping chunks:

- Size: 500 characters
- Overlap: 50 characters
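The chunking step above can be sketched in a few lines of Python. This is illustrative only; the real logic lives in `app.py`.

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (default: 500/50)."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    step = size - overlap
    # Stop once the remaining tail is already covered by the previous chunk.
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]
```

With the defaults, a 1,000-character document yields three chunks, and each neighboring pair shares 50 characters at the boundary.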
### 2. Embedding

Each chunk is embedded with `sentence-transformers/all-MiniLM-L6-v2`.

### 3. Vector Search

FAISS (`IndexFlatIP`) finds the closest matches via inner product, which equals cosine similarity when the embeddings are L2-normalized.
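A small self-contained sketch of why inner-product search behaves as cosine similarity once vectors are normalized. Toy NumPy vectors stand in for the 384-dimensional MiniLM embeddings; in the app, `faiss.normalize_L2` would play the normalization role.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(5, 8)).astype("float32")   # toy chunk embeddings
query = rng.normal(size=(1, 8)).astype("float32")  # toy query embedding

def l2_normalize(x: np.ndarray) -> np.ndarray:
    return x / np.linalg.norm(x, axis=1, keepdims=True)

# Inner product over unit vectors is exactly cosine similarity,
# which is why IndexFlatIP plus normalization gives a cosine search.
scores = l2_normalize(query) @ l2_normalize(docs).T
top_k = np.argsort(-scores[0])[:3]  # indices of the 3 closest chunks
```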
### 4. Answer Extraction

A QA model extracts precise answers: `deepset/roberta-base-squad2`.

## 🧪 Usage Instructions

### 1. Upload Your Documents

Go to the **Knowledge Base** tab and upload as many files as you want.

### 2. Rebuild the Index

Click **Rebuild index** to process the files and generate embeddings.

### 3. Start Asking Questions

Switch to the **Chat** tab and ask questions like:

- What is the main topic of the report?
- Summarize the key findings.
- What does section 3 say about metrics?

## 📁 Project Structure

    ├── app.py            # Main application logic
    ├── config.yaml       # Optional configuration file
    ├── knowledge_base/   # User-uploaded documents
    ├── index/            # Saved FAISS index + metadata
    └── README.md         # This file
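The schema of `config.yaml` is not documented here; as a purely hypothetical example, assuming it simply mirrors the parameters mentioned above, it might look like this (all key names and the `top_k` value are illustrative, not the app's actual schema):

```yaml
# Hypothetical config.yaml; key names are illustrative only
embedding_model: sentence-transformers/all-MiniLM-L6-v2
qa_model: deepset/roberta-base-squad2
chunk_size: 500        # characters per chunk
chunk_overlap: 50      # characters shared between neighboring chunks
top_k: 3               # retrieved chunks per question (illustrative value)
index_dir: index/
knowledge_base_dir: knowledge_base/
```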
## 🧩 Tech Stack

- Python 3.10
- Gradio
- FAISS
- Sentence Transformers
- Transformers (Hugging Face)
- PyPDF2 / python-docx

## 🧱 Roadmap (Upcoming Enhancements)

- 🔄 Streaming responses (LLM-style typing)
- 📊 Document preview inside the chat
- 📝 Source highlighting of extracted spans
- 🎨 Theming and a cleaner chat UI
- ⚡ An optional lightweight QA model for faster inference

Feel free to suggest improvements or contribute!