---
title: Full RAG Assistant
emoji: 💻
colorFrom: purple
colorTo: gray
sdk: gradio
sdk_version: 6.0.1
app_file: app.py
pinned: false
---

📚 RAG AI Assistant: Document-Aware Chatbot Powered by Retrieval-Augmented Generation

Welcome to the RAG AI Assistant, a lightweight, open-source document question-answering system that lets you chat with your own knowledge base.

Upload documents → Rebuild the index → Ask questions → Get grounded, explainable answers sourced from your files.

Built with:

FAISS for fast vector search

Sentence Transformers for embeddings

Transformers QA pipeline for extraction

Gradio Blocks UI for clean chat + KB management

🚀 Features
🧠 Retrieval-Augmented Generation (RAG)

The assistant retrieves the document chunks most relevant to your question and uses an extractive QA model to pull answers directly from them, so every answer is grounded in your files.

📂 Knowledge Base Uploads

Upload your own documents directly inside the Space.

Supported formats:

txt, md, pdf, docx, doc

⚙️ Rebuildable FAISS Index

After uploading files, click Rebuild index to update the vector store without restarting the Space.

💬 Interactive Chat

Ask free-form questions about your uploaded documents.

The assistant will:

Retrieve relevant context

Extract answers

Show confidence scores

Cite the source document

🔍 Full Transparency

If no answer is found, you'll receive helpful suggestions and context previews.

🛠️ How It Works
1. Chunking

Documents are split into overlapping chunks:

Size: 500 characters

Overlap: 50 characters
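
As a rough illustration of the chunking step, here is a minimal sketch of a sliding character window using the size and overlap listed above. The function name `chunk_text` is hypothetical and is not taken from app.py, which may split text differently (for example on sentence boundaries).

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into windows of `size` characters.

    Consecutive windows share `overlap` characters, so a sentence cut at
    one chunk boundary still appears whole in a neighbouring chunk.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")

    step = size - overlap  # 450 characters with the defaults
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this window already reaches the end of the text
    return chunks
```

With the defaults, a new chunk starts every 450 characters, so a 1,000-character document yields three chunks starting at offsets 0, 450, and 900.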

2. Embedding

Each chunk is embedded with:

sentence-transformers/all-MiniLM-L6-v2
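
A minimal sketch of the embedding step, assuming the `sentence-transformers` package is installed. The helper name `embed_chunks` and the optional `model` argument are hypothetical and only there so the function can be tried with a stub; app.py may load and call the model differently.

```python
def embed_chunks(chunks, model=None):
    """Return one L2-normalized embedding per chunk."""
    if model is None:
        # Imported lazily so the function can also be called with a stub model.
        from sentence_transformers import SentenceTransformer
        model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

    # normalize_embeddings=True scales every vector to unit length, which
    # lets an inner-product index behave like a cosine-similarity index.
    return model.encode(list(chunks), normalize_embeddings=True)
```

This model produces 384-dimensional vectors, so a FAISS index built on top of it would need to be created with dimension 384.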

3. Vector Search

FAISS (IndexFlatIP) performs an exact search and ranks chunks by inner product, which equals cosine similarity when the embeddings are L2-normalized.
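
To make the search step concrete, here is a dependency-free sketch of what a flat inner-product index computes: a brute-force scan that scores every stored vector against the query. The function names are hypothetical and this is not the app's code. With FAISS itself the equivalent would typically be `faiss.normalize_L2` on the vectors, followed by `add` and `search` on an `IndexFlatIP`.

```python
import math


def normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def search(query, vectors, k=3):
    """Return the top-k (score, index) pairs by inner product.

    Both sides are normalized first, so each score is the cosine
    similarity between the query and the stored vector.
    """
    q = normalize(query)
    scored = [(inner_product(q, normalize(v)), i) for i, v in enumerate(vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

Because of the normalization, scaling the query changes nothing: `search([2, 0], vectors)` and `search([1, 0], vectors)` rank the stored vectors identically.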

4. Answer Extraction

A QA model extracts precise answers:

deepset/roberta-base-squad2
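
A minimal sketch of the extraction step, assuming a `transformers` version that provides the `question-answering` pipeline. The helper name `best_answer`, the `(source, text)` chunk layout, and the `min_score` threshold of 0.1 are all hypothetical choices made for illustration; the app may select and filter answers differently.

```python
def best_answer(question, chunks, qa=None, min_score=0.1):
    """Run the QA model over each retrieved chunk and keep the best span.

    chunks: list of (source_name, chunk_text) pairs.
    Returns {"answer", "score", "source"} or None if nothing is confident enough.
    """
    if qa is None:
        # Imported lazily so the function can also be called with a stub.
        from transformers import pipeline
        qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

    best = None
    for source, context in chunks:
        # The pipeline returns a dict that includes "answer" and "score".
        result = qa(question=question, context=context)
        if best is None or result["score"] > best["score"]:
            best = {
                "answer": result["answer"],
                "score": result["score"],
                "source": source,
            }

    # Treat low-confidence or empty spans as "no answer found".
    if best is None or best["score"] < min_score or not best["answer"].strip():
        return None
    return best
```

Keeping the source name next to each score is what makes it possible to show a confidence value and cite the originating document alongside the answer.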

🧪 Usage Instructions
1. Upload Your Documents

Go to the Knowledge Base tab and upload as many files as you want.

2. Rebuild the Index

Click Rebuild index to process the files and generate embeddings.

3. Start Asking Questions

Switch to the Chat tab and ask questions like:

What is the main topic of the report?
Summarize the key findings.
What does section 3 say about metrics?

📁 Project Structure
├── app.py               # Main application logic
├── config.yaml          # Optional configuration file
├── knowledge_base/      # User-uploaded documents
├── index/               # Saved FAISS index + metadata
└── README.md            # This file

🧩 Tech Stack

Python 3.10

Gradio

FAISS

Sentence Transformers

Transformers (Hugging Face)

PyPDF2 / python-docx

🧱 Roadmap (Upcoming Enhancements)

🔄 Streaming responses (LLM-style typing)

📊 Document preview inside the chat

📝 Source highlighting of extracted spans

🎨 Theming + cleaner chat UI

⚡ Optional lightweight QA model for faster inference

Feel free to suggest improvements or contribute!