sourize
committed on
Commit
·
be598b9
1
Parent(s):
24111c8
Updated main.py
README.md
CHANGED
@@ -1,42 +1,47 @@
-

-

-

-
-- Chatbot interface with persistent chat history
-- Document-specific Q&A using Retrieval-Augmented Generation (RAG)
-- General knowledge support (open-domain QA)
-- Anti-hallucination strategies
-- Runs fully locally using Hugging Face models

-

-
-- `PyMuPDF` - To extract text from PDF files
-- `sentence-transformers` - For vector embeddings
-- `FAISS` - For fast similarity search
-- `transformers` - Lightweight Hugging Face inference with distilBERT
-- `Python` - Core application logic

-

-
-2. PDF is chunked and embedded
-3. FAISS builds an index of these chunks
-4. Ask any question:
-- For general questions: uses general knowledge
-- For doc-specific: retrieves relevant context and answers accordingly

-

-
-
-

-##

 ```bash
 pip install streamlit faiss-cpu sentence-transformers transformers PyMuPDF
-streamlit run app.py
+---
+title: RagBot
+emoji:
+colorFrom: indigo
+colorTo: blue
+sdk: streamlit
+sdk_version: "1.32.0"
+app_file: app.py
+pinned: false
+---

+Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+---

+# RagBot: Chatbot + Document QA

+This app lets you chat with any uploaded PDF and ask questions, either from the document or general ones, powered by **local open-source models**. Combines RAG with chatbot UX.

+## Features

+- Upload PDFs and chat with them
+- General knowledge + document-specific answers
+- Accurate retrieval using FAISS and sentence-transformers
+- Clean, sticky UI with full chat history
+- Runs 100% free using lightweight Hugging Face models

+## Powered By

+- Embeddings: `sentence-transformers/paraphrase-MiniLM-L6-v2`
+- QA Model: `distilbert-base-cased-distilled-squad`
+- Backend: Streamlit + FAISS + Transformers

+## How It Works
+
+1. Upload your PDF
+2. App chunks & embeds the text
+3. Uses FAISS to retrieve context
+4. Hugging Face model generates answers:
+- Uses the doc context if available
+- Falls back to general QA
+- Refuses to answer if uncertain

+## Dependencies

 ```bash
 pip install streamlit faiss-cpu sentence-transformers transformers PyMuPDF
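
The chunk, embed, and retrieve steps listed under "How It Works" can be illustrated without the third-party stack. What follows is a minimal, dependency-free sketch and not the app's actual code: `chunk_text`, `embed`, `cosine`, and `retrieve` are hypothetical names, the word-count vector is a stand-in for a `sentence-transformers` embedding, and the brute-force scan is a stand-in for a FAISS index. The chunk sizes are illustrative assumptions.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping windows of words (sizes are illustrative)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; stands in for a dense sentence embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], top_k: int = 2) -> list[tuple[float, str]]:
    """Score every chunk against the question and return the top_k best.

    This is an exhaustive scan; a FAISS index would serve the same role
    over dense vectors.
    """
    query = embed(question)
    scored = [(cosine(query, embed(chunk)), chunk) for chunk in chunks]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

Calling `retrieve(question, chunk_text(pdf_text))` returns `(score, chunk)` pairs, so the top score can double as a rough signal of whether the document covers the question at all.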
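
The answer step in "How It Works" names three outcomes: use the document context, fall back to general QA, or refuse when uncertain. One way to express that decision is a small router. This is a sketch under stated assumptions, since the README does not say how the app decides: `Answer`, `route_answer`, and both threshold values are hypothetical, and the two QA callables stand in for whatever model calls the app makes (for example, a Hugging Face question-answering pipeline, whose result includes an answer string and a score).

```python
from dataclasses import dataclass
from typing import Callable, Optional

# A QA callable takes (question, context) and returns (answer_text, confidence).
QAFn = Callable[[str, str], tuple[str, float]]


@dataclass
class Answer:
    text: str
    score: float
    source: str  # "document", "general", or "refused"


def route_answer(
    question: str,
    doc_context: Optional[str],
    retrieval_score: float,
    doc_qa: QAFn,
    general_qa: QAFn,
    min_retrieval: float = 0.3,   # assumed cutoff for "context is relevant"
    min_confidence: float = 0.5,  # assumed cutoff for "model is confident"
) -> Answer:
    """Pick between document QA, general QA, and refusal."""
    # 1. Use the document context when retrieval found something relevant.
    if doc_context and retrieval_score >= min_retrieval:
        text, score = doc_qa(question, doc_context)
        if score >= min_confidence:
            return Answer(text, score, "document")

    # 2. Otherwise fall back to general QA with no document context.
    text, score = general_qa(question, "")
    if score >= min_confidence:
        return Answer(text, score, "general")

    # 3. Refuse rather than return a low-confidence answer.
    return Answer("I'm not sure about that.", score, "refused")
```

Keeping the thresholds as parameters makes the refusal behavior easy to tune; the defaults here are placeholders and would need calibrating against real model scores.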