Spaces:
Sleeping
Sleeping
File size: 4,009 Bytes
a7933ea e049915 | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 | ---
title: DocuChatDeepSeek
emoji: ⚡
colorFrom: yellow
colorTo: purple
sdk: streamlit
sdk_version: 1.41.1
app_file: app.py
pinned: false
short_description: Deepseek-DocuChat – Simple, intuitive, and descriptive.
---
📄 DocuChat - AI-Powered RAG Chatbot
DocuChat is a Retrieval-Augmented Generation (RAG) chatbot powered by DeepSeek and built with Streamlit. It allows users to upload documents (PDF, Word, Markdown) or provide a web link, process the content, and ask questions about it. The application uses semantic embeddings and a FAISS vector database for efficient retrieval and question-answering.
🚀 Features
Document Upload: Upload PDF, Word (.docx), or Markdown (.md) files.
Web Link Support: Provide a web link to extract and process content.
Semantic Search: Generate embeddings using sentence-transformers for semantic understanding.
Efficient Retrieval: Store embeddings in a FAISS vector database for fast and accurate querying.
Question-Answering: Use DeepSeek API for intelligent question-answering capabilities.
User-Friendly Interface: Built with Streamlit for an interactive and intuitive UI.
🛠️ Installation
Clone the Repository:
git clone https://github.com/your-username/DocuChat.git
cd DocuChat
Install Dependencies:
Make sure you have Python 3.8+ installed. Then, install the required packages:
pip install -r requirements.txt
Set Up DeepSeek API Key:
Obtain your API key from DeepSeek.
Add the API key in the Streamlit app when prompted.
🖥️ Usage
Run the Application:
streamlit run app.py
Input Your DeepSeek API Key:
Enter your API key in the provided field.
Upload a Document or Enter a Web Link:
Choose between uploading a document (PDF, Word, or Markdown) or providing a web link.
Ask Questions:
Once the document is processed, ask questions about its content.
🧩 How It Works
Document Processing:
The uploaded document or web content is split into smaller chunks for efficient processing.
Semantic embeddings are generated using sentence-transformers.
Vector Database:
Embeddings are stored in a FAISS vector database for fast and accurate retrieval.
Question-Answering:
When a user asks a question, the app retrieves the most relevant chunks from the vector database.
The DeepSeek API generates a response based on the retrieved information.
📂 File Structure
Copy
DocuChat/
├── app.py # Main Streamlit application
├── requirements.txt # List of dependencies
├── README.md # Project documentation
└── .gitignore # Files to ignore in Git
📝 Requirements
Python 3.8+
Streamlit
LangChain
FAISS
Sentence-Transformers
PyPDF
Docx2txt
Unstructured (for Markdown files)
WebBaseLoader (for web links)
🔧 Dependencies
Install all dependencies using:
pip install -r requirements.txt
🌟 Why DocuChat?
Efficient: Processes documents once and retrieves answers quickly.
Versatile: Supports multiple file types and web links.
Intelligent: Uses state-of-the-art AI models for semantic understanding and question-answering.
User-Friendly: Simple and intuitive interface powered by Streamlit.
🤝 Contributing
Contributions are welcome! If you'd like to contribute, please follow these steps:
Fork the repository.
Create a new branch (git checkout -b feature/YourFeatureName).
Commit your changes (git commit -m 'Add some feature').
Push to the branch (git push origin feature/YourFeatureName).
Open a pull request.
📜 License
This project is licensed under the MIT License. See the LICENSE file for details.
🙏 Acknowledgments
DeepSeek for providing the question-answering API.
LangChain for the document processing and retrieval framework.
Streamlit for the interactive UI framework.
Sentence-Transformers for semantic embeddings.
📧 Contact
For questions or feedback, feel free to reach out:
sagunchalise@gmail.com
GitHub - https://github.com/schalise
Enjoy using DocuChat! 🎉
Let your documents speak for themselves. 🗣️
|