Spaces:
Sleeping
Sleeping
| title: DocuChatDeepSeek | |
| emoji: ⚡ | |
| colorFrom: yellow | |
| colorTo: purple | |
| sdk: streamlit | |
| sdk_version: 1.41.1 | |
| app_file: app.py | |
| pinned: false | |
| short_description: Deepseek-DocuChat – Simple, intuitive, and descriptive. | |
| 📄 DocuChat - AI-Powered RAG Chatbot | |
| DocuChat is a Retrieval-Augmented Generation (RAG) chatbot powered by DeepSeek and built with Streamlit. It allows users to upload documents (PDF, Word, Markdown) or provide a web link, process the content, and ask questions about it. The application uses semantic embeddings and a FAISS vector database for efficient retrieval and question-answering. | |
| 🚀 Features | |
| Document Upload: Upload PDF, Word (.docx), or Markdown (.md) files. | |
| Web Link Support: Provide a web link to extract and process content. | |
| Semantic Search: Generate embeddings using sentence-transformers for semantic understanding. | |
| Efficient Retrieval: Store embeddings in a FAISS vector database for fast and accurate querying. | |
| Question-Answering: Use DeepSeek API for intelligent question-answering capabilities. | |
| User-Friendly Interface: Built with Streamlit for an interactive and intuitive UI. | |
| 🛠️ Installation | |
| Clone the Repository: | |
| git clone https://github.com/your-username/DocuChat.git | |
| cd DocuChat | |
| Install Dependencies: | |
| Make sure you have Python 3.8+ installed. Then, install the required packages: | |
| pip install -r requirements.txt | |
| Set Up DeepSeek API Key: | |
| Obtain your API key from DeepSeek. | |
| Add the API key in the Streamlit app when prompted. | |
| 🖥️ Usage | |
| Run the Application: | |
| streamlit run app.py | |
| Input Your DeepSeek API Key: | |
| Enter your API key in the provided field. | |
| Upload a Document or Enter a Web Link: | |
| Choose between uploading a document (PDF, Word, or Markdown) or providing a web link. | |
| Ask Questions: | |
| Once the document is processed, ask questions about its content. | |
| 🧩 How It Works | |
| Document Processing: | |
| The uploaded document or web content is split into smaller chunks for efficient processing. | |
| Semantic embeddings are generated using sentence-transformers. | |
| Vector Database: | |
| Embeddings are stored in a FAISS vector database for fast and accurate retrieval. | |
| Question-Answering: | |
| When a user asks a question, the app retrieves the most relevant chunks from the vector database. | |
| The DeepSeek API generates a response based on the retrieved information. | |
| 📂 File Structure | |
| Copy | |
| DocuChat/ | |
| ├── app.py # Main Streamlit application | |
| ├── requirements.txt # List of dependencies | |
| ├── README.md # Project documentation | |
| └── .gitignore # Files to ignore in Git | |
| 📝 Requirements | |
| Python 3.8+ | |
| Streamlit | |
| LangChain | |
| FAISS | |
| Sentence-Transformers | |
| PyPDF | |
| Docx2txt | |
| Unstructured (for Markdown files) | |
| WebBaseLoader (for web links) | |
| 🔧 Dependencies | |
| Install all dependencies using: | |
| pip install -r requirements.txt | |
| 🌟 Why DocuChat? | |
| Efficient: Processes documents once and retrieves answers quickly. | |
| Versatile: Supports multiple file types and web links. | |
| Intelligent: Uses state-of-the-art AI models for semantic understanding and question-answering. | |
| User-Friendly: Simple and intuitive interface powered by Streamlit. | |
| 🤝 Contributing | |
| Contributions are welcome! If you'd like to contribute, please follow these steps: | |
| Fork the repository. | |
| Create a new branch (git checkout -b feature/YourFeatureName). | |
| Commit your changes (git commit -m 'Add some feature'). | |
| Push to the branch (git push origin feature/YourFeatureName). | |
| Open a pull request. | |
| 📜 License | |
| This project is licensed under the MIT License. See the LICENSE file for details. | |
| 🙏 Acknowledgments | |
| DeepSeek for providing the question-answering API. | |
| LangChain for the document processing and retrieval framework. | |
| Streamlit for the interactive UI framework. | |
| Sentence-Transformers for semantic embeddings. | |
| 📧 Contact | |
| For questions or feedback, feel free to reach out: | |
| sagunchalise@gmail.com | |
| GitHub - https://github.com/schalise | |
| Enjoy using DocuChat! 🎉 | |
| Let your documents speak for themselves. 🗣️ | |