---
title: DocuChatDeepSeek
emoji: ⚡
colorFrom: yellow
colorTo: purple
sdk: streamlit
sdk_version: 1.41.1
app_file: app.py
pinned: false
short_description: Deepseek-DocuChat – Simple, intuitive, and descriptive.
---

📄 DocuChat - AI-Powered RAG Chatbot
DocuChat is a Retrieval-Augmented Generation (RAG) chatbot powered by DeepSeek and built with Streamlit. It allows users to upload documents (PDF, Word, Markdown) or provide a web link, process the content, and ask questions about it. The application uses semantic embeddings and a FAISS vector database for efficient retrieval and question-answering.

🚀 Features
Document Upload: Upload PDF, Word (.docx), or Markdown (.md) files.

Web Link Support: Provide a web link to extract and process content.

Semantic Search: Generate embeddings using sentence-transformers for semantic understanding.

Efficient Retrieval: Store embeddings in a FAISS vector database for fast and accurate querying.

Question-Answering: Use DeepSeek API for intelligent question-answering capabilities.

User-Friendly Interface: Built with Streamlit for an interactive and intuitive UI.

🛠️ Installation
Clone the Repository:

git clone https://github.com/your-username/DocuChat.git
cd DocuChat

Install Dependencies:
Make sure you have Python 3.8+ installed. Then, install the required packages:

pip install -r requirements.txt
Set Up DeepSeek API Key:

Obtain your API key from DeepSeek.

Add the API key in the Streamlit app when prompted.

🖥️ Usage
Run the Application:

streamlit run app.py
Input Your DeepSeek API Key:

Enter your API key in the provided field.

Upload a Document or Enter a Web Link:

Choose between uploading a document (PDF, Word, or Markdown) or providing a web link.

Ask Questions:

Once the document is processed, ask questions about its content.

🧩 How It Works
Document Processing:

The uploaded document or web content is split into smaller chunks for efficient processing.

Semantic embeddings are generated using sentence-transformers.

Vector Database:

Embeddings are stored in a FAISS vector database for fast and accurate retrieval.

Question-Answering:

When a user asks a question, the app retrieves the most relevant chunks from the vector database.

The DeepSeek API generates a response based on the retrieved information.

📂 File Structure
Copy
DocuChat/
├── app.py                  # Main Streamlit application
├── requirements.txt        # List of dependencies
├── README.md               # Project documentation
└── .gitignore              # Files to ignore in Git


📝 Requirements
Python 3.8+
Streamlit
LangChain
FAISS
Sentence-Transformers
PyPDF
Docx2txt
Unstructured (for Markdown files)
WebBaseLoader (for web links)

🔧 Dependencies
Install all dependencies using:
pip install -r requirements.txt


🌟 Why DocuChat?
Efficient: Processes documents once and retrieves answers quickly.

Versatile: Supports multiple file types and web links.

Intelligent: Uses state-of-the-art AI models for semantic understanding and question-answering.

User-Friendly: Simple and intuitive interface powered by Streamlit.

🤝 Contributing
Contributions are welcome! If you'd like to contribute, please follow these steps:

Fork the repository.

Create a new branch (git checkout -b feature/YourFeatureName).

Commit your changes (git commit -m 'Add some feature').

Push to the branch (git push origin feature/YourFeatureName).

Open a pull request.

📜 License
This project is licensed under the MIT License. See the LICENSE file for details.

🙏 Acknowledgments
DeepSeek for providing the question-answering API.

LangChain for the document processing and retrieval framework.

Streamlit for the interactive UI framework.

Sentence-Transformers for semantic embeddings.

📧 Contact
For questions or feedback, feel free to reach out:

sagunchalise@gmail.com

GitHub - https://github.com/schalise

Enjoy using DocuChat! 🎉
Let your documents speak for themselves. 🗣️