Spaces:
Sleeping
Sleeping
| title: Deepseek RAG Chat Bot | |
| emoji: π | |
| colorFrom: red | |
| colorTo: pink | |
| sdk: streamlit | |
| sdk_version: 1.41.1 | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: Deepseek-RAG-Chat-Bot | |
| # RAG-Powered Chatbot with Streamlit | |
| This project is a Retrieval-Augmented Generation (RAG) chatbot built using Streamlit. It allows users to upload a PDF document, process it, and ask questions about its content. The application efficiently processes the document once and uses vector-based retrieval to answer queries. | |
| --- | |
| ## Features | |
| - Upload PDF documents and process them into chunks for efficient querying. | |
| - Generate semantic embeddings using `sentence-transformers`. | |
| - Store embeddings in a `FAISS` vector database for efficient retrieval. | |
| - Use the `DeepSeek` API for question-answering capabilities. | |
| - Built with Streamlit for an interactive and user-friendly UI. | |
| --- | |
| ## Requirements | |
| - Python 3.8 or higher | |
| ### Dependencies | |
| Install the required Python libraries: | |
| ```plaintext | |
| streamlit==1.25.0 | |
| langchain==0.81.0 | |
| langchain-community==0.1.2 | |
| faiss-cpu==1.7.4 | |
| sentence-transformers==2.2.2 | |
| pypdf==3.8.1 | |
| ``` | |
| To install all dependencies: | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| --- | |
| ## Setup and Usage | |
| ### 1. Clone the Repository | |
| ```bash | |
| git clone https://github.com/your-username/rag-chatbot.git | |
| cd rag-chatbot | |
| ``` | |
| ### 2. Install Dependencies | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| ### 3. Run the Application | |
| Run the Streamlit application: | |
| ```bash | |
| streamlit run app.py | |
| ``` | |
| ### 4. Interact with the Chatbot | |
| 1. Enter your `DeepSeek API Key` in the provided input field. | |
| 2. Upload a PDF document. | |
| 3. Ask questions about the content of the document. | |
| --- | |
| ## Project Structure | |
| ```plaintext | |
| . | |
| βββ app.py # Main application code | |
| βββ requirements.txt # List of dependencies | |
| βββ README.md # Documentation | |
| ``` | |
| --- | |
| ## Key Technologies Used | |
| 1. **Streamlit**: | |
| - For building a user-friendly web interface. | |
| 2. **LangChain**: | |
| - For document loading, text splitting, and RAG pipeline. | |
| 3. **FAISS**: | |
| - For storing and querying vector embeddings. | |
| 4. **Sentence Transformers**: | |
| - For generating semantic embeddings of text chunks. | |
| 5. **PyPDF**: | |
| - For parsing PDF files. | |
| 6. **DeepSeek API**: | |
| - For question-answering capabilities. | |
| --- | |
| ## How It Works | |
| 1. **PDF Upload**: | |
| - The user uploads a PDF document. | |
| - The document is split into manageable text chunks. | |
| 2. **Embeddings Generation**: | |
| - Semantic embeddings are generated using `sentence-transformers`. | |
| 3. **Vector Storage**: | |
| - The embeddings are stored in a `FAISS` vector database for efficient retrieval. | |
| 4. **Question Answering**: | |
| - The user asks a question about the uploaded document. | |
| - The RAG pipeline retrieves relevant chunks and generates a response using the `DeepSeek` API. | |
| --- | |
| ## Troubleshooting | |
| - **Error: `pypdf package not found`** | |
| Ensure `pypdf` is installed. Run: | |
| ```bash | |
| pip install pypdf | |
| ``` | |
| - **Error: `langchain-community module not found`** | |
| Ensure `langchain-community` is installed. Run: | |
| ```bash | |
| pip install langchain-community | |
| ``` | |
| - **Reprocessing PDF on Every Query** | |
| This issue is resolved by using `st.session_state` to persist the processed `vector_store`. | |
| --- | |
| ## Future Improvements | |
| 1. Add support for multiple file uploads. | |
| 2. Integrate additional language models. | |
| 3. Enhance the UI with better visualization of document content. | |
| 4. Add support for other document formats (e.g., Word, TXT). | |
| --- | |
| ## License | |
| This project is licensed under the MIT License. See the `LICENSE` file for more details. | |
| --- | |
| ## Contributions | |
| Contributions are welcome! Feel free to fork the repository and submit a pull request. | |
| --- | |
| ## Contact | |
| For any queries or support, please contact: | |
| - Name: [Sagun Chalise] | |
| - Email: [sagunchalise@gmail.com] | |
| --- | |
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference |