| license: apache-2.0 | |
| title: Multi-Model-Rag | |
| sdk: streamlit | |
| emoji: π | |
| colorFrom: gray | |
| colorTo: indigo | |
| ## Multi-Modal RAG PDF Chatbot | |
| A Streamlit application that lets you **upload a PDF**, ask questions about its content, and get answers grounded in the document through a **Multi-Modal Retrieval-Augmented Generation (RAG)** pipeline powered by the **Gemma 2 9B model served through Groq**. | |
| --- | |
| ### Features | |
| - Upload any PDF | |
| - Intelligent chunking and embedding | |
| - Ask natural language questions about your PDF | |
| - Powered by FAISS + HuggingFace + Groq LLM | |
| - Caches the processed PDF in the session so it isn't reprocessed on every query | |
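The session caching mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in `main.py`: the helper name `get_or_build_index`, the state keys, and the idea of fingerprinting the upload with SHA-256 are all assumptions made for the example.

```python
import hashlib
from typing import Any, Callable, MutableMapping


def get_or_build_index(
    state: MutableMapping[str, Any],
    pdf_bytes: bytes,
    build_index: Callable[[bytes], Any],
) -> Any:
    """Return the cached index for this PDF, building it only on a cache miss.

    `state` is any dict-like store (hypothetical helper; in Streamlit it
    could be `st.session_state`). `build_index` stands in for the
    parse -> chunk -> embed steps.
    """
    # Fingerprint the upload so that a different file invalidates the cache.
    digest = hashlib.sha256(pdf_bytes).hexdigest()
    if state.get("pdf_digest") != digest:
        state["index"] = build_index(pdf_bytes)
        state["pdf_digest"] = digest
    return state["index"]
```

In a Streamlit script, passing `st.session_state` as `state` should keep the index across reruns, since Streamlit re-executes the script on every interaction while the session state persists.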
| --- | |
| ### 🛠️ Installation (with `venv`) | |
| 1. **Clone the repo:** | |
| ```bash | |
| git clone https://github.com/Warishayat/Multimodel-Rag-Application01.git | |
| cd Multimodel-Rag-Application01 | |
| ``` | |
| 2. **Create and activate a virtual environment:** | |
| ```bash | |
| python -m venv venv | |
| # Activate: | |
| # On Windows | |
| venv\Scripts\activate | |
| # On macOS/Linux | |
| source venv/bin/activate | |
| ``` | |
| 3. **Install dependencies:** | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| 4. **Set up your `.env` file:** | |
| Create a `.env` file in the root directory: | |
| ``` | |
| GROQ_API_KEY=your_groq_api_key_here | |
| ``` | |
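As a rough sketch of how the key might be read at startup: the function name `read_groq_api_key` and the error message are hypothetical, and the actual app may load the key differently. `load_dotenv()` comes from `python-dotenv`, which is listed in the requirements.

```python
import os
from typing import Mapping, Optional


def read_groq_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GROQ_API_KEY, or fail early with a clear message (hypothetical helper)."""
    if env is None:
        try:
            # python-dotenv copies entries from .env into os.environ.
            from dotenv import load_dotenv
            load_dotenv()
        except ImportError:
            pass  # fall back to variables already exported in the shell
        env = os.environ
    key = env.get("GROQ_API_KEY", "").strip()
    # Treat the placeholder from the README as "not configured".
    if not key or key == "your_groq_api_key_here":
        raise RuntimeError("GROQ_API_KEY is missing; add it to your .env file")
    return key
```

Failing early like this surfaces a missing or placeholder key before the first query is sent, rather than as an authentication error in the middle of a chat.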
| --- | |
| ### 📦 Project Structure | |
| ``` | |
| 📁 Multimodel-Rag-Application01 | |
| ├── main.py # Streamlit frontend | |
| ├── pdfparsing.py # PDF parser using pymupdf4llm | |
| ├── Datapreprocessing.py # Chunking & text cleaning | |
| ├── vectorstore.py # Embedding & FAISS logic | |
| ├── .env # API keys | |
| ├── requirements.txt # Python dependencies | |
| ├── README.md # You're here! | |
| ``` | |
| --- | |
| ### ▶️ Run the App | |
| ```bash | |
| streamlit run main.py | |
| ``` | |
| Then open `http://localhost:8501` in your browser. | |
| --- | |
| ### 🧪 Example Queries | |
| After uploading a PDF, try asking: | |
| - "What is the summary of section 3?" | |
| - "List all benchmarks mentioned." | |
| - "How is this model different from others?" | |
| --- | |
| ### 💡 Tips | |
| - The PDF is processed only once per session; the result is kept in `st.session_state`. | |
| - Text is chunked with `RecursiveCharacterTextSplitter`. | |
| - Embeddings are generated with `HuggingFaceEmbeddings`. | |
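To give an intuition for recursive chunking, here is a simplified, standard-library-only sketch of the idea: try the coarsest separator first and fall back to finer ones only for pieces that are still too long. It is not LangChain's implementation; the function name `recursive_split` and the separator order are assumptions for the example, and it omits the overlap that `RecursiveCharacterTextSplitter` supports through `chunk_overlap`.

```python
from typing import List, Sequence


def recursive_split(
    text: str,
    chunk_size: int,
    separators: Sequence[str] = ("\n\n", "\n", " "),
) -> List[str]:
    """Split text into chunks of at most chunk_size characters (illustrative only)."""
    if len(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separator left: fall back to a hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces into one chunk
            continue
        if current.strip():
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too long: retry it with the finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current.strip():
        chunks.append(current)
    return chunks
```

With a small `chunk_size`, paragraphs that fit stay whole, and only oversized paragraphs are broken at line or word boundaries, which tends to keep related sentences in the same chunk.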
| --- | |
| ### Requirements | |
| Make sure your `requirements.txt` includes at least: | |
| ```txt | |
| streamlit | |
| python-dotenv | |
| langchain | |
| langchain-community | |
| langchain-groq | |
| faiss-cpu | |
| pymupdf4llm | |
| sentence-transformers  # needed by HuggingFaceEmbeddings | |
| ``` | |
| --- | |
| ### Credits | |
| Built with ❤️ by Waris Hayat Abbasi. | |
| --- |