--- title: QueryMind emoji: 🧠 colorFrom: blue colorTo: purple sdk: gradio sdk_version: 5.28.0 app_file: src/app.py pinned: false --- --- # AgentGraph: Intelligent SQL-agent Q&A and RAG System for Chatting with Multiple Databases This project demonstrates how to build an agentic system using Large Language Models (LLMs) that can interact with multiple databases and utilize various tools. It highlights the use of SQL agents to efficiently query large databases. The key frameworks used in this project include GroqAI, LangChain, LangGraph, LangSmith, and Gradio. The end product is an end-to-end chatbot, designed to perform these tasks, with LangSmith used to monitor the performance of the agents. --- ## Video Explanation: A detailed explanation of the project is available in the following YouTube video: Automating LLM Agents to Chat with Multiple/Large Databases (Combining RAG and SQL Agents): [Link](https://youtu.be/xsCedrNP9w8?si=v-3k-BoDky_1IRsg) --- ## Requirements - **Operating System:** Linux or Windows (Tested on Windows 11 with Python 3.9.11) - **Groq API Key:** Required for GPT functionality. - **Tavily Credentials:** Required for search tools (Free from your Tavily profile). - **LangChain Credentials:** Required for LangSmith (Free from your LangChain profile). - **Dependencies:** The necessary libraries are provided in `requirements.txt` file. --- ## Installation and Execution To set up the project, follow these steps: 1. Clone the repository: ```bash git clone ``` 2. Install Python and create a virtual environment: ```bash python -m venv venv ``` 3. Activate the virtual environment: - On Windows: ```bash venv\Scripts\activate ``` - On Linux/macOS: ```bash source venv/bin/activate ``` 4. Install the required dependencies: ```bash pip install -r requirements.txt ``` 5. Download the travel sql database from this link and paste it into the `data` folder. 6. Download the chinook SQL database from this link and paste it into the `data` folder. 7. Prepare the `.env` file and add your `GROQ_API_KEY`, `TAVILY_API_KEY`, and `LANGCHAIN_API_KEY`. 8. Run `prepare_vector_db.py` module once to prepare both vector databases. ```bash python src\prepare_vector_db.py ``` 9. Run the app: ```bash python src\app.py ``` Open the Gradio URL generated in the terminal and start chatting. *Sample questions are available in `sample_questions.txt`.* --- ### Using Your Own Database To use your own data: 1. Place your data in the `data` folder. 2. Update the configurations in `tools_config.yml`. 3. Load the configurations in `src\agent_graph\load_tools_config.py`. For unstructured data using Retrieval-Augmented Generation (RAG): 1. Run the following command with your data directory's configuration: ```bash python src\prepare_vector_db.py ``` All configurations are managed through YAML files in the `configs` folder, loaded by `src\chatbot\load_config.py` and `src\agent_graph\load_tools_config.py`. These modules are used for a clean distribution of configurations throughout the project. Once your databases are ready, you can either connect the current agents to the databases or create new agents. More details can be found in the accompanying YouTube video. --- ## Project Schemas ### High-level overview
high-level
### Detailed Schema
detailed_schema
### Graph Schema
graph_image
### SQL-agent for large databases strategies
large_db_strategy
--- ## Chatbot User Interface
ChatBot UI
--- ## LangSmith Monitoring System
langsmith
--- ## Databases Used - **Travel SQL Database:** [Kaggle Link](https://www.kaggle.com/code/mpwolke/airlines-sqlite) - **Chinook SQL Database:** [Sample Database](https://database.guide/2-sample-databases-sqlite/) - **stories VectorDB** - **Airline Policy FAQ VectorDB** --- ## Key Frameworks and Libraries - **LangChain:** [Introduction](https://python.langchain.com/docs/get_started/introduction) - **LangGraph** - **LangSmith** - **Gradio:** [Documentation](https://www.gradio.app/docs/interface) - **GroqAI** - **Tavily Search** --- ## Docker ### Build the image ```bash docker build -t querymind . ``` ### Run the container ```bash docker run -p 7860:7860 \ -e GROQ_API_KEY=your_key \ -e TAVILY_API_KEY=your_key \ -e LANGCHAIN_API_KEY=your_key \ querymind ```