title: QueryMind
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.28.0
app_file: src/app.py
pinned: false
AgentGraph: Intelligent SQL-agent Q&A and RAG System for Chatting with Multiple Databases
This project demonstrates how to build an agentic system with Large Language Models (LLMs) that can interact with multiple databases and use a range of tools. It highlights the use of SQL agents to query large databases efficiently. The key frameworks are GroqAI, LangChain, LangGraph, LangSmith, and Gradio. The result is an end-to-end chatbot that performs these tasks, with LangSmith monitoring the agents' performance.
Video Explanation:
A detailed explanation of the project is available in the following YouTube video:
Automating LLM Agents to Chat with Multiple/Large Databases (Combining RAG and SQL Agents): Link
Requirements
- Operating System: Linux or Windows (Tested on Windows 11 with Python 3.9.11)
- Groq API Key: Required for LLM functionality.
- Tavily Credentials: Required for search tools (Free from your Tavily profile).
- LangChain Credentials: Required for LangSmith (Free from your LangChain profile).
- Dependencies: The necessary libraries are listed in the requirements.txt file.
Installation and Execution
To set up the project, follow these steps:
Clone the repository:
git clone <repo_address>
Install Python and create a virtual environment:
python -m venv venv
Activate the virtual environment:
- On Windows:
venv\Scripts\activate
- On Linux/macOS:
source venv/bin/activate
Install the required dependencies:
pip install -r requirements.txt
Download the Travel SQL database from this link and paste it into the data folder.
Download the Chinook SQL database from this link and paste it into the data folder.
Prepare the .env file and add your GROQ_API_KEY, TAVILY_API_KEY, and LANGCHAIN_API_KEY.
Run the prepare_vector_db.py module once to prepare both vector databases:
python src\prepare_vector_db.py
Run the app:
python src\app.py
Open the Gradio URL generated in the terminal and start chatting.
Sample questions are available in sample_questions.txt.
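Before launching the app, it can help to confirm that the three keys from the .env step are visible to Python. The following is a minimal sketch using only the standard library. The helper name missing_keys is hypothetical and not part of this repository, and the sketch assumes the variables have already been loaded into the process environment (for example by the project's own .env handling or by exporting them in the shell).

```python
import os

# Key names taken from the .env step in the installation instructions.
REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY", "LANGCHAIN_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or blank in `env`.

    `env` defaults to os.environ; pass a plain dict to check another source.
    Hypothetical helper for illustration, not part of the repository.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing keys:", ", ".join(missing))
    else:
        print("All required keys are set.")
```

Running the script prints the names of any keys that still need to be added, which is usually quicker than diagnosing an authentication error from inside the chatbot.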
Using Your Own Database
To use your own data:
- Place your data in the data folder.
- Update the configurations in tools_config.yml.
- Load the configurations in src\agent_graph\load_tools_config.py.
For unstructured data using Retrieval-Augmented Generation (RAG):
- Run the following command with your data directory's configuration:
python src\prepare_vector_db.py
All configurations are managed through YAML files in the configs folder and loaded by src\chatbot\load_config.py and src\agent_graph\load_tools_config.py. These modules centralize configuration loading so that settings are distributed cleanly throughout the project.
Once your databases are ready, you can either connect the current agents to the databases or create new agents. More details can be found in the accompanying YouTube video.
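Before connecting an agent to a new SQL database, a quick look at its tables confirms that the file is readable and contains what you expect. The following is a minimal sketch that assumes the database is a SQLite file (an assumption; check the format of your own data). The function name list_tables and the path in the usage note are hypothetical and not part of this repository.

```python
import sqlite3


def list_tables(db_path):
    """Return the sorted names of user tables in a SQLite database.

    Note: sqlite3.connect creates an empty file if db_path does not exist,
    so an empty result may simply mean the path is wrong.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    # Each row is a one-element tuple holding the table name.
    return [name for (name,) in rows]
```

For example, list_tables("data/my_database.db") would return the table names of a file placed in the data folder, where my_database.db stands in for your own file name.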
Project Schemas
High-level overview
Detailed Schema
Graph Schema
SQL-agent strategies for large databases
Chatbot User Interface
LangSmith Monitoring System
Databases Used
- Travel SQL Database: Kaggle Link
- Chinook SQL Database: Sample Database
- Stories VectorDB
- Airline Policy FAQ VectorDB
Key Frameworks and Libraries
- LangChain: Introduction
- LangGraph
- LangSmith
- Gradio: Documentation
- GroqAI
- Tavily Search
Docker
Build the image
docker build -t querymind .
Run the container
docker run -p 7860:7860 \
-e GROQ_API_KEY=your_key \
-e TAVILY_API_KEY=your_key \
-e LANGCHAIN_API_KEY=your_key \
querymind