metadata
title: QueryMind
emoji: 🧠
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.28.0
app_file: src/app.py
pinned: false

AgentGraph: Intelligent SQL-agent Q&A and RAG System for Chatting with Multiple Databases

This project demonstrates how to build an agentic system with Large Language Models (LLMs) that can interact with multiple databases and use various tools. It highlights the use of SQL agents to query large databases efficiently. The key frameworks are Groq, LangChain, LangGraph, LangSmith, and Gradio. The end product is an end-to-end chatbot that performs these tasks, with LangSmith used to monitor the agents' performance.


Video Explanation:

A detailed explanation of the project is available in the following YouTube video:

Automating LLM Agents to Chat with Multiple/Large Databases (Combining RAG and SQL Agents): Link


Requirements

  • Operating System: Linux or Windows (Tested on Windows 11 with Python 3.9.11)
  • Groq API Key: Required for LLM functionality.
  • Tavily Credentials: Required for search tools (Free from your Tavily profile).
  • LangChain Credentials: Required for LangSmith (Free from your LangChain profile).
  • Dependencies: The necessary libraries are listed in the requirements.txt file.

Installation and Execution

To set up the project, follow these steps:

  1. Clone the repository:

    git clone <repo_address>
    
  2. Install Python and create a virtual environment:

    python -m venv venv
    
  3. Activate the virtual environment:

    • On Windows:
      venv\Scripts\activate
      
    • On Linux/macOS:
      source venv/bin/activate
      
  4. Install the required dependencies:

    pip install -r requirements.txt
    
  5. Download the travel SQL database from this link and paste it into the data folder.

  6. Download the chinook SQL database from this link and paste it into the data folder.

  7. Prepare the .env file and add your GROQ_API_KEY, TAVILY_API_KEY, and LANGCHAIN_API_KEY.

  8. Run the prepare_vector_db.py module once to prepare both vector databases.

    python src/prepare_vector_db.py
    
  9. Run the app:

    python src/app.py
    

Open the Gradio URL generated in the terminal and start chatting.

Sample questions are available in sample_questions.txt.
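
Step 7 expects three keys in the .env file. Before launching the app, it can help to confirm that none of them is missing. The following is a minimal sketch, not part of the project: the helper names read_env_file and missing_keys are hypothetical, and the parser only handles plain KEY=VALUE lines (the project itself may load the file differently, for example with a dotenv library).

```python
import os
from pathlib import Path

# Key names are taken from step 7 of the setup instructions.
REQUIRED_KEYS = ("GROQ_API_KEY", "TAVILY_API_KEY", "LANGCHAIN_API_KEY")


def read_env_file(path):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped.

    Hypothetical helper: returns an empty dict if the file does not exist.
    """
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for line in env_file.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env):
    """Return the required keys that are absent or empty in the mapping."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


if __name__ == "__main__":
    # Values already set in the process environment override the file.
    env = {**read_env_file(".env"), **os.environ}
    missing = missing_keys(env)
    print("Missing keys:", ", ".join(missing) if missing else "none")
```

Run from the repository root, the script reports which of the three keys still need to be filled in.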


Using Your Own Database

To use your own data:

  1. Place your data in the data folder.
  2. Update the configurations in tools_config.yml.
  3. Load the configurations in src/agent_graph/load_tools_config.py.

For unstructured data using Retrieval-Augmented Generation (RAG):

  1. Point the configuration at your data directory, then run:
    python src/prepare_vector_db.py
    

All configurations are managed through YAML files in the configs folder and loaded by src/chatbot/load_config.py and src/agent_graph/load_tools_config.py. These modules give the rest of the project a single, clean access point for configuration values.
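
As an illustration of the loader pattern described above, the sketch below turns one section of an already-parsed YAML mapping into a typed object. Everything in it is an assumption made for illustration: the class SQLAgentConfig, the function load_sql_agent_config, the section name my_sqlagent, and the keys db_path, llm_model, and llm_temperature are hypothetical and may not match what tools_config.yml or load_tools_config.py actually contain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SQLAgentConfig:
    """Settings for one SQL agent (all field names are hypothetical)."""
    db_path: str
    llm_model: str
    llm_temperature: float = 0.0


def load_sql_agent_config(raw, section):
    """Build a config object from one section of a parsed YAML mapping."""
    if section not in raw:
        raise KeyError(f"Section '{section}' not found in configuration")
    entry = raw[section]
    return SQLAgentConfig(
        db_path=entry["db_path"],
        llm_model=entry["llm_model"],
        llm_temperature=float(entry.get("llm_temperature", 0.0)),
    )


# Example mapping, shaped like what a YAML parser would return.
# The section and values are placeholders, not the project's real settings.
raw_config = {
    "my_sqlagent": {
        "db_path": "data/my_database.db",
        "llm_model": "example-model-name",
    }
}

config = load_sql_agent_config(raw_config, "my_sqlagent")
print(config.db_path)  # prints data/my_database.db
```

Keeping the parsing in one function means a misspelled or missing section fails early with a clear error instead of surfacing later inside an agent.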

Once your databases are ready, you can either connect the current agents to the databases or create new agents. More details can be found in the accompanying YouTube video.
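
Before connecting an agent to a new database, it can be useful to confirm that the file opens and contains the tables you expect. The sketch below assumes the database is a SQLite file, which this README does not state explicitly. The helpers list_tables and row_count are hypothetical, and an in-memory database with a made-up artists table stands in for a real file in the data folder.

```python
import sqlite3


def list_tables(conn):
    """Return the names of user tables in a SQLite database, sorted."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
        "ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def row_count(conn, table):
    """Count rows in one table; the name is validated against list_tables."""
    if table not in list_tables(conn):
        raise ValueError(f"Unknown table: {table}")
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]


# In-memory stand-in for a real file; replace ":memory:" with the path to
# your own database inside the data folder.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO artists (name) VALUES (?)", [("A",), ("B",)])

print(list_tables(conn))
print(row_count(conn, "artists"))
```

If the table list comes back empty, the path is probably wrong: sqlite3.connect creates a new empty file when the target does not exist.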


Project Schemas

High-level overview

[Image: high-level]

Detailed Schema

[Image: detailed_schema]

Graph Schema

[Image: graph_image]

SQL-agent strategies for large databases

[Image: large_db_strategy]

Chatbot User Interface

[Image: ChatBot UI]

LangSmith Monitoring System

[Image: langsmith]

Databases Used


Key Frameworks and Libraries


Docker

Build the image

docker build -t querymind .

Run the container

docker run -p 7860:7860 \
  -e GROQ_API_KEY=your_key \
  -e TAVILY_API_KEY=your_key \
  -e LANGCHAIN_API_KEY=your_key \
  querymind