# `nnul/sqlchat`: A Conversational AI for SQL Generation This repository contains `sqlchat`, a powerful and efficient language model designed specifically for **Text-to-SQL** tasks. It can understand natural language questions and database schemas to generate accurate SQL queries, including complex statements for creating and managing tables (Data Definition Language). This model is provided as a standalone 4-bit quantized model, optimized for easy deployment and high-performance, low-resource inference. It was built using the [Unsloth](https://github.com/unslothai/unsloth) library to ensure maximum speed and memory efficiency. ## Model Capabilities * **Natural Language to SQL:** Translates complex English questions into executable SQL queries. * **Schema-Aware:** Understands `CREATE TABLE` contexts provided in the prompt to generate relevant queries. * **DDL Generation:** Capable of generating `CREATE TABLE` statements, including constraints like `PRIMARY KEY` and `FOREIGN KEY` relationships. * **Complex Query Logic:** Successfully handles `JOIN`s, aggregations (`COUNT`, `MAX`), and sorting (`ORDER BY ... LIMIT`). ## How to Use The easiest way to use `sqlchat` is with the Unsloth library, which will ensure you get the best performance. ### Prerequisites First, install the necessary libraries. ```bash pip install unsloth pip install "torch>=2.3.1" ``` ### Running Inference Here is a simple, reusable Python script to run inference with the model. ```python import torch from unsloth import FastLanguageModel from transformers import TextStreamer # Load the sqlchat model from the Hugging Face Hub # This is a standalone 4-bit model, so we load it as such. print("Loading sqlchat model...") model, tokenizer = FastLanguageModel.from_pretrained( model_name="nnul/sqlchat", max_seq_length=4096, dtype=None, load_in_4bit=True, ) print("Model loaded successfully.") # This call optimizes the model for the fastest possible inference. FastLanguageModel.for_inference(model) def generate_sql(instruction: str, context: str = ""): """ A helper function to generate SQL from a natural language prompt. """ prompt = tokenizer.apply_chat_template( [ {"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."}, {"role": "user", "content": f"### Instruction:\n{instruction}\n\n### Context:\n{context}"}, ], tokenize=False, add_generation_prompt=True, enable_thinking=False, # Ensures direct SQL output ) inputs = tokenizer([prompt], return_tensors="pt").to("cuda") text_streamer = TextStreamer(tokenizer, skip_prompt=True, clean_up_tokenization_spaces=True) print(f"User Instruction: {instruction}") print("\nModel Output:") print("---------------------------------") _ = model.generate( **inputs, streamer=text_streamer, max_new_tokens=256, do_sample=False, # Use greedy decoding for deterministic output use_cache=True, ) print("---------------------------------\n") # --- Example 1: Querying Data --- generate_sql( instruction="Which department has the most number of employees?", context="CREATE TABLE department (name VARCHAR, num_employees INTEGER)" ) # --- Example 2: Creating a Table (DDL) --- generate_sql( instruction="We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs.", context=""" CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_name VARCHAR(255)); CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_title VARCHAR(255)); """ ) ``` ### Expected Output ``` User Instruction: Which department has the most number of employees? Model Output: --------------------------------- SELECT name FROM department ORDER BY num_employees DESC LIMIT 1; --------------------------------- User Instruction: We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs. Model Output: --------------------------------- CREATE TABLE student_enrollment (student_id INTEGER, course_id INTEGER, PRIMARY KEY (student_id, course_id), FOREIGN KEY (student_id) REFERENCES students(student_id), FOREIGN KEY (course_id) REFERENCES courses(course_id)); --------------------------------- ``` ## Performance The model was benchmarked on an NVIDIA A40 GPU. In a batch-processing scenario, it achieves a throughput of **~55-70 tokens/second**. Single-prompt latency is well within real-time requirements for interactive applications. * **Peak VRAM Usage (Inference):** ~6.1 GB ## Prompt Template To get the best results, your prompts should follow this structure: ``` <|im_start|>system You are a helpful assistant that generates SQL queries based on natural language questions and database schemas.<|im_end|> <|im_start|>user ### Instruction: {Your natural language question} ### Context: {The CREATE TABLE statements for the relevant tables}<|im_end|> <|im_start|>assistant ```