nnul/sqlchat: A Conversational AI for SQL Generation
This repository contains sqlchat, a powerful and efficient language model designed specifically for Text-to-SQL tasks. It interprets natural language questions together with the provided database schema and generates accurate SQL queries, including Data Definition Language (DDL) statements for creating and managing tables.
This model is provided as a standalone 4-bit quantized model, optimized for easy deployment and high-performance, low-resource inference. It was built using the Unsloth library to ensure maximum speed and memory efficiency.
Model Capabilities
- Natural Language to SQL: Translates complex English questions into executable SQL queries.
- Schema-Aware: Understands `CREATE TABLE` contexts provided in the prompt to generate relevant queries.
- DDL Generation: Capable of generating `CREATE TABLE` statements, including constraints like `PRIMARY KEY` and `FOREIGN KEY` relationships.
- Complex Query Logic: Successfully handles `JOIN`s, aggregations (`COUNT`, `MAX`), and sorting (`ORDER BY ... LIMIT`); see the illustrative example below.
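As an illustration of the JOIN-and-aggregation case (a hand-written example of the intended behavior, not a captured model transcript), a prompt of this shape should map to a grouped join:

Context: CREATE TABLE departments (id INTEGER PRIMARY KEY, name VARCHAR); CREATE TABLE employees (id INTEGER PRIMARY KEY, name VARCHAR, dept_id INTEGER)
Instruction: How many employees work in each department?
Target query: SELECT d.name, COUNT(e.id) FROM departments d JOIN employees e ON e.dept_id = d.id GROUP BY d.name;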
How to Use
The easiest way to use sqlchat is with the Unsloth library, which will ensure you get the best performance.
Prerequisites
First, install the necessary libraries.
pip install unsloth
pip install "torch>=2.3.1"
Running Inference
Here is a simple, reusable Python script to run inference with the model.
import torch
from unsloth import FastLanguageModel
from transformers import TextStreamer
# Load the sqlchat model from the Hugging Face Hub
# This is a standalone 4-bit model, so we load it as such.
print("Loading sqlchat model...")
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="nnul/sqlchat",
    max_seq_length=4096,
    dtype=None,
    load_in_4bit=True,
)
print("Model loaded successfully.")
# This call optimizes the model for the fastest possible inference.
FastLanguageModel.for_inference(model)
def generate_sql(instruction: str, context: str = ""):
"""
A helper function to generate SQL from a natural language prompt.
"""
prompt = tokenizer.apply_chat_template(
[
{"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."},
{"role": "user", "content": f"### Instruction:\n{instruction}\n\n### Context:\n{context}"},
],
tokenize=False,
add_generation_prompt=True,
enable_thinking=False, # Ensures direct SQL output
)
inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
text_streamer = TextStreamer(tokenizer, skip_prompt=True, clean_up_tokenization_spaces=True)
print(f"User Instruction: {instruction}")
print("\nModel Output:")
print("---------------------------------")
_ = model.generate(
**inputs,
streamer=text_streamer,
max_new_tokens=256,
do_sample=False, # Use greedy decoding for deterministic output
use_cache=True,
)
print("---------------------------------\n")
# --- Example 1: Querying Data ---
generate_sql(
    instruction="Which department has the most number of employees?",
    context="CREATE TABLE department (name VARCHAR, num_employees INTEGER)"
)
# --- Example 2: Creating a Table (DDL) ---
generate_sql(
    instruction="We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs.",
    context="""
CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_name VARCHAR(255));
CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_title VARCHAR(255));
"""
)
Expected Output
User Instruction: Which department has the most number of employees?
Model Output:
---------------------------------
SELECT name FROM department ORDER BY num_employees DESC LIMIT 1;
---------------------------------
User Instruction: We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs.
Model Output:
---------------------------------
CREATE TABLE student_enrollment (student_id INTEGER, course_id INTEGER, PRIMARY KEY (student_id, course_id), FOREIGN KEY (student_id) REFERENCES students(student_id), FOREIGN KEY (course_id) REFERENCES courses(course_id));
---------------------------------
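The script above streams tokens to the console. If you instead need the generated SQL back as a Python string (for example, to execute it against a database), a minimal non-streaming variant that reuses the loaded model and tokenizer looks like this; generate_sql_text is a hypothetical helper added here for illustration:

def generate_sql_text(instruction: str, context: str = "") -> str:
    # Hypothetical non-streaming helper: returns the generated SQL as a string.
    prompt = tokenizer.apply_chat_template(
        [
            {"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."},
            {"role": "user", "content": f"### Instruction:\n{instruction}\n\n### Context:\n{context}"},
        ],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )
    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False, use_cache=True)
    new_tokens = outputs[0][inputs["input_ids"].shape[1]:]  # keep only the completion
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()

sql = generate_sql_text(
    "List the names of all departments.",
    "CREATE TABLE department (name VARCHAR, num_employees INTEGER)",
)
print(sql)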
Performance
The model was benchmarked on an NVIDIA A40 GPU. In a batch-processing scenario it achieves a throughput of roughly 55-70 tokens/second, and single-prompt latency is low enough for interactive, real-time applications.
- Peak VRAM Usage (Inference): ~6.1 GB
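The throughput figure above corresponds to generating for several prompts in a single generate call. Below is a minimal sketch of batched inference, assuming the model and tokenizer loaded in the script above; it is not the exact benchmark harness, and left padding plus a pad token are needed for decoder-only batching:

# Batched inference sketch (illustrative; assumes `model` and `tokenizer` from the loading script above).
questions = [
    ("How many employees are there in total?",
     "CREATE TABLE department (name VARCHAR, num_employees INTEGER)"),
    ("Which department has the most number of employees?",
     "CREATE TABLE department (name VARCHAR, num_employees INTEGER)"),
]

tokenizer.padding_side = "left"  # left-pad so every prompt ends right where generation starts
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompts = [
    tokenizer.apply_chat_template(
        [
            {"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."},
            {"role": "user", "content": f"### Instruction:\n{q}\n\n### Context:\n{ctx}"},
        ],
        tokenize=False,
        add_generation_prompt=True,
        enable_thinking=False,
    )
    for q, ctx in questions
]

batch = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")
outputs = model.generate(**batch, max_new_tokens=256, do_sample=False, use_cache=True)
for row in outputs:
    completion = row[batch["input_ids"].shape[1]:]  # drop the (padded) prompt tokens
    print(tokenizer.decode(completion, skip_special_tokens=True).strip())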
Prompt Template
To get the best results, your prompts should follow this structure:
<|im_start|>system
You are a helpful assistant that generates SQL queries based on natural language questions and database schemas.<|im_end|>
<|im_start|>user
### Instruction:
{Your natural language question}

### Context:
{The CREATE TABLE statements for the relevant tables}<|im_end|>
<|im_start|>assistant
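tokenizer.apply_chat_template produces this format for you. If you need to assemble the prompt by hand (for example, when sending raw strings to a serving backend), here is a sketch that mirrors the template above; build_prompt is a hypothetical helper, and the special tokens assume the ChatML-style template shown here:

def build_prompt(instruction: str, context: str) -> str:
    # Hand-built ChatML-style prompt mirroring the template above; apply_chat_template remains the safer option.
    system = ("You are a helpful assistant that generates SQL queries "
              "based on natural language questions and database schemas.")
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n"
        f"### Instruction:\n{instruction}\n\n"
        f"### Context:\n{context}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )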