| # `nnul/sqlchat`: A Conversational AI for SQL Generation | |
| This repository contains `sqlchat`, a powerful and efficient language model designed specifically for **Text-to-SQL** tasks. It can understand natural language questions and database schemas to generate accurate SQL queries, including complex statements for creating and managing tables (Data Definition Language). | |
| This model is provided as a standalone 4-bit quantized model, optimized for easy deployment and high-performance, low-resource inference. It was built using the [Unsloth](https://github.com/unslothai/unsloth) library to ensure maximum speed and memory efficiency. | |
| ## Model Capabilities | |
| * **Natural Language to SQL:** Translates complex English questions into executable SQL queries. | |
| * **Schema-Aware:** Understands `CREATE TABLE` contexts provided in the prompt to generate relevant queries. | |
| * **DDL Generation:** Capable of generating `CREATE TABLE` statements, including constraints like `PRIMARY KEY` and `FOREIGN KEY` relationships. | |
| * **Complex Query Logic:** Successfully handles `JOIN`s, aggregations (`COUNT`, `MAX`), and sorting (`ORDER BY ... LIMIT`). | |
| ## How to Use | |
| The easiest way to use `sqlchat` is with the Unsloth library, which will ensure you get the best performance. | |
| ### Prerequisites | |
| First, install the necessary libraries. | |
| ```bash | |
| pip install unsloth | |
| pip install "torch>=2.3.1" | |
| ``` | |
| ### Running Inference | |
| Here is a simple, reusable Python script to run inference with the model. | |
| ```python | |
| import torch | |
| from unsloth import FastLanguageModel | |
| from transformers import TextStreamer | |
| # Load the sqlchat model from the Hugging Face Hub | |
| # This is a standalone 4-bit model, so we load it as such. | |
| print("Loading sqlchat model...") | |
| model, tokenizer = FastLanguageModel.from_pretrained( | |
| model_name="nnul/sqlchat", | |
| max_seq_length=4096, | |
| dtype=None, | |
| load_in_4bit=True, | |
| ) | |
| print("Model loaded successfully.") | |
| # This call optimizes the model for the fastest possible inference. | |
| FastLanguageModel.for_inference(model) | |
| def generate_sql(instruction: str, context: str = ""): | |
| """ | |
| A helper function to generate SQL from a natural language prompt. | |
| """ | |
| prompt = tokenizer.apply_chat_template( | |
| [ | |
| {"role": "system", "content": "You are a helpful assistant that generates SQL queries based on natural language questions and database schemas."}, | |
| {"role": "user", "content": f"### Instruction:\n{instruction}\n\n### Context:\n{context}"}, | |
| ], | |
| tokenize=False, | |
| add_generation_prompt=True, | |
| enable_thinking=False, # Ensures direct SQL output | |
| ) | |
| inputs = tokenizer([prompt], return_tensors="pt").to("cuda") | |
| text_streamer = TextStreamer(tokenizer, skip_prompt=True, clean_up_tokenization_spaces=True) | |
| print(f"User Instruction: {instruction}") | |
| print("\nModel Output:") | |
| print("---------------------------------") | |
| _ = model.generate( | |
| **inputs, | |
| streamer=text_streamer, | |
| max_new_tokens=256, | |
| do_sample=False, # Use greedy decoding for deterministic output | |
| use_cache=True, | |
| ) | |
| print("---------------------------------\n") | |
| # --- Example 1: Querying Data --- | |
| generate_sql( | |
| instruction="Which department has the most number of employees?", | |
| context="CREATE TABLE department (name VARCHAR, num_employees INTEGER)" | |
| ) | |
| # --- Example 2: Creating a Table (DDL) --- | |
| generate_sql( | |
| instruction="We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs.", | |
| context=""" | |
| CREATE TABLE students (student_id INTEGER PRIMARY KEY, student_name VARCHAR(255)); | |
| CREATE TABLE courses (course_id INTEGER PRIMARY KEY, course_title VARCHAR(255)); | |
| """ | |
| ) | |
| ``` | |
| ### Expected Output | |
| ``` | |
| User Instruction: Which department has the most number of employees? | |
| Model Output: | |
| --------------------------------- | |
| SELECT name FROM department ORDER BY num_employees DESC LIMIT 1; | |
| --------------------------------- | |
| User Instruction: We need a table to manage student enrollments in courses. This table should link the 'students' table and the 'courses' table using their respective IDs. | |
| Model Output: | |
| --------------------------------- | |
| CREATE TABLE student_enrollment (student_id INTEGER, course_id INTEGER, PRIMARY KEY (student_id, course_id), FOREIGN KEY (student_id) REFERENCES students(student_id), FOREIGN KEY (course_id) REFERENCES courses(course_id)); | |
| --------------------------------- | |
| ``` | |
| ## Performance | |
| The model was benchmarked on an NVIDIA A40 GPU. In a batch-processing scenario, it achieves a throughput of **~55-70 tokens/second**. Single-prompt latency is well within real-time requirements for interactive applications. | |
| * **Peak VRAM Usage (Inference):** ~6.1 GB | |
| ## Prompt Template | |
| To get the best results, your prompts should follow this structure: | |
| ``` | |
| <|im_start|>system | |
| You are a helpful assistant that generates SQL queries based on natural language questions and database schemas.<|im_end|> | |
| <|im_start|>user | |
| ### Instruction: | |
| {Your natural language question} | |
| ### Context: | |
| {The CREATE TABLE statements for the relevant tables}<|im_end|> | |
| <|im_start|>assistant | |
| ``` |